Fix Dirichlet rand overflows #1702 #1886

quildtide · 2024-08-16T01:12:05Z

Core Issues

rand(d::Dirichlet) draws one Gamma(d.alpha[i]) sample for each index i and writes the draws to x.

It then rescales this result by inv(sum(x)). When this overflows to Inf, we run into two failure modes:

1. When all x_i == 0, we get Inf * 0 = NaN.
2. When some x_i != 0 but all of them are so deeply subnormal that inv(sum(x)) still overflows, the nonzero entries become Inf (and any zero entries become NaN).

For case 2, on Julia 1.11.0-rc1 on Windows, for example:

julia> rand(Xoshiro(123322), Dirichlet([4.5e-5, 4.5e-5, 8e-5]))
3-element Vector{Float64}:
  Inf
  Inf
 NaN
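Both failure modes can be reproduced without any random sampling. The following is a minimal sketch that hard-codes hypothetical pre-normalization values (assumptions chosen for illustration, not draws taken from the sampler) and applies the same inv(sum(x)) rescaling:

```julia
# Hypothetical pre-normalization vectors, chosen by hand for illustration.
x_all_zero  = [0.0, 0.0, 0.0]
x_subnormal = [1.0e-320, 5.0e-321, 0.0]

# Case 1: sum(x) == 0, so inv(sum(x)) == Inf and every entry is 0 * Inf = NaN.
y1 = x_all_zero .* inv(sum(x_all_zero))

# Case 2: sum(x) is nonzero but so small that inv(sum(x)) overflows to Inf.
# Nonzero entries become Inf; entries that are exactly zero become NaN.
y2 = x_subnormal .* inv(sum(x_subnormal))
```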

Fixing Case 1

If case 1 is happening, the best option from a runtime perspective is probably to draw the sample from a categorical distribution with the same mean, i.e. to return a one-hot vector. This is the limit behavior of the Dirichlet distribution as the parameters approach zero, and my reasoning for why it's "safe enough" is:

- If all-zero draws are a rare occurrence, this has little impact on the end sample.
- If all-zero draws are common, rejecting samples and drawing again will probably turn into a near-infinite rejection loop. On the other hand, we're then close enough to the limit behavior that floating point arithmetic errors are probably hurting us more than adopting the limit behavior.
- While this should theoretically result in incorrect variance, testing shows that the variance is within a reasonable tolerance (0.01) of the real value.

There is another option where we reject all-zero samples up to some maximum number of attempts before failing, but I think this is probably a waste of time for little gain in accuracy.
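As a rough sketch of the categorical fallback described above (not the code in this pull request): the helper name `onehot_fallback!` is hypothetical, and the sketch assumes the parameters are ordinary positive floats so that `sum(alpha)` is itself well behaved. It picks index i with probability alpha[i] / sum(alpha) by inverting the cumulative sum, then writes a one-hot vector.

```julia
using Random

# Hypothetical helper: write a one-hot vector into x, choosing index i with
# probability alpha[i] / sum(alpha). Assumes sum(alpha) is a normal positive float.
function onehot_fallback!(rng::AbstractRNG, x::AbstractVector{T},
                          alpha::AbstractVector{<:Real}) where {T<:Real}
    fill!(x, zero(T))
    threshold = rand(rng) * sum(alpha)   # uniform on [0, sum(alpha))
    acc = zero(threshold)
    chosen = lastindex(x)                # guard against rounding at the upper end
    for (i, a) in zip(eachindex(x), alpha)
        acc += a
        if threshold < acc
            chosen = i
            break
        end
    end
    x[chosen] = one(T)
    return x
end

sample = onehot_fallback!(MersenneTwister(1), zeros(3), [4.5e-5, 4.5e-5, 8e-5])
```

The mean of such one-hot draws is alpha / sum(alpha), which matches the mean of the Dirichlet distribution; only the higher moments differ.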

Fixing Case 2

We rescale all values by multiplying them by floatmax(), so that inv doesn't overflow. This should work consistently for all float types T where floatmax(T) * nextfloat(zero(T)) exceeds floatmin(T) by at least about one order of magnitude, which I think should be true for any non-exotic float type. I originally thought it would be enough to just set the largest value to 1, but it's currently possible to draw multiple subnormal values before normalization, and the method I adopted preserves the ratio between them.
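A minimal sketch of this rescaling idea, under the assumption that the all-zero input (case 1) is handled separately; the function name `normalize_rescaled!` is hypothetical and this is not the code from the patch:

```julia
# Hypothetical helper: normalize x to sum to 1, rescaling by floatmax(T) first
# whenever inv(sum(x)) would overflow. An all-zero x still yields NaN here,
# so that case has to be caught by a separate check.
function normalize_rescaled!(x::AbstractVector{T}) where {T<:AbstractFloat}
    if isinf(inv(sum(x)))
        # sum(x) is below roughly 1 / floatmax(T), so every product stays finite.
        x .*= floatmax(T)
    end
    x .*= inv(sum(x))
    return x
end

# Hand-picked subnormal inputs with a 2:1 ratio between the nonzero entries.
y = normalize_rescaled!([1.0e-320, 5.0e-321, 0.0])
```

Because every entry is multiplied by the same factor, the ratio between the subnormal entries carries over to the normalized result.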

Currently:

julia> rand(Xoshiro(123322), Dirichlet([4.5e-5, 4.5e-5, 8e-5]))
3-element Vector{Float64}:
  Inf
  Inf
 NaN

After this patch:

julia> rand(Xoshiro(123322), Dirichlet([4.5e-5, 4.5e-5, 8e-5]))
3-element Vector{Float64}:
  0.625061099164708
  0.37493890083529186
  0.0

Subnormal Parameters

While testing, I realized that my original fix for case 1 would break when all of the parameters themselves were deeply subnormal, e.g. Dirichlet([5e-321, 1e-321, 4e-321]). Given that the Dirichlet distribution is decently common in things like Bayesian inference, I thought it would be worth attempting to support these cases too.

Note that mean, var, etc. currently break on these deeply subnormally-parameterized distributions, but fixing that felt out of scope for this pull request. Fixing mean would be simple, but the resulting code could be rather bulky. I am less sure about var and the others.

codecov-commenter · 2024-08-16T01:24:41Z

Codecov Report

Attention: Patch coverage is 83.33333% with 11 lines in your changes missing coverage. Please review.

Project coverage is 85.96%. Comparing base (b219803) to head (0bd5b5c).

Files with missing lines	Patch %	Lines
src/samplers/expgamma.jl	72.50%	11 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1886      +/-   ##
==========================================
- Coverage   85.99%   85.96%   -0.03%     
==========================================
  Files         144      145       +1     
  Lines        8666     8726      +60     
==========================================
+ Hits         7452     7501      +49     
- Misses       1214     1225      +11


ararslan · 2024-08-16T22:45:43Z

src/multivariate/dirichlet.jl

+function _rand_handle_overflow!(
+        rng::AbstractRNG,
+        d::Union{Dirichlet,DirichletCanon},
+        x::AbstractVector{<:Real}
+    )

This makes the style consistent with the surrounding code:

Suggested change

-function _rand_handle_overflow!(
-        rng::AbstractRNG,
-        d::Union{Dirichlet,DirichletCanon},
-        x::AbstractVector{<:Real}
-    )
+function _rand_handle_overflow!(rng::AbstractRNG,
+                                d::Union{Dirichlet,DirichletCanon},
+                                x::AbstractVector{<:Real})

devmotion · 2024-08-17T00:28:25Z

Instead of dealing with subnormals, at least for the example here sampling in log space would be sufficient (see also #1003 (comment), #1003 (comment), and #1810). For instance, with an ExpGamma version of the Marsaglia sampler I get:

julia> using Distributions, LogExpFunctions, Random

julia> using Distributions: GammaMTSampler

julia> # Inverse Power sampler in log-space (exp-gamma distribution)
       # uses the x*u^(1/a) trick from Marsaglia and Tsang (2000) for when shape < 1
       struct ExpGammaIPSampler{S<:Sampleable{Univariate,Continuous},T<:Real} <: Sampleable{Univariate,Continuous}
           s::S   # sampler for Gamma(1 + shape, scale)
           nia::T # -1/shape (negative inverse of the shape parameter)
       end

julia> ExpGammaIPSampler(d::Gamma) = ExpGammaIPSampler(d, GammaMTSampler)

julia> function ExpGammaIPSampler(d::Gamma, ::Type{S}) where {S<:Sampleable}
           shape_d = shape(d)
           sampler = S(Gamma{partype(d)}(1 + shape_d, scale(d)))
           return ExpGammaIPSampler(sampler, -inv(shape_d))
       end

julia> function Random.rand(rng::AbstractRNG, s::ExpGammaIPSampler)
           x = log(rand(rng, s.s))
           e = randexp(rng)
           return muladd(s.nia, e, x)
       end

julia> function myrand!(rng::AbstractRNG, d::Dirichlet, x::AbstractVector{<:Real})
           for (i, αi) in zip(eachindex(x), d.alpha)
               @inbounds x[i] = rand(rng, ExpGammaIPSampler(Gamma(αi)))
           end
           return softmax!(x)
       end

julia> myrand!(Xoshiro(123322), Dirichlet([4.5e-5, 4.5e-5, 8e-5]), zeros(3))
3-element Vector{Float64}:
 0.6250610991638559
 0.37493890083615117
 0.0

quildtide · 2024-08-17T02:48:27Z

> For instance, with an ExpGamma version of the Marsaglia sampler I get:

Okay, after doing some testing, this implementation seems to work better than what I was doing, up to the point where sum(alpha) itself becomes deeply subnormal.

With your example implementation:

julia> myrand!(Random.default_rng(), Dirichlet([6e-309, 5e-309, 5e-309]), zeros(3))
3-element Vector{Float64}:
 1.0
 0.0
 0.0

julia> myrand!(Random.default_rng(), Dirichlet([5e-309, 5e-309, 5e-309]), zeros(3))
3-element Vector{Float64}:
 NaN
 NaN
 NaN

I brought in the code snippet from #1810 and that worked for a bit longer:

julia> function myrand2!(rng::AbstractRNG, d::Dirichlet, x::AbstractVector{<:Real})
           for (i, αi) in zip(eachindex(x), d.alpha)
               @inbounds x[i] = randlogGamma(αi)
           end
           return softmax!(x)
       end

julia> myrand2!(Random.default_rng(), Dirichlet([5e-310, 5e-310, 5e-310]), zeros(3))
3-element Vector{Float64}:
 0.0
 1.0
 0.0

julia> myrand2!(Random.default_rng(), Dirichlet([5e-311, 5e-311, 5e-311]), zeros(3))
3-element Vector{Float64}:
 NaN
 NaN
 NaN

The good news though is that there's only one failure mode now: when every rand(ExpGamma) draw is -Inf. I'll keep an edge-case check that falls back to the Categorical sampler in that case.
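To make that remaining failure mode concrete, here is a small sketch that uses a hand-written softmax instead of LogExpFunctions.softmax!; the function name `softmax_or_nothing!` and the input values are hypothetical. When every log-space draw is -Inf, the usual max-shift computes -Inf - (-Inf) = NaN, so the sketch detects that case and signals it by returning `nothing`, which is where a fallback would take over.

```julia
# Hypothetical helper: in-place softmax over log-space draws. Returns `nothing`
# when all entries are -Inf, since normalizing would otherwise produce NaN.
function softmax_or_nothing!(logx::AbstractVector{<:AbstractFloat})
    m = maximum(logx)
    m == -Inf && return nothing      # every draw underflowed: caller must fall back
    logx .= exp.(logx .- m)          # shift by the maximum so exp cannot overflow
    logx ./= sum(logx)
    return logx
end

# exp(-1.0e5) underflows to 0.0 in Float64, yet the ratio survives in log space.
ok = softmax_or_nothing!([-1.0e5, -1.0e5 - log(2.0), -Inf])

# All draws are -Inf: no ratio information is left.
failed = softmax_or_nothing!([-Inf, -Inf, -Inf])
```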


quildtide · 2024-09-04T07:51:28Z

@devmotion So this pull request's scope has gotten larger in a strange way.

New Summary of changes:

- Implement ExpGammaIPSampler (based on your code above)
- Implement ExpGammaSSSampler (based on random log-gamma for taming underflow issues #1810, with some improvements)
- Implement _logsampler, _logrand, and _logrand! on Gamma for these
- Dirichlet rand now has the following cases:
  - If any alpha is > 0.5, do what we were doing before
    - I also tried setting this cutoff at 1, but this caused multiple DirichletMultinomial tests to error for reasons I do not yet have an explanation for.
  - Else, try to sample via _logrand
    - This dispatches to ExpGammaIPSampler for alpha > 0.3
    - Else dispatches to ExpGammaSSSampler
  - If even these fail (all -Inf), use the Categorical limit behavior fallback
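The branching in that summary can be sketched as a pair of small selector functions. This only illustrates the decision logic with the cutoffs quoted in the summary (0.5 and 0.3); the function names and the returned symbols are hypothetical and do not correspond to code in the pull request.

```julia
# Hypothetical selector: which overall path the Dirichlet sampler would take.
# The default cutoff is the 0.5 value quoted in the summary.
function dirichlet_strategy(alpha::AbstractVector{<:Real}; direct_cutoff::Real = 0.5)
    # Any sufficiently large parameter: keep the pre-existing Gamma-based path.
    any(>(direct_cutoff), alpha) && return :direct_gamma
    # Otherwise sample in log space; an all -Inf result would then trigger
    # the Categorical fallback.
    return :log_space
end

# Hypothetical selector: which log-space sampler a single parameter would use.
# The default cutoff is the 0.3 value quoted in the summary.
function log_sampler_for(alpha::Real; ip_cutoff::Real = 0.3)
    return alpha > ip_cutoff ? :ExpGammaIPSampler : :ExpGammaSSSampler
end

dirichlet_strategy([4.5e-5, 4.5e-5, 8e-5])   # takes the log-space path
log_sampler_for(4.5e-5)                      # small shape: the SS sampler
```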

What this doesn't do:

Document or export ExpGammaIPSampler, ExpGammaSSSampler, or any of the _log sampling methods

This may seem a bit backwards, but I think that can be saved for another pull request later. The goal here is to close #1702.

quildtide added 2 commits August 15, 2024 20:20

5332f3a  Fix Dirichlet rand overflows JuliaStats#1702
1b39c0c  Refactor code to reduce duplication
d1baaf4  Remove test that requires Xoshiro

ararslan reviewed Aug 16, 2024

quildtide marked this pull request as draft August 17, 2024 05:34

quildtide and others added 5 commits September 4, 2024 00:41

735324b  Implement ExpGammaIPSampler (Co-Authored-By: David Widmann)
1ae6210  Implement ExpGammaSSSampler (Co-Authored-By: chelate)
06d8172  Implement improved Dirichlet rand
09842e3  Merge remote-tracking branch 'upstream/master' into fix_1702_dirichlet_rand_nan_inf
f50e8c8  Apply JuliaStats#1885 type change to ExpGammaIPSampler

quildtide marked this pull request as ready for review September 4, 2024 06:45

0bd5b5c  Lower non-log threshold
