Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Wrapped distribution wrapper #1724

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
5 changes: 4 additions & 1 deletion src/Distributions.jl
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,7 @@ export
WalleniusNoncentralHypergeometric,
Weibull,
Wishart,
Wrapped,
ZeroMeanIsoNormal,
ZeroMeanIsoNormalCanon,
ZeroMeanDiagNormal,
Expand Down Expand Up @@ -262,6 +263,7 @@ export
truncated, # truncate a distribution with a lower and upper bound
var, # variance of distribution
varlogx, # variance of log(x)
wrapped, # wrap a distribution around a bounded interval
expected_logdet, # expected logarithm of random matrix determinant
gradlogpdf, # gradient (or derivative) of logpdf(d,x) wrt x

Expand Down Expand Up @@ -298,6 +300,7 @@ include("product.jl")
include("reshaped.jl")
include("truncate.jl")
include("censored.jl")
include("wrapped.jl")
include("conversion.jl")
include("convolution.jl")
include("qq.jl")
Expand Down Expand Up @@ -360,7 +363,7 @@ Supported distributions:
QQPair, Rayleigh, Rician, Skellam, Soliton, StudentizedRange, SymTriangularDist, TDist, TriangularDist,
Triweight, Truncated, TruncatedNormal, Uniform, UnivariateGMM,
VonMises, VonMisesFisher, WalleniusNoncentralHypergeometric, Weibull,
Wishart, ZeroMeanIsoNormal, ZeroMeanIsoNormalCanon,
Wishart, Wrapped, ZeroMeanIsoNormal, ZeroMeanIsoNormalCanon,
ZeroMeanDiagNormal, ZeroMeanDiagNormalCanon, ZeroMeanFullNormal,
ZeroMeanFullNormalCanon

Expand Down
317 changes: 317 additions & 0 deletions src/wrapped.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,317 @@
using Distributions

"""
wrapped(d0::UnivariateDistribution, l::Real, u::Real; k::Int=1, kwargs...)

Construct a ``k``-wrapped distribution with support ``[l, u)``.

The resulting density wraps around the interval. With the period ``\\tau = u - l``, the
density can be expressed as two infinite series in terms of either the density ``f_0`` of
the unwrapped distribution `d0` or in terms of its characteristic function [`cf`](@ref)
``\\phi_0``:
````math
\\begin{aligned}
f(x) &= \\frac{1}{\\tau}\\sum_{j=-\\infty}^\\infty f_0\\left(x + \\frac{j}{k}\\tau\\right)\\
&= \\frac{1}{\\tau}\\sum_{j=-\\infty}^\\infty
\\phi_0\\left(jk\\frac{2\\pi}{\\tau}\\right)
\\exp\\left(-ijk\\frac{2\\pi}{\\tau}x\\right)
\\end{aligned}
````
If `k>1`, then the resulting distribution has ``k``-fold symmetry in the interval.

# Keywords
- `characteristic::Bool`: whether to use the characteristic function to compute the density.
If `false`, the density is computed using the logpdf of the unwrapped distribution.
If not specified, the `wrapped` method called may choose based on the parameters of
`l`, `u`, `tol`, and `d0`.
- `tol`: target absolute error of `pdf`. May be used to determine whether to use the
characteristic function in computations or not.

# Examples

"""
function wrapped(d::UnivariateDistribution, l::Real, u::Real; kwargs...)
return Wrapped(d, promote(l, u)...; kwargs...)
end
function wrapped(d::UnivariateDistribution, l::T, u::T; kwargs...) where {T<:Real}
return Wrapped(d, l, u; kwargs...)
end

"""
Wrapped

Generic wrapper for a wrapped distribution.
"""
struct Wrapped{D<:UnivariateDistribution,S<:ValueSupport,T<:Real} <:
UnivariateDistribution{S}
unwrapped::D # the original distribution (untruncated)
lower::T # lower bound
upper::T # upper bound
characteristic::Bool # whether to use characteristic function
k::Int # k-fold symmetry
function Wrapped(
d::UnivariateDistribution,
l::T,
u::T;
tol::Real=sqrt(eps(float(T))),
characteristic::Bool=_wrapped_use_characteristic(d, l, u, tol),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an odd thing to store in a field and expose to the user. Naively, it seems like an implementation detail that an average user would not need to necessarily care about or modify. I would expect that whether the characteristic function is used would be up to methods defined for particular Ds in Wrapped{D}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this design seems non-ideal. But this field isn't so much for the user as it is for a developer who has some some analysis of Wrapped{D} for some specific D. In short, the two series representations of pdf have different convergence properties depending on D and its parameters. The one that sums the shifted pdf, etc of D converges quickly when the tails of the distribution quickly decay. The one using the characteristic function decays quickly when D is not very peaked. These conditions are often opposites.

Ideally, this would be automatically chosen for every distribution based on its parameters, but this requires someone to sit down and work out bounds for the number of terms needed for each series representation. Having a field allows this to be done at construction time instead of every time pdf is called. See e.g. Wrapped{Normal}, where we use known bounds to switch between them.

For cases where bounds are not known but the user knows one converges faster than the other for their specific D and its parameters, the field can be useful, but this is not the primary use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that makes sense, thanks. Would it be reasonable then, do you think, to not document the field and remove it altogether once someone has gone through to work out the numbers of terms?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the question. There is no one number of terms. For a given wrapped distribution dist with values of params, one or both of the series will have fewer terms. But if you change any of the params values or change the dist, then the other series might now have fewer terms. The difference can be quite extreme. For wrapped normal, for example, if σ=1e+3, the series using cf needs only 1 term, while the series using pdf needs >2,000 terms. But for σ=1e-3, the series using cf needs >6,000 terms, while the one using pdf needs 1 term. So we need to choose whether to use the pdf series or cf series. Options:

  • always use one or the other. This is bad for the reasons noted above
  • store a field like we do here. The user can select which strategy they want to use, or a developer can overloaded wrapped(::SomeDistribution, ...) if they have a heuristic for choosing the strategy
  • Instead of storing a field, the strategy coul be picked at every evaluation of pdf, cdf, etc. Likely wasteful
  • Instead of choosing, every function evaluates both series in parallel until one of them converges. Potentially runs into issues when cf is not implemented, and does twice as much works as might be necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the question. There is no one number of terms.

Sorry, I was referring to what you had said in the preceding message:

this requires someone to sit down and work out bounds for the number of terms needed for each series representation.

k::Int=1,
) where {T<:Real}
return new{typeof(d),value_support(typeof(d)),T}(
d, l, u, characteristic, k
)
end
end

_wrapped_use_characteristic(d::UnivariateDistribution, l, u, tol) = false

minimum(d::Wrapped) = d.lower
maximum(d::Wrapped) = d.upper - eps(float(d.upper))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Wrapped{D,S,Int}, this will produce different types for minimum and maximum, which seems weird to me. In other cases we use the limit, e.g. when it's unbounded, which I think (again naively) would be acceptable here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a wrapped distribution, d.lower and d.upper are identified as the same point, so for a discrete distribution, including them both will always give the wrong answer if d.upper is in the support of the wrapped distribution. Ideally maximum(d) would return the largest point strictly less than d.upper, but AFAICT there is no function for this in the Distributions API.

See for example the wrapped Poisson example in #1716 (comment). Because 0 and 15 are the same point, the PMF sums to greater than 1 because maximum(d) returned d.upper.

A compromise could be returning d.upper for continuous distributions and this result for discrete ones.

Copy link
Member

@ararslan ararslan May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally maximum(d) would return the largest point strictly less than d.upper

Distributions defines the support of a DiscreteUnivariateDistribution to be round(Int, minimum(d)):round(Int, maximum(d)), which means that it doesn't support supports (🙃) with non-unit steps. That implies that the largest point that's strictly less than d.upper is ceil(Int, d.upper) - 1. Would that work here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Distributions defines the support of a DiscreteUnivariateDistribution to be round(Int, minimum(d)):round(Int, maximum(d)), which means that it doesn't support supports (upside_down_face) with non-unit steps.

Oh really? But this doesn't seem to be consistently enforced, e.g.

julia> d = Binomial(10, 0.5) * 1//2;

julia> support(d)
0//1:1//2:5//1

julia> d = DiscreteNonParametric(randn(5), normalize(rand(5), 1));

julia> support(d)
5-element Vector{Float64}:
 -0.9108476704164078
 -0.5194957770100691
 -0.024410217492521903
  0.15370458561285605
  1.338024968131234

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, amazing. 🤦 I was going based on the generic definition for DiscreteUnivariateDistribution, I had forgotten about LocationScale and I don't think I even knew DiscreteNonParametric existed. To be fair, I did say I was uninformed!


function logpdf(
d::Wrapped, x::Real; characteristic::Bool=d.characteristic, kwargs...
)
if characteristic
log(pdf_wrapped_cf(d, x; kwargs...))
else
logpdf_wrapped_logpdf(d, x; kwargs...)
end
end

logpdf_wrapped_cf(d::Wrapped, x::Real; kwargs...) = log(pdf_wrapped_cf(d, x; kwargs...))

function logpdf_wrapped_logpdf(
d::Wrapped, x::Real; tol=sqrt(eps(float(typeof(x)))), maxiter=1_000
)
d_unwrapped = d.unwrapped
period = d.upper - d.lower
k = d.k
logp = logpdf(d_unwrapped, x)
insupport(d, x) || return oftype(logp, -Inf)
Δx = period / k
x₋ = x - Δx
x₊ = x + Δx
x₋_in_support = insupport(d_unwrapped, x₋)
x₊_in_support = insupport(d_unwrapped, x₊)
for i in 1:maxiter
Δlogp = oftype(logp, -Inf)
if x₋_in_support
Δlogp = logaddexp(Δlogp, logpdf(d_unwrapped, x₋))
x₋ -= Δx
x₋_in_support = insupport(d_unwrapped, x₋)
end
if x₊_in_support
Δlogp = logaddexp(Δlogp, logpdf(d_unwrapped, x₊))
x₊ += Δx
x₊_in_support = insupport(d_unwrapped, x₊)
end
logp = logaddexp(logp, Δlogp)
i > k && abs(Δlogp) ≤ tol && isfinite(logp) && break
end
return logp - log(oftype(logp, k))
end

function pdf(
d::Wrapped, x::Real; characteristic::Bool=d.characteristic, kwargs...
)
if characteristic
pdf_wrapped_cf(d, x; kwargs...)
else
pdf_wrapped_pdf(d, x; kwargs...)
end
end

function pdf_wrapped_pdf(
d::Wrapped, x::Real; tol=sqrt(eps(float(typeof(x)))), maxiter=1_000
)
d_unwrapped = d.unwrapped
period = d.upper - d.lower
k = d.k
p = pdf(d_unwrapped, x)
insupport(d, x) || return zero(p)
Δx = period / k
x₋ = x - Δx
x₊ = x + Δx
x₋_in_support = insupport(d_unwrapped, x₋)
x₊_in_support = insupport(d_unwrapped, x₊)
for i in 1:maxiter
Δp = zero(p)
if x₋_in_support
Δp += pdf(d_unwrapped, x₋)
x₋ -= Δx
x₋_in_support = insupport(d_unwrapped, x₋)
end
if x₊_in_support
Δp += pdf(d_unwrapped, x₊)
x₊ += Δx
x₊_in_support = insupport(d_unwrapped, x₊)
end
p += Δp
i > k && abs(Δp) ≤ tol && p > 0 && break
end
return p / k
end

function cdf(
d::Wrapped, x::Real; characteristic::Bool=d.characteristic, kwargs...
)
if characteristic && !(d.unwrapped isa DiscreteDistribution)
cdf_wrapped_cf(d, x; kwargs...)
else
cdf_wrapped_cdf(d, x; kwargs...)
end
end

function logcdf(
d::Wrapped, x::Real; characteristic::Bool=d.characteristic, kwargs...
)
if characteristic && !(d.unwrapped isa DiscreteDistribution)
log(cdf_wrapped_cf(d, x; kwargs...))
else
logcdf_wrapped_logcdf(d, x; kwargs...)
end
end

function cdf_wrapped_cdf(
d::Wrapped, x::Real; tol=sqrt(eps(float(typeof(x)))), maxiter=1_000
)
d_unwrapped = d.unwrapped
l = d.lower
period = d.upper - l
k = d.k
p = cdf(d_unwrapped, x) - cdf(d_unwrapped, l)
x < l && return zero(p)
Δx = period / k
x₋ = x - Δx
x₊ = x + Δx
x₋_in_support = insupport(d_unwrapped, x₋)
l₋ = l - Δx
l₊ = l + Δx
l₊_in_support = insupport(d_unwrapped, l₊)
for i in 1:maxiter
Δp = zero(p)
if x₋_in_support
Δp += cdf(d_unwrapped, x₋) - cdf(d_unwrapped, l₋)
if d_unwrapped isa DiscreteDistribution
Δp += pdf(d_unwrapped, l₋)
end
x₋ -= Δx
l₋ -= Δx
x₋_in_support = insupport(d_unwrapped, x₋)
end
if l₊_in_support
Δp += cdf(d_unwrapped, x₊) - cdf(d_unwrapped, l₊)
if d_unwrapped isa DiscreteDistribution
Δp += pdf(d_unwrapped, l₊)
end
x₊ += Δx
l₊ += Δx
l₊_in_support = insupport(d_unwrapped, l₊)
end
p += Δp
i > k && abs(Δp) ≤ tol && p > 0 && break
end
return p / k
end

function logcdf_wrapped_logcdf(
d::Wrapped, x::Real; tol=sqrt(eps(float(typeof(x)))), maxiter=1_000
)
d_unwrapped = d.unwrapped
l = d.lower
period = d.upper - l
k = d.k
x < l && return oftype(logcdf(d_unwrapped, l), -Inf)
logp = logdiffcdf(d_unwrapped, x, l)
x < l && return oftype(logp, -Inf)
Δx = period / k
x₋ = x - Δx
x₊ = x + Δx
x₋_in_support = insupport(d_unwrapped, x₋)
l₋ = l - Δx
l₊ = l + Δx
l₊_in_support = insupport(d_unwrapped, l₊)
for i in 1:maxiter
Δlogp = oftype(logp, -Inf)
if x₋_in_support
Δlogp = logaddexp(Δlogp, logdiffcdf(d_unwrapped, x₋, l₋))
if d_unwrapped isa DiscreteDistribution
Δlogp = logaddexp(Δlogp, logpdf(d_unwrapped, l₋))
end
x₋ -= Δx
l₋ -= Δx
x₋_in_support = insupport(d_unwrapped, x₋)
end
if l₊_in_support
Δlogp = logaddexp(Δlogp, logdiffcdf(d_unwrapped, x₊, l₊))
if d_unwrapped isa DiscreteDistribution
Δlogp = logaddexp(Δlogp, logpdf(d_unwrapped, l₊))
end
x₊ += Δx
l₊ += Δx
l₊_in_support = insupport(d_unwrapped, l₊)
end
logp = logaddexp(logp, Δlogp)
i > k && abs(Δlogp) ≤ tol && isfinite(logp) && break
end
return min(logp - log(oftype(logp, k)), 0)
end

function pdf_wrapped_cf(d::Wrapped, x::Real; tol=sqrt(eps(float(typeof(x)))), maxiter=1_000)
d_unwrapped = d.unwrapped
period = d.upper - d.lower
k = d.k
scale = twoπ * (k / period)
iscale = zero(scale)
z₀ = cis(-x * scale)
z = one(z₀)
p = real(cf(d_unwrapped, iscale) * z)
insupport(d, x) || return zero(p)
for i in 1:maxiter
z *= z₀
iscale += scale
Δp = 2 * real(cf(d_unwrapped, iscale) * z)
p += Δp
i > k && abs(Δp) ≤ tol && p > 0 && break
end
# ensure density is never negative
return max(0, p) / period
end

function cdf_wrapped_cf(d::Wrapped, x::Real; tol=sqrt(eps(float(typeof(x)))), maxiter=1_000)
d_unwrapped = d.unwrapped
period = d.upper - d.lower
k = d.k
scale = twoπ * (k / period)
iscale = zero(scale)
z₀ = cis(-x * scale)
z₀ₗ = cis(-d.lower * scale)
z = zₗ = one(z₀)
p = real(cf(d_unwrapped, iscale)) * (x - d.lower)
for i in 1:maxiter
z *= z₀
zₗ *= z₀ₗ
iscale += scale
Δp = 2 * imag(cf(d_unwrapped, iscale) * (zₗ - z)) / iscale
p += Δp
i > k && abs(Δp) ≤ tol && p > 0 && break
end
# ensure probability is never negative
return clamp(p / period, 0, 1)
end

function rand(rng::AbstractRNG, d::Wrapped)
l = d.lower
k = d.k
period = d.upper - l
# choose which of the `k` folds the point will be from
i = k == 1 ? k : rand(rng, 1:k)
z = rand(rng, d.unwrapped)
return mod(z - l + ((k - i)//k) * period, period) + l
end

### specialized wrapped distributions

include(joinpath("wrapped", "normal.jl"))
include(joinpath("wrapped", "cauchy.jl"))
include(joinpath("wrapped", "exponential.jl"))
47 changes: 47 additions & 0 deletions src/wrapped/cauchy.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
function pdf(d::Wrapped{<:Cauchy}, x::Real; kwargs...)
d0 = d.unwrapped
l = d.lower
u = d.upper
period = u - l
scale = d.k * 2π / period
s = d0.σ * scale
z = (x - d0.μ) * scale
p = sinh(s) / (cosh(s) - cos(z)) / period
l ≤ x < u || return zero(p)
return p
end

function logpdf(d::Wrapped{<:Cauchy}, x::Real; kwargs...)
d0 = d.unwrapped
l = d.lower
u = d.upper
period = u - l
scale = d.k * 2π / period
s = d0.σ * scale
z = (x - d0.μ) * scale
logp = log1mexp(-2s) - logaddexp(2log1mexp(-s), logtwo - s + log1p(-cos(z)))
l ≤ x < u || return oftype(logp, -Inf)
return logp - log(oftype(logp, period))
end

function cdf(d::Wrapped{<:Cauchy}, x::Real; kwargs...)
d0 = d.unwrapped
l = d.lower
u = d.upper
k = d.k
period = u - l
scale = k * π / period
s = d0.σ * scale
z = (x - d0.μ) * scale
zₗ = (l - d0.μ) * scale
tanhs = tanh(s)
p = (atan(tan(z), tanhs) - atan(tan(zₗ), tanhs)) / π / k
x < l && return zero(p)
x ≥ u && return one(p)
# handle discontinuities in the antiderivative at tan((2i+1)/2 π) for integers i
num_discontinuities = Int(div(z, π, RoundNearest)) - Int(div(zₗ, π, RoundNearest))
return p + num_discontinuities // k
end

logcdf(d::Wrapped{<:Cauchy}, x::Real; kwargs...) = log(cdf(d, x; kwargs...))