Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Couple documentation fixes #1847

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

cgarling
Copy link

@cgarling cgarling commented Apr 2, 2024

Have some time so fixing some of the documentation issues like #1845. Will try to document things I notice but don't feel qualified to fix.

Comment on lines +34 to +35
pdf(::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:Real,M}) where {N,M}
logpdf(::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:Real,M}) where {N,M}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure... This page is about MultivariateDistribution but the methods listed here are for generic distributions with array-like variates. I think it would be better to include these generic docstrings in a separate page and only list here pdf, logpdf etc. as part of the interface for multivariate distributions with a link to the generic docstring.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a valid concern, I was just following the docstring for loglikelihood for array-like variates which is currently listed on the multivariate page, so I thought it would be consistent.

"""
loglikelihood(d::Distribution{ArrayLikeVariate{N}}, x) where {N}
The log-likelihood of distribution `d` with respect to all variate(s) contained in `x`.
Here, `x` can be any output of `rand(d, dims...)` and `rand!(d, x)`. For instance, `x` can
be
- an array of dimension `N` with `size(x) == size(d)`,
- an array of dimension `N + 1` with `size(x)[1:N] == size(d)`, or
- an array of arrays `xi` of dimension `N` with `size(xi) == size(d)`.
"""

If you think these should be taken to a separate page, then loglikelihood probably should as well

@@ -73,7 +73,7 @@ pdfsquaredL2norm
insupport(::UnivariateDistribution, x::Any)
pdf(::UnivariateDistribution, ::Real)
logpdf(::UnivariateDistribution, ::Real)
loglikelihood(::UnivariateDistribution, ::AbstractArray)
loglikelihood(::UnivariateDistribution, ::Real)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You almost never want to call loglikelihood with ::Real but rather with ::AbstractArray, so documenting the former instead of the latter seems wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is that method?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. It is not obvious to me why the ::AbstractArray signature would be preferred over broadcasting through the ::Real signature for UnivariateDistribution. My naive go-to would be the ::Real method as it is available for all UnivariateDistribution and the ::AbstractArray signature is not. You could make the docstring more useful by adding a sentence explaining why you may or may not want to use it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The canonical method for single variates is just logpdf. The real benefit of loglikelihood is that it sums the logpdf values for multiple variates, possibly using exploiting constant terms etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loglikelihood is defined here:

Base.@propagate_inbounds @inline function loglikelihood(
d::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:Real,M},
) where {N,M}
if M == N
return logpdf(d, x)
else
@boundscheck begin
M > N ||
throw(DimensionMismatch(
"number of dimensions of the variates ($M) must be greater than or equal to the dimension of the distribution ($N)"
))
ntuple(i -> size(x, i), Val(N)) == size(d) ||
throw(DimensionMismatch("inconsistent array dimensions"))
end
return @inbounds sum(Base.Fix1(logpdf, d), eachvariate(x, ArrayLikeVariate{N}))
end
end
Base.@propagate_inbounds function loglikelihood(
d::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:AbstractArray{<:Real,N}},
) where {N}
return sum(Base.Fix1(logpdf, d), x)
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was confused about "ArrayLikeVariate{N}", does this include univariate distributions?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, it calls down to

loglikelihood(d::Distribution{ArrayLikeVariate{N}}, x) where {N}
The log-likelihood of distribution `d` with respect to all variate(s) contained in `x`.
Here, `x` can be any output of `rand(d, dims...)` and `rand!(d, x)`. For instance, `x` can
be
- an array of dimension `N` with `size(x) == size(d)`,
- an array of dimension `N + 1` with `size(x)[1:N] == size(d)`, or
- an array of arrays `xi` of dimension `N` with `size(xi) == size(d)`.
"""
Base.@propagate_inbounds @inline function loglikelihood(
d::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:Real,M},
) where {N,M}
if M == N
return logpdf(d, x)
else
@boundscheck begin
M > N ||
throw(DimensionMismatch(
"number of dimensions of the variates ($M) must be greater than or equal to the dimension of the distribution ($N)"
))
ntuple(i -> size(x, i), Val(N)) == size(d) ||
throw(DimensionMismatch("inconsistent array dimensions"))
end
return @inbounds sum(Base.Fix1(logpdf, d), eachvariate(x, ArrayLikeVariate{N}))
end
end
Base.@propagate_inbounds function loglikelihood(
d::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:AbstractArray{<:Real,N}},
) where {N}
return sum(Base.Fix1(logpdf, d), x)
end

because UnivariateDistribution <: Distribution{ArrayLikeVariate{0}}
For some reason I thought the univariate case was defined differently...

It seems like the real issue might be how we go about linking these general methods into the separate markdown pages, as the current @docs strings no longer work with the new methods. I don't really feel comfortable doing that myself, I don't understand the internals all that well (as shown above...)

Comment on lines +329 to +335
"""
loglikelihood(d::UnivariateDistribution, x::Real)

Evaluate the logarithm of the likelihood at `x`.

See also: [`logpdf`](@ref).
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, this is just a fallback for the case of a single variate but not the typical use case of loglikelihood.

@cgarling
Copy link
Author

cgarling commented Apr 2, 2024

Some other minor docs issues I noted that I could work on with some guidance:

  • Some local links like [Beta distribution](@ref Beta) in the docstring for Kumaraswamy aren't working for some reason I can't figure out
    It is related to the [Beta distribution](@ref Beta) by the following identity:
    if ``X \\sim \\operatorname{Kumaraswamy}(a, b)`` then ``X^a \\sim \\operatorname{Beta}(1, b)``.
    In particular, if ``X \\sim \\operatorname{Kumaraswamy}(1, 1)`` then
    ``X \\sim \\operatorname{Uniform}(0, 1)``.
  • There are @docs entries for _logpdf in multivariate.md and matrix.md but no docstrings in the source. There is a commented out method in common.jl, but I'm not sure where a docstring should be added or the references should be removed
    # `_logpdf` should be implemented and has no default definition
    # _logpdf(d::Distribution{ArrayLikeVariate{N}}, x::AbstractArray{<:Real,N}) where {N}
  • @docs entry for _rand! but no docstring; not sure what this does so can't write one
    Distributions._rand!(::AbstractRNG, ::MatrixDistribution, A::AbstractMatrix)
  • Warnings for references [`Distribution`](@ref), [`ReshapedDistribution`](@ref), [`MultivariateDistribution`](@ref) in reshape.md. MultivariateDistribution appears not to have a docstring. The other two do, but they don't seem to have @docs entries and so the linking fails. I feel like MultivariateDistribution get a docstring and have an @docs entry at the top of multivariate.md and ReshapedDistribution should likewise have an @docs entry at the top of reshape.md. I would guess that Distribution should get an @docs entry on the "Type Hierarchy" page under the "Distributions" heading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants