Useful packages for (sparse) differentiation? #229
@gdalle We just need to update
Sounds good! We're currently in the middle of a big refactor for SparseConnectivityTracer.jl, so it will have to wait a few days.
The refactor has been merged and SparseConnectivityTracer PRs adding
@adrhill @gdalle
MWE:

using SparseConnectivityTracer

y = zeros(3)
x = rand(3)
function J(y, x)
    y[1] = x[1]^2
    y[2] = 2 * x[1] * x[2]^2
    y[3] = sin(x[3])
    return y
end
S = jacobian_pattern(J, y, x)
using LinearAlgebra

x = rand(3)
y = rand(2)
f(x) = 2 * x[1] * x[2]^2
c(x) = [x[3]^3; x[3]^2]
S = hessian_pattern(f, x)
ℓ(x) = f(x) + dot(c(x), y)
S = hessian_pattern(ℓ, x)

ERROR: MethodError: no method matching conj(::SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}})
Closest candidates are:
conj(::Missing)
@ Base missing.jl:101
conj(::Complex)
@ Base complex.jl:282
conj(::SymTridiagonal)
@ LinearAlgebra ~/julia/julia-1.10.3/share/julia/stdlib/v1.10/LinearAlgebra/src/tridiag.jl:160
...
Stacktrace:
[1] dot(x::SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}, y::Float64)
@ LinearAlgebra ~/julia/julia-1.10.3/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:876
[2] dot(x::Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}}, y::Vector{Float64})
@ LinearAlgebra ~/julia/julia-1.10.3/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:886
[3] ℓ(x::Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}})
@ Main ./REPL[41]:1
[4] trace_function(::Type{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{…}}}}, f::typeof(ℓ), x::Vector{Float64})
@ SparseConnectivityTracer ~/.julia/packages/SparseConnectivityTracer/JvHcU/src/pattern.jl:32
[5] hessian_pattern(f::Function, x::Vector{Float64}, ::Type{BitSet}, ::Type{Set{Tuple{Int64, Int64}}})
@ SparseConnectivityTracer ~/.julia/packages/SparseConnectivityTracer/JvHcU/src/pattern.jl:326
[6] top-level scope
@ REPL[42]:1
Some type information was truncated. Use `show(err)` to see complete types.

If I also preallocate the storage of the constraints, as in the following function, I get an error:

function compute_hessian_sparsity(f, nvar, c!, ncon)
    if ncon == 0
        x0 = rand(nvar)
        S = SparseConnectivityTracer.hessian_pattern(f, x0)
        return S
    else
        x0 = rand(nvar)
        y0 = rand(ncon)
        cx = similar(y0)
        ℓ(x) = f(x) + sum(c!(cx, x)[i] * y0[i] for i = 1:ncon)
        S = SparseConnectivityTracer.hessian_pattern(ℓ, x0)
        return S
    end
end

Basic Jacobian derivative with backend=ADNLPModels.SparseADJacobian and T=Float64: Error During Test at /home/alexis/Bureau/git/ADNLPModels.jl/test/sparse_jacobian.jl:6
Got exception outside of a @test
TypeError: in typeassert, expected Float64, got a value of type SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}
Stacktrace:
[1] setindex!(A::Vector{Float64}, x::SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}, i1::Int64)
@ Base ./array.jl:1021
[2] (::var"#c!#21")(cx::Vector{Float64}, x::Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}})
@ Main ~/Bureau/git/ADNLPModels.jl/test/sparse_jacobian.jl:10
[3] (::ADNLPModels.var"#35#37"{Vector{SparseConnectivityTracer.HessianTracer{BitSet, Set{Tuple{Int64, Int64}}}}, var"#c!#21", Vector{Float64}, Vector{Float64}})(i::Int64)

Maybe the best solution is to not provide
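For reference, the `TypeError` above arises because the preallocated `cx::Vector{Float64}` cannot store `HessianTracer` values during sparsity detection. Below is a minimal sketch of one possible workaround that allocates the constraint output from the element type of the traced input; the helper name is hypothetical, this is not code from either package, and it assumes `c!` returns `cx` as in the snippet above.

import SparseConnectivityTracer

function lagrangian_hessian_sparsity(f, nvar, c!, ncon)  # hypothetical helper
    x0 = rand(nvar)
    y0 = rand(ncon)
    function ℓ(x)
        # Allocate cx with the same element type as x, so that tracer values
        # can be written into it during sparsity detection.
        cx = similar(x, ncon)
        return f(x) + sum(c!(cx, x)[i] * y0[i] for i = 1:ncon)
    end
    return SparseConnectivityTracer.hessian_pattern(ℓ, x0)
end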
Hey @gdalle @adrhill! I love the idea, and I agree with @amontoison that the sparse Jacobian/Hessian is a good starting point. For DifferentiationInterface.jl, the best would be if you could provide an example that does this:

T = Float64
n = 2
x0 = rand(T, n)
f(x) = sum(x) # or something
function c!(cx, x)
    cx[1] = x[1]^2 + x[2]^2 + x[1] * x[2]
    return cx
end

# a) compute gradient in-place
gx = similar(x0)
# ... ∇f(x)

# b) Hessian-vector product in-place
v = ones(2)
# ... ∇²f(x) * v

# c) Jacobian-vector product
# ∇c!(cx, x) * v

# d) transpose Jacobian-vector product
# ∇c!(cx, x)ᵀ * v

# e) Lagrangian Hessian-vector product
# ... (∇²f(x) + Σᵢ yᵢ ∇²cᵢ!(cx, x)) * v

Usually, we store everything needed to compute the derivatives in a backend that is initialized with the model.
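As an aside, a minimal sketch of what "storing everything in a backend initialized with the model" could look like on top of DifferentiationInterface's preparation mechanism; the struct name, fields, and `grad!` helper are hypothetical and only meant to illustrate the idea, reusing the `prepare_gradient`/`gradient!` calls shown later in this thread.

using DifferentiationInterface
import ForwardDiff

# Hypothetical container: preparation is done once when the model is built,
# then reused for every subsequent derivative evaluation.
struct GradientBackend{B, E, V}
    backend::B
    extras::E
    gx::V  # preallocated gradient storage
end

function GradientBackend(f, x0; backend = AutoForwardDiff())
    extras = prepare_gradient(f, backend, x0)
    return GradientBackend(backend, extras, similar(x0))
end

grad!(b::GradientBackend, f, x) = gradient!(f, b.gx, b.backend, x, b.extras)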
Here are some answers to your questions:
The rationale behind it is that
At the moment we do not exploit the symmetry of the Hessian. Internally, the Hessian pattern is represented as a set of index tuples (a `Set{Tuple{Int64, Int64}}`, as visible in the error above).
Essentially we need to add some array rules to SparseConnectivityTracer, and
For SparseConnectivityTracer you can always detect the sparsity pattern of
For DifferentiationInterface, @timholy asked a similar question a week ago, and he had some suggestions:
Feel free to weigh in on those discussions.
Don't provide
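For illustration only (not the package's internal code): a minimal sketch of how a Hessian sparsity pattern stored as a set of `(i, j)` index tuples can be materialized as a symmetric sparse Boolean matrix, using only the standard library SparseArrays. The pattern values here are made up.

using SparseArrays

# Hypothetical pattern: nonzero second derivatives at (1, 1), (1, 2) and (3, 3).
pattern = Set([(1, 1), (1, 2), (3, 3)])
n = 3

# Collect both (i, j) and (j, i) so the resulting matrix is symmetric,
# even if the stored pattern only records one triangle.
is = Int[]
js = Int[]
for (i, j) in pattern
    push!(is, i); push!(js, j)
    if i != j
        push!(is, j); push!(js, i)
    end
end
H = sparse(is, js, fill(true, length(is)), n, n)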
@tmigot here's an example doing everything you asked. The mantra of DifferentiationInterface is "prepare once, reuse multiple times", so it fits well with an optimization point of view.

using DifferentiationInterface
import Enzyme, ForwardDiff, Zygote

forward_backend = AutoForwardDiff()
reverse_backend = AutoEnzyme(Enzyme.Reverse) # supports mutation
second_order_backend = SecondOrder(AutoForwardDiff(), AutoZygote())

T = Float64
n = 2 # variables
m = 1 # constraints
x0 = rand(T, n)
v0 = ones(T, n) # seed vector in the input space
obj(x) = sum(x)
function cons!(c, x)
    c[1] = sum(abs2, x) - one(eltype(x))
    return nothing
end

# a) gradient
extras_a = prepare_gradient(obj, reverse_backend, x0)
g0 = similar(x0)
gradient!(obj, g0, reverse_backend, x0, extras_a)

# b) Hessian-vector product
extras_b = prepare_hvp(obj, second_order_backend, x0, v0)
p0 = similar(x0)
hvp!(obj, p0, second_order_backend, x0, v0, extras_b)

# c) Jacobian-vector product
c0 = ones(T, m)
extras_c = prepare_pushforward(cons!, c0, forward_backend, x0, v0)
p0 = similar(c0)
pushforward!(cons!, c0, p0, forward_backend, x0, v0, extras_c)

# d) transpose Jacobian-vector product
w0 = ones(T, m) # seed vector in the output space
extras_d = prepare_pullback(cons!, c0, reverse_backend, x0, w0)
p0 = similar(x0)
pullback!(cons!, c0, p0, reverse_backend, x0, w0, extras_d)

# e) Lagrangian Hessian-vector product
# not sure what you mean by this one, can either be obtained by summing the HVPs
# of `f` and `ci` or by taking the HVP of the Lagrangian as a whole

I think, as above, the main point of discussion will be handling the Lagrangian aspect.
@amontoison as of v0.5.6 (released this morning), DifferentiationInterface supports vector-mode pushforwards, pullbacks and HVPs, which makes Jacobians and Hessians much faster. It only works with ForwardDiff for now, but I want to add Enzyme soon.
@gdalle We can have a meeting next week, if you are available, to talk about that.
Unfortunately I'll be at JuliaCon all week! The week after?
Yes, perfect.
Hey there friends of JSO!
@adrhill and I have been hard at work on two autodiff toolkits which you might find useful: SparseConnectivityTracer.jl and DifferentiationInterface.jl.
We would be happy to discuss potential uses, and integration with your ecosystem. We're having similar talks with Optimization.jl and the SciML organization at large.
Ping @amontoison @abelsiqueira since we've already chatted a few times :)