implement custom gradient with multi-argument functions #197
I don't understand why a closure over the extra arguments wouldn't work here?
I'm not sure that I follow; that would only define one function for

```julia
function propagate_gradient(f_dfdx::Function, x::Union{AbstractTensor{<:Any, <:Any, <:Dual}, Dual}, args...)
    fval, dfdx_val = f_dfdx(_extract_value(x), args...)
    _check_gradient_shape(fval, x, dfdx_val)
    return _insert_gradient(fval, dfdx_val, x)
end
```
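A guess, purely for readers following along, at what the closure-based variant touched on above might look like; it is not from the thread itself and relies on the `propagate_gradient` and `bar_dbar_dx` definitions from the code block further down:

```julia
# Sketch of the closure variant: a and b are captured by the anonymous
# function, so propagate_gradient only ever sees a single-argument callback.
# (Uses propagate_gradient and bar_dbar_dx from the code block further down.)
bar(x::AbstractTensor{<:Any, <:Any, <:Dual}, a, b) =
    propagate_gradient(y -> bar_dbar_dx(y, a, b), x)
```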
Okay, I missed the point. Carry on!
Initially I planned to do a custom layer for energy densities, something like

```julia
energy(F, material, state) = # something
```

where a generic dispatch uses

```julia
analytic_or_AD(energy::FUN, F, material, state) where FUN<:Function = Tensors.hessian(x -> energy(x, material, state), F)
```
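For concreteness, a minimal sketch of that kind of dispatch layer, assuming a hypothetical `MyMaterial` type and a simple quadratic energy (neither is from the thread):

```julia
using Tensors

struct MyMaterial  # hypothetical material parameters
    μ::Float64
end

# Illustrative energy density: ψ(F) = μ/2 * (F ⊡ F)
energy(F, material::MyMaterial, state) = material.μ / 2 * (F ⊡ F)

# Generic fallback: compute the fourth-order tangent by automatic differentiation
analytic_or_AD(energy::FUN, F, material, state) where FUN<:Function =
    Tensors.hessian(x -> energy(x, material, state), F)

# A material with a hand-derived tangent just adds its own method, e.g.
analytic_or_AD(energy::FUN, F::Tensor{2,3}, material::MyMaterial, state) where FUN<:Function =
    material.μ * one(Tensor{4,3})  # analytic d²ψ/dF² for the quadratic ψ above
```

With this, a call such as `analytic_or_AD(energy, F, MyMaterial(1.0), nothing)` hits the analytic method, while any material without a specialized method falls back to AD.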
But I think the approach of allowing extra arguments to be passed through, as in the code below, works well:

```julia
using Tensors
import Tensors: _extract_value, _insert_gradient, Dual

# Change in Tensors.jl
function propagate_gradient(f_dfdx::Function, x::Union{AbstractTensor{<:Any, <:Any, <:Dual}, Dual}, args...)
    fval, dfdx_val = f_dfdx(_extract_value(x), args...)
    # _check_gradient_shape(fval, x, dfdx_val) # PR181
    return _insert_gradient(fval, dfdx_val, x)
end

# User code:
# - Definitions
bar(x, a, b) = norm(a*x)^b
dbar_dx(x, a, b) = b*(a^b)*norm(x)^(b-2)*x
bar_dbar_dx(x, a, b) = (bar(x, a, b), dbar_dx(x, a, b))
bar(x::AbstractTensor{<:Any, <:Any, <:Dual}, args...) = (println("DualBar"); propagate_gradient(bar_dbar_dx, x, args...))

# - At call-site
t = rand(SymmetricTensor{2,3}); a = π; b = 2 # Typically inputs
gradient(x -> bar(x, a, b), t)
```
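A quick way to check the wiring, not taken from the issue: the custom rule should give the same result as plain forward-mode AD applied to an identical function that lacks the hand-coded `Dual` method.

```julia
# Sanity check (sketch): baz is the same function as bar, but without a Dual
# method, so Tensors differentiates it directly instead of using dbar_dx.
baz(x, a, b) = norm(a*x)^b
gradient(x -> bar(x, a, b), t) ≈ gradient(x -> baz(x, a, b), t)  # expected: true
```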
From a Slack comment by @koehlerson: how to implement a custom gradient calculation for a multi-argument function.
This is a common case for autodiff, so it would be good to have a clear way of doing it.
The solution I can come up with now is the `propagate_gradient`-based approach shown in the code block above, but it is quite cumbersome, especially if it is only needed for a single function, so a better method would be good.
(`Tensors._propagate_gradient` is renamed to `propagate_gradient`, exported, and documented in #181.)