
Avoid closure in LV system by passing parameters as kwargs #834

Closed
ghost opened this issue Jun 20, 2023 · 11 comments

@ghost commented Jun 20, 2023

Hi,

I would like to know if it is possible to avoid the closure in the Lotka-Volterra UDE system:

# Closure with the known parameter
nn_dynamics!(du, u, p, t) = ude_dynamics!(du, u, p, t, p_)

so that p_ can be passed through the remake call in the predict method:

_prob = remake(prob_nn, u0 = X, tspan = (T[1], T[end]), p = θ)

Say, make p_ a kwarg of ude_dynamics! and update it through the remake call for prob_nn?
Is this possible? If so, how can it be achieved?

I have tried the following:

function ude_dynamics!(du, u, p, t; p_true=p_true)
    û = U(u, p, st)[1] # Network prediction
    du[1] = p_true[1] * u[1] + û[1]
    du[2] = -p_true[4] * u[2] + û[2]
end

prob_nn = ODEProblem(ude_dynamics!, Xₙ[:, 1], tspan, p)

function predict(θ, X = Xₙ[:, 1], T = t)
    _prob = remake(prob_nn, u0 = X, tspan = (T[1], T[end]), p = θ; p_true=[1,2,3,4]) # change the value on purpose
    Array(solve(_prob, Vern7(), saveat = T,
                abstol = 1e-6, reltol = 1e-6))
end

But this gives an error.

Unrecognized keyword arguments: [:p_true]
┌ Warning: Unrecognized keyword arguments found. Future versions will error.
│ The only allowed keyword arguments to `solve` are:
│ (:dense, :saveat, :save_idxs, :tstops, :tspan, :d_discontinuities, :save_everystep, :save_on, :save_start, :save_end, :initialize_save, :adaptive, :abstol, :reltol, :dt, :dtmax, :dtmin, :force_dtmin, :internalnorm, :controller, :gamma, :beta1, :beta2, :qmax, :qmin, :qsteady_min, :qsteady_max, :qoldinit, :failfactor, :calck, :alias_u0, :maxiters, :callback, :isoutofdomain, :unstable_check, :verbose, :merge_callbacks, :progress, :progress_steps, :progress_name, :progress_message, :timeseries_errors, :dense_errors, :weak_timeseries_errors, :weak_dense_errors, :wrap, :calculate_error, :initializealg, :alg, :save_noise, :delta, :seed, :alg_hints, :kwargshandle, :trajectories, :batch_size, :sensealg, :advance_to_tstop, :stop_at_next_tstop, :default_set, :second_time, :prob_choice, :alias_jump, :alias_noise, :batch)

This indicates that I can only change kwargs related to solve. Is there a way to change the kwargs of the function inside the ODEProblem?

@prbzrg (Member) commented Jun 20, 2023

Using ComponentArrays.jl, we can combine multiple parameter arrays into a single array and pass it as p.
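
For illustration, a ComponentArray indexes like a single flat vector while still exposing named sub-arrays (a minimal sketch with made-up values, not tied to the tutorial):

using ComponentArrays

ca = ComponentArray(p = randn(3), p_true = [1.3, 0.9, 0.8, 1.8])
ca.p          # the 3-element sub-array
ca.p_true[1]  # 1.3
length(ca)    # 7, since everything is stored in one flat vector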

@ghost (Author) commented Jun 20, 2023

Using ComponentArrays.jl, we can combine multiple parameter arrays into a single array and pass it as p.

I am not following. Could you please illustrate it with the example above (the tutorial example)?

@prbzrg (Member) commented Jun 20, 2023

Like:

using ComponentArrays

# Define the hybrid model
function ude_dynamics!(du, u, p, t)
    û = U(u, p.p, st)[1] # Network prediction
    du[1] = p.p_true[1] * u[1] + û[1]
    du[2] = -p.p_true[4] * u[2] + û[2]
end

# Closure with the known parameter
nn_dynamics!(du, u, p, t) = ude_dynamics!(du, u, ComponentArray(p=p, p_true=p_), t)
# Define the problem
prob_nn = ODEProblem(nn_dynamics!, Xₙ[:, 1], tspan, p)

To update:

remake(prob_nn, p = ComponentArray(p=θ, p_true=[1,2,3,4]))

I didn't test it, though, so maybe I'm wrong.

@ghost (Author) commented Jun 20, 2023

That code is giving me the following error:

type NamedTuple has no field layer_1

Stacktrace:
  [1] getproperty
     ./Base.jl:37 [inlined]
  [2] getindex(#unused#::Axis{(p = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), p_true = 88:91)}, s::Symbol)
     ComponentArrays ~/.julia/packages/ComponentArrays/0GdGd/src/axis.jl:157
  [3] _broadcast_getindex_evalf
     ./broadcast.jl:683 [inlined]
  [4] _broadcast_getindex
     ./broadcast.jl:656 [inlined]
  [5] (::Base.Broadcast.var"#31#32"{Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple}, Nothing, typeof(getindex), Tuple{Tuple{Axis{(p = ViewAxis(1:87, Axis(layer_1 = ViewAxis(1:15, Axis(weight = ViewAxis(1:10, ShapedAxis((5, 2), NamedTuple())), bias = ViewAxis(11:15, ShapedAxis((5, 1), NamedTuple())))), layer_2 = ViewAxis(16:45, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_3 = ViewAxis(46:75, Axis(weight = ViewAxis(1:25, ShapedAxis((5, 5), NamedTuple())), bias = ViewAxis(26:30, ShapedAxis((5, 1), NamedTuple())))), layer_4 = ViewAxis(76:87, Axis(weight = ViewAxis(1:10, ShapedAxis((2, 5), NamedTuple())), bias = ViewAxis(11:12, ShapedAxis((2, 1), NamedTuple())))))), p_true = 88:91)}}, Base.Broadcast.Broadcasted{Base.Broadcast.Style{Tuple}, Nothing, typeof(ComponentArrays.getval), Tuple{Tuple{DataType}}}}}})(k::Int64)
     Base.Broadcast ./broadcast.jl:1088
  [6] ntuple
     ./ntuple.jl:48 [inlined]
  [7] copy
     ./broadcast.jl:1088 [inlined]
  [8] materialize(bc::Base.Broadcast.Broadcas

@ghost (Author) commented Jun 21, 2023

Any suggestions on how to solve this?

Edit:

Apparently the following solved the issue (could you please confirm?):

# Define the hybrid model
function ude_dynamics!(du, u, p, t)
    û = U(u, p.p, st)[1] # Network prediction
    du[1] = p.p_true[1] * u[1] + û[1]
    du[2] = -p.p_true[4] * u[2] + û[2]
end

# Closure with the known parameter
nn_dynamics!(du, u, p, t) = ude_dynamics!(du, u, p, t)  # This is not doing anything... but I left it to use the name `nn_dynamics!`
# Define the problem
prob_nn = ODEProblem(nn_dynamics!, Xₙ[:, 1], tspan, ComponentArray(p=p, p_true=p_))

function predict(θ, X = Xₙ[:, 1], T = t)
    _prob = remake(prob_nn, u0 = X, tspan = (T[1], T[end]), p = ComponentArray(p=θ, p_true=[1.3, 0.9, 0.8, 1.8]))
    Array(solve(_prob, Vern7(), saveat = T,
                abstol = 1e-6, reltol = 1e-6))
end

The rest is as in the tutorial. (The results with this approach show slight numerical differences from the original version, which I do not understand.)
Will creating the ComponentArray every time in the predict method kill the performance?

@ghost (Author) commented Jun 22, 2023

@prbzrg any idea why there are numerical differences? And will creating the ComponentArray every time in the predict method kill the performance?

@avik-pal (Member) commented:

Yeah, creating a ComponentArray every time does have some cost, because it needs to allocate an entirely new vector. A workaround I have been using in certain contexts is a "lazy" version of a component array. Something like:

using ChainRulesCore, ComponentArrays
import ChainRulesCore as CRC

const ∂∅ = NoTangent()
const ∂0 = ZeroTangent()

# Assumed helper (not in the original snippet): treat absent cotangents as "no gradient"
_has_no_gradient(Δ) = Δ isa NoTangent || Δ isa ZeroTangent || Δ === nothing

struct InputInjectionParameters{T, X, P <: ComponentArray} <: AbstractArray{T, 1}
    x::X
    p::P
end

function InputInjectionParameters(x, ps)
    @assert eltype(ps) == eltype(x)
    return InputInjectionParameters{eltype(x), typeof(x), typeof(ps)}(x, ps)
end

get_input(x::InputInjectionParameters) = x.x
get_parameters(x::InputInjectionParameters) = x.p

Base.size(x::InputInjectionParameters) = (length(x.x) + length(x.p),)

function Base.getindex(x::InputInjectionParameters, i::Int)
    return i <= length(x.x) ? x.x[i] : x.p[i - length(x.x)]
end

Base.IndexStyle(::Type{<:InputInjectionParameters}) = IndexLinear()

function CRC.rrule(::typeof(get_input), x::InputInjectionParameters)
    function get_input_pullback(Δ)
        _has_no_gradient(Δ) && return (∂∅, ∂∅)
        return ∂∅, InputInjectionParameters(Δ, zero(x.p))
    end
    return x.x, get_input_pullback
end

function CRC.rrule(::typeof(get_parameters), x::InputInjectionParameters)
    function get_parameters_pullback(Δ)
        _has_no_gradient(Δ) && return (∂∅, ∂∅)
        return ∂∅, InputInjectionParameters(zero(x.x), Δ)
    end
    return x.p, get_parameters_pullback
end

function CRC.rrule(::Type{<:InputInjectionParameters}, x, p)
    arr = InputInjectionParameters(x, p)
    function InputInjectionParametersPullback(Δarr::AbstractVector)
        ∂x = reshape(Δarr[1:length(x)], size(x))
        ∂p = ComponentArray(Δarr[(length(x) + 1):end], getaxes(p))
        return ∂∅, ∂x, ∂p
    end
    return arr, InputInjectionParametersPullback
end

You can pass it in as InputInjectionParameters(p_true, p) instead of ComponentArray(p=p, p_true=p_). Additionally, you need to modify the function as:

function ude_dynamics!(du, u, p::InputInjectionParameters, t)
    p_true = get_input(p)
    ps = get_parameters(p)
    û = U(u, ps, st)[1] # Network prediction
    du[1] = p_true[1] * u[1] + û[1]
    du[2] = -p_true[4] * u[2] + û[2]
end

I haven't tested this, so it might need some additional changes. But this overall gets rid of the cost of creating the ComponentArray.
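
For reference, a hypothetical sketch of how this wrapper could be wired into the tutorial's setup (untested; it assumes the tutorial's names U, st, Xₙ, t, tspan, and p_, and that p/θ are ComponentArrays of the network parameters with the same element type as p_, as required by the constructor's assert):

# Build the problem with the wrapper instead of a closure (hypothetical wiring)
prob_nn = ODEProblem(ude_dynamics!, Xₙ[:, 1], tspan, InputInjectionParameters(p_, p))

function predict(θ, X = Xₙ[:, 1], T = t)
    # Re-wrap the known parameters and the current NN parameters; no flat copy is allocated
    _prob = remake(prob_nn, u0 = X, tspan = (T[1], T[end]),
                   p = InputInjectionParameters(p_, θ))
    Array(solve(_prob, Vern7(), saveat = T, abstol = 1e-6, reltol = 1e-6))
end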

@avik-pal (Member) commented:

This is just a band-aid solution. In certain cases, we might just want to create a LazyComponentArray (which indexes into the original arrays but without actually creating a flattened structure) but that requires a bit more engineering.

@ghost (Author) commented Jun 22, 2023

Hi @avik-pal,

Thanks for the suggestions.

I have also been tinkering with a solution of my own.

It is: create a global variable const p_true = Ref([0.0, 0.0, 0.0, 0.0]).
Although it seems weird that you can change the value of something declared const, I guess Julia applies constness to the binding (and its type) rather than to the value stored inside it.
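
A minimal sketch of that distinction (independent of the tutorial code):

const r = Ref(1.0)
r[] = 2.0       # fine: the contents of the Ref are mutable
# r = Ref(3.0)  # rebinding the const name itself is what Julia warns/errors about
r[]             # 2.0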

And update it in predict

function predict(θ, X = Xₙ[:, 1], T = t)
    p_true[] = [1.3, 0.9, 0.8, 1.8]
    _prob = remake(prob_nn, u0 = X, tspan = (T[1], T[end]), p = θ)
    Array(solve(_prob, Vern7(), saveat = T,
                abstol = 1e-6, reltol = 1e-6))
end

And use it in:

function ude_dynamics!(du, u, p, t; p_true=p_true)
    û = U(u, p, st)[1] # Network prediction
    du[1] = p_true[][1] * u[1] + û[1]
    du[2] = -p_true[][4] * u[2] + û[2]
end

This appears to work, although I do not know if it is good practice or what implications it might have.

@ghost (Author) commented Aug 2, 2023

The rest is as in the tutorial. (The results with this approach show slight numerical differences from the original version, which I do not understand.) Will creating the ComponentArray every time in the predict method kill the performance?

Does anyone know where the numerical differences come from?

I adapted this approach to a different case and I still get slight numerical differences.

@ChrisRackauckas (Member) commented:

The latest version of the tutorial does fine with type stability.
