-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance problem with interpolate #47
Comments
So. What is happening. Looking at julia> field_ = element["youngs modulus"]
julia> @btime FEMBase.interpolate($field_, 0.0)
1.099 ns (0 allocations: 0 bytes) When looking julia> @btime FEMBase.interpolate_field($element, $field_, $ip, 1.0)
1.466 ns (0 allocations: 0 bytes) The actual function which gets triggered, is function interpolate(element::Element, field_name, ip, time)
field = element[field_name]
return interpolate_field(element, field, ip, time)
end julia> @btime FEMBase.interpolate($element, "youngs modulus", $ip, 1.0)
56.816 ns (2 allocations: 32 bytes) It looks that the problem is with abstract type abstract type AbstractElement end
struct NewElement1 <: AbstractElement
fields :: Dict{String, DCTI{Float64}}
end
struct NewElement2 <: AbstractElement
fields :: Dict{String, AbstractField}
end
function FEMBase.interpolate_field(element::AbstractElement, field, ip, time)
return field.data
end
function FEMBase.interpolate(element::AbstractElement, field_name, ip, time)
field = element.fields[field_name]
return FEMBase.interpolate_field(element, field, ip, time)
end
element1 = NewElement1(Dict("youngs modulus" => field_))
element2 = NewElement2(Dict("youngs modulus" => field_))
@btime interpolate($element1, "youngs modulus", $ip, 1.0)
@btime interpolate($element2, "youngs modulus", $ip, 1.0) Results the following:
(Btw, I think we should have AbstractElement in FEMBase ...) The good question is how do we implement fields to an element in an efficient way so that Element is using only concrete types? Correct me if I'm wrong in my conclusion but it looks that the only way to get rid of memory allocation is to use concrete type in fields set. By the way, there has been some discussion should we use symbols or strings as field names. Clearly, symbol would be a better in the sense of performance: struct NewElement3 <: AbstractElement
fields :: Dict{Symbol, DCTI{Float64}}
end
element3 = NewElement3(Dict(:youngs_modulus => field_))
@btime interpolate($element3, :youngs_modulus, $ip, 1.0) results
|
I did some further studies in this framework, here it comes. struct NewElement4 <: AbstractElement
fields :: Dict{Symbol, AbstractField}
end
element4 = NewElement4(Dict(:youngs_modulus => field_))
@btime interpolate($element4, :youngs_modulus, $ip, 1.0)
# output
30.424 ns (1 allocation: 16 bytes) Just for comparison, we get 20 ns away from dictionary lookup time just by changing the key type of dictionary from a string to a symbol, so for sure this is what should be in future. struct NewElement5{T<:AbstractField} <: AbstractElement
fields :: Dict{Symbol, T}
end
element5 = NewElement5(Dict(:youngs_modulus => field_))
# output
11.729 ns (0 allocations: 0 bytes) Templating Element by field type naturally solves the memory allocation issue, but is not practical since we need to have the ability to store a different kind of fields, not just one kind. So what about, if we have several field sets in element? struct NewElement7 <: AbstractElement
dcti_fields :: Dict{Symbol, DCTI}
dvti_fields :: Dict{Symbol, DVTI}
end
struct NewElement8 <: AbstractElement
dcti_fields :: Dict{Symbol, DCTI{Float64}}
dvti_2_fields :: Dict{Symbol, DVTI{2, Vector{Float64}}}
other_fields :: Dict{Symbol, AbstractField}
end
function interpolate_dcti(element::AbstractElement, field_name, ip, time)
return element.dcti_fields[field_name].data
end
function interpolate_dvti(element::AbstractElement, field_name, ip, time)
return element.dvti_fields[field_name].data
end
function interpolate_dvti_2(element::AbstractElement, field_name, ip, time)
return element.dvti_2_fields[field_name].data
end
function interpolate_other(element::AbstractElement, field_name, ip, time)
return element.other_fields[field_name].data
end
u = Dict(:displacement => DVTI([1.0, 2.0], [3.0, 4.0]))
T = Dict(:temperature => DVTI(1.0, 2.0))
element7 = NewElement7(Dict(:youngs_modulus => field_), u)
element8 = NewElement8(Dict(:youngs_modulus => field_), u, T)
println("Element 7")
@btime interpolate_dcti($element7, :youngs_modulus, $ip, 1.0)
@btime interpolate_dvti($element7, :displacement, $ip, 1.0)
println("Element 8")
@btime interpolate_dcti($element8, :youngs_modulus, $ip, 1.0)
@btime interpolate_dvti_2($element8, :displacement, $ip, 1.0)
@btime interpolate_other($element8, :temperature, $ip, 1.0)
# output
Element 7
30.790 ns (1 allocation: 16 bytes)
21.993 ns (0 allocations: 0 bytes)
Element 8
11.363 ns (0 allocations: 0 bytes)
12.829 ns (0 allocations: 0 bytes)
31.314 ns (1 allocation: 32 bytes) Element 8 is performing quite well. So one possible strategy would be to use separate field sets for different kind of fields and write interpolate function for each of them separately. Then we could have some sort of common interpolate function, like function FEMBase.interpolate(element::NewElement8, field_name, ip, time)
haskey(element.dcti_fields, field_name) && return interpolate_dcti(element, field_name, ip, time)
haskey(element.dvti_2_fields, field_name) && return interpolate_dvti_2(element, field_name, ip, time)
haskey(element.other_fields, field_name) && return interpolate_other(element, field_name, ip, time)
return nothing
end
@btime interpolate($element8, :youngs_modulus, $ip, 1.0)
@btime interpolate($element8, :displacement, $ip, 1.0)
@btime interpolate($element8, :temperature, $ip, 1.0)
# output
24.192 ns (1 allocation: 16 bytes)
29.691 ns (0 allocations: 0 bytes)
54.593 ns (1 allocation: 32 bytes) But again, performance is lost because of type instability, we don't know what is the output type of the general struct NewElement9 <: AbstractElement
fields :: Dict{Symbol, Function}
end
function FEMBase.interpolate(element::NewElement9, field_name, ip, time)
return element.fields[field_name](ip, time)
end
youngs_modulus(xi, time) = 288.0
displacement(xi, time) = [1.0, 2.0, 3.0, 4.0]
temperature(xi, time) = [1.0, 2.0]
fields = Dict(:youngs_modulus => youngs_modulus, :displacement => displacement, :temperature => temperature)
element9 = NewElement9(fields)
@btime interpolate($element9, :youngs_modulus, $ip, 1.0)
@btime interpolate($element9, :displacement, $ip, 1.0)
@btime interpolate($element9, :temperature, $ip, 1.0)
# output
27.125 ns (1 allocation: 16 bytes)
55.021 ns (2 allocations: 128 bytes)
54.373 ns (2 allocations: 112 bytes) Ok, that seems to be also a bad idea. I'm open to all ideas about how to get rid of that type instability problem. I think it's still a very important aspect to have an option to make all variables depend on time or spatial coordinates. While in 99.9 % of cases e.g. Young's modulus is constant in a domain, we don't want to restrict it to be like that. We should make it possible to e.g. youngs modulus to follow some function which is defined in both time and spatial domain. |
Maybe here's the solution. Instead of having abstract type AbstractFieldSet end
struct FieldSet1{N<:Integer} <: AbstractFieldSet
geometry :: DVTI{N, Vector{Float64}}
displacement :: DVTV{N, Vector{Float64}}
youngs_modulus :: DCTI{Float64}
poissons_ratio :: DCTI{Float64}
end
function FieldSet1()
z = zeros(3)
geometry = field((z, z, z ,z))
displacement = field(0.0 => (z, z, z, z))
youngs_modulus = field(0.0)
poissons_ratio = field(0.0)
return FieldSet1{4}(geometry, displacement, youngs_modulus, poissons_ratio)
end
struct NewElement10{F<:AbstractFieldSet} <: AbstractElement
fields :: F
end
fs = FieldSet1()
element10 = NewElement10(fs)
#output
NewElement10{FieldSet1}(FieldSet1(DVTI{4,Array{Float64,1}}(([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])), DVTV{4,Array{Float64,1}}(Pair{Float64,NTuple{4,Array{Float64,1}}}[0.0=>([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])]), DCTI{Float64}(0.0), DCTI{Float64}(0.0))) Now, fields can be updated using update!(element10.fields.geometry, ([0.0,0.0], [1.0,0.0], [1.0,1.0], [0.0,0.1]))
update!(element10.fields.displacement, 1.0 => ([0.0,0.0], [1.0,0.0], [0.0,0.0], [0.0,0.0]))
update!(element10.fields.youngs_modulus, 288.0)
update!(element10.fields.poissons_ratio, 1/3)
element10.fields.displacement
#output
DVTV{4,Array{Float64,1}}(Pair{Float64,NTuple{4,Array{Float64,1}}}[0.0=>([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]), 1.0=>([0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0])]) Now function FEMBase.interpolate(element::NewElement10, field_name, ip, time)
field = getfield(element.fields, field_name)
return interpolate(field, time)
end
@btime interpolate($element10, :geometry, $ip, $time)
@btime interpolate($element10, :displacement, $ip, $time)
@btime interpolate($element10, :youngs_modulus, $ip, $time)
@btime interpolate($element10, :poissons_ratio, $ip, $time)
# output
1.466 ns (0 allocations: 0 bytes)
14.662 ns (0 allocations: 0 bytes)
1.832 ns (0 allocations: 0 bytes)
1.099 ns (0 allocations: 0 bytes) The setup seems to be a bit longer but we can do a macro to make it easier to create new fieldsets. We can also create a "default" field set having the most common fields, so if the field is not defined when creating an element, we use default and preserve backward compatibility. At least one drawback is that we cannot alter the field configuration inside a simulation, i.e. we need to know beforehand what fields we need before starting a simulation. This could be avoided by having a new style Another thing which is probably changing is how do we make changes for a single component of a vector function. Before we had update!(element, "displacement 1", 0.0) and similar, which is not working anymore if |
@ahojukka5 To me this looks like a very good proposal. How much of the existing code will have a problem with your proposal? |
I think changes would be quite trivial because at the same time we can move from strings to symbols. abstract type AbstractFieldSet end
abstract type AbstractElement end
struct FieldSet1{N,D} <: AbstractFieldSet
geometry :: DVTI{N, Vector{Float64}}
displacement :: DVTV{N, Vector{Float64}}
youngs_modulus :: DCTI{Float64}
poissons_ratio :: DCTI{Float64}
end
function FieldSet1(N,D=3)
geometry = field(ntuple(i->zeros(D), N))
displacement = field(0.0 => ntuple(i->zeros(D), N))
youngs_modulus = field(0.0)
poissons_ratio = field(0.0)
return FieldSet1{N,D}(geometry, displacement, youngs_modulus, poissons_ratio)
end
struct NewElement10{F<:AbstractFieldSet} <: AbstractElement
fields :: F
sfields :: Dict{String, AbstractField} # old-style slow fields
end
fs = FieldSet1(4)
element10 = NewElement10(fs, Dict{String,AbstractField}())
function FEMBase.update!(element::NewElement10, field_name::Symbol, field_data)
field = getfield(element.fields, field_name)
update!(field, field_data)
end
function FEMBase.update!(element::NewElement10, field_name::String, field_data)
if haskey(element.sfields, field_name)
update!(element.sfields[field_name], field_data)
else
element.sfields[field_name] = field(field_data)
end
end
function FEMBase.interpolate(element::NewElement10, field_name::Symbol, ip, time)
field = getfield(element.fields, field_name)
return interpolate(field, time)
end
function FEMBase.interpolate(element::NewElement10, field_name::String, ip, time)
field = getindex(element.sfields, field_name)
return interpolate(field, time)
end
update!(element10, :geometry, ([0.0,0.0], [1.0,0.0], [1.0,1.0], [0.0,0.1]))
update!(element10, :displacement, 1.0 => ([0.0,0.0], [1.0,0.0], [0.0,0.0], [0.0,0.0]))
update!(element10, :youngs_modulus, 288.0)
update!(element10, :poissons_ratio, 1/3)
update!(element10, "geometry_old", ([0.0,0.0], [1.0,0.0], [1.0,1.0], [0.0,0.1]))
update!(element10, "displacement_old", 0.0 => ([0.0,0.0], [0.0,0.0], [0.0,0.0], [0.0,0.0]))
update!(element10, "displacement_old", 1.0 => ([0.0,0.0], [1.0,0.0], [0.0,0.0], [0.0,0.0]))
update!(element10, "youngs_modulus_old", 288.0)
update!(element10, "poissons_ratio_old", 1/3)
ip = (0.0, 0.0)
time = 0.0
println("Old system")
@btime interpolate($element10, "geometry_old", $ip, $time)
@btime interpolate($element10, "displacement_old", $ip, $time)
@btime interpolate($element10, "youngs_modulus_old", $ip, $time)
@btime interpolate($element10, "poissons_ratio_old", $ip, $time)
println("New system")
@btime interpolate($element10, :geometry, $ip, $time)
@btime interpolate($element10, :displacement, $ip, $time)
@btime interpolate($element10, :youngs_modulus, $ip, $time)
@btime interpolate($element10, :poissons_ratio, $ip, $time)
# output
Old system
43.619 ns (1 allocation: 16 bytes)
70.994 ns (1 allocation: 16 bytes)
48.356 ns (2 allocations: 32 bytes)
48.503 ns (2 allocations: 32 bytes)
New system
1.466 ns (0 allocations: 0 bytes)
24.558 ns (0 allocations: 0 bytes)
1.832 ns (0 allocations: 0 bytes)
1.099 ns (0 allocations: 0 bytes) I found yet another advantage. Typically we need certain dimension information almost everywhere, like the number of basis functions or the dimension of the field. Now, these can be templated to We don't necessarily have to deprecate the old system at all. It's still having that advantage over new one that we can dynamically define new fields when necessary. Actually, I think what with a careful design we can use these both parallel such a way that if a field is explicitly defined using a new system, it will be fast and type stable, and if not, it will be the old slower system then. We could have The only think what is unclear for me is that how do we apply boundary conditions after a change as we have only one predefined field like julia> update!(element, "displacement 1", 1.0) what would mean that add prescribed displacement only for component 1 and leave rest as before, the new syntax would be julia> update!(element, :displacement, [1.0, NaN, NaN]) Actually even this is not a big problem if we leave the old field system, because there we can define fields dynamically without defining them explicitly at the beginning of the simulation. So even that is not a problem. |
I think a tuple with the index you want to change and the value clearly states your intentions There may be a case where [1.0, NaN, NaN] could be legal values. |
using FEMBase
using BenchmarkTools
abstract type AbstractFieldSet end
abstract type AbstractElement end
struct FieldSet1{N,D} <: AbstractFieldSet
geometry :: DVTI{N, Vector{Float64}}
displacement :: DVTV{N, Vector{Float64}}
youngs_modulus :: DCTI{Float64}
poissons_ratio :: DCTI{Float64}
end
function FieldSet1(N,D=3)
geometry = field(ntuple(i->zeros(D), N))
displacement = field(0.0 => ntuple(i->zeros(D), N))
youngs_modulus = field(0.0)
poissons_ratio = field(0.0)
return FieldSet1{N,D}(geometry, displacement, youngs_modulus, poissons_ratio)
end
struct NewElement10{F<:AbstractFieldSet} <: AbstractElement
sfields :: F # static fields given when creating a new element
dfields :: Dict{Symbol, AbstractField} # dynamics fields created after creating a new element
end
static_fields = FieldSet1(4)
dynamic_fields = Dict{Symbol,AbstractField}()
element10 = NewElement10(static_fields, dynamic_fields)
function has_sfield(element, field_name)
return isdefined(element.sfields, field_name)
end
function has_dfield(element, field_name)
return haskey(element.dfields, field_name)
end
function get_sfield(element, field_name)
return getfield(element.sfields, field_name)
end
function get_dfield(element, field_name)
return getindex(element.dfields, field_name)
end
function create_dfield!(element, field_name, field_data)
field_ = field(field_data)
T = typeof(field_)
@info("Creating new dfield $field_name of type $T")
element.dfields[field_name] = field_
end
function update_sfield!(element::NewElement10, field_name, field_data)
field = get_sfield(element, field_name)
update!(field, field_data)
end
function update_dfield!(element::NewElement10, field_name, field_data)
if has_dfield(element, field_name)
update!(get_dfield(element, field_name), field_data)
else
create_dfield!(element, field_name, field_data)
end
end
function FEMBase.update!(element::NewElement10, field_name::Symbol, field_data)
if has_sfield(element, field_name)
update_sfield!(element, field_name, field_data)
else
update_dfield!(element, field_name, field_data)
end
end
sfields = FieldSet1(4,2)
dfields = Dict{Symbol,AbstractField}()
element10 = NewElement10(sfields, dfields)
# update static fields
update!(element10, :geometry, ([0.0,0.0], [1.0,0.0], [1.0,1.0], [0.0,0.1]))
update!(element10, :displacement, 1.0 => ([0.0,0.0], [1.0,0.0], [0.0,0.0], [0.0,0.0]))
update!(element10, :youngs_modulus, 288.0)
update!(element10, :poissons_ratio, 1/3)
# update dynamic fields
update!(element10, :dgeometry, ([0.0,0.0], [1.0,0.0], [1.0,1.0], [0.0,0.1]))
update!(element10, :ddisplacement, 0.0 => ([0.0,0.0], [0.0,0.0], [0.0,0.0], [0.0,0.0]))
update!(element10, :ddisplacement, 1.0 => ([0.0,0.0], [1.0,0.0], [0.0,0.0], [0.0,0.0]))
update!(element10, :dyoungs_modulus, 288.0)
update!(element10, :dpoissons_ratio, 1/3)
function interpolate_dfield(element::NewElement10, field_name::Symbol, ip, time)
field = get_dfield(element, field_name)
return interpolate(field, time)
end
function interpolate_sfield(element::NewElement10, field_name::Symbol, ip, time)
field = get_sfield(element, field_name)
return interpolate(field, time)
end
function FEMBase.interpolate(element::NewElement10, field_name::Symbol, ip, time)
if has_sfield(element, field_name)
return interpolate_sfield(element, field_name, ip, time)
elseif has_dfield(element, field_name)
return interpolate_dfield(element, field_name, ip, time)
else
error("Cannot interpolate from field $field_name: no such field.")
end
end
ip = (0.0, 0.0)
time = 1.0
println("Interpolate dynamic fields")
@btime interpolate($element10, :dgeometry, $ip, $time)
@btime interpolate($element10, :ddisplacement, $ip, $time)
@btime interpolate($element10, :dyoungs_modulus, $ip, $time)
@btime interpolate($element10, :dpoissons_ratio, $ip, $time)
println("Interpolate static fields")
@btime interpolate($element10, :geometry, $ip, $time)
@btime interpolate($element10, :displacement, $ip, $time)
@btime interpolate($element10, :youngs_modulus, $ip, $time)
@btime interpolate($element10, :poissons_ratio, $ip, $time)
println("Interpolate static fields with interpolate_sfield")
@btime interpolate_sfield($element10, :geometry, $ip, $time)
@btime interpolate_sfield($element10, :displacement, $ip, $time)
@btime interpolate_sfield($element10, :youngs_modulus, $ip, $time)
@btime interpolate_sfield($element10, :poissons_ratio, $ip, $time)
# output
Interpolate dynamic fields
37.022 ns (1 allocation: 16 bytes)
56.591 ns (1 allocation: 16 bytes)
42.886 ns (2 allocations: 32 bytes)
42.886 ns (2 allocations: 32 bytes)
Interpolate static fields
1.466 ns (0 allocations: 0 bytes)
19.793 ns (0 allocations: 0 bytes)
1.466 ns (0 allocations: 0 bytes)
1.466 ns (0 allocations: 0 bytes)
Interpolate static fields with interpolate_sfield
1.099 ns (0 allocations: 0 bytes)
19.060 ns (0 allocations: 0 bytes)
1.466 ns (0 allocations: 0 bytes)
1.466 ns (0 allocations: 0 bytes) Looks like a solution for me. |
Basically, the problem is that the field is just tuple of vectors. For example, for 4-node tetrahedron, it is In a bigger scale, we should implement boundary conditions such a way that we can write equality constraints to couple fields together. Once I had an idea to implement some sort of "symbolic" sparse matrix such a way that instead of giving row and column indices, we give a tuple where the first item is node name/number and the second item is dof name. This can then be "flattened" to a standard sparse matrix, given some mapping. This could in principle solve all the problems related to giving boundary conditions, no matter how complex they are. |
On the same note, you want to state that nodes are equal, node 500 is node 5000 in a different node generator sequence. Sort of same as coupling elements in and out of the system along the timeline. |
The current
interpolate
has a bit too much overhead to be acceptableFor example
Benchmarking this gives
This is a bit long and there are two allocations.
I tried to store the field using two vectors, instead of a dictionary with the assumption that the number of keys are small with moderate success. Using d54d5d3 I got
The text was updated successfully, but these errors were encountered: