`JLD2OutputWriter` and `Checkpointer` don't work when `max_filesize` and `part` are specified. #3399
Comments
The intended user experience is that only one line should need to be changed:

Therefore, users should not have to manually specify the "part" that they want to pick up from. I don't like option 2 above.

I think that fixing this problem may become much easier if we can "delay" the creation of the output file. Right now, the output file is created when we build the output writer. But at that point, we have no way of knowing whether we are going to pick up or not. I've long wanted to implement this "delay" but more pressing matters have intervened...

The basic thing we need to do is to add an

With that feature I think we can also figure out how to handle output that is split into multiple files --- because we know if a simulation is continuing that we will have to figure out which
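To make the idea of working out which part file to continue from more concrete, here is a minimal sketch in plain Julia. It is not Oceananigans code: `latest_part` is a hypothetical helper, and the `<prefix>_part<N>.jld2` naming is assumed from the file names reported in this issue. It needs Julia 1.8 or later for `chopprefix` and `chopsuffix`.

```julia
# Hypothetical helper, not part of Oceananigans: return the highest N such that
# "<prefix>_part<N>.jld2" exists in `dir`, or `nothing` if there is no such file.
function latest_part(dir::AbstractString, prefix::AbstractString)
    head, tail = prefix * "_part", ".jld2"
    latest = nothing
    for name in readdir(dir)
        (startswith(name, head) && endswith(name, tail)) || continue
        number = chopsuffix(chopprefix(name, head), tail)
        # Skip names where the text between head and tail is not a plain integer.
        (!isempty(number) && all(isdigit, number)) || continue
        n = parse(Int, number)
        latest = latest === nothing ? n : max(latest, n)
    end
    return latest
end

# Usage on a scratch directory that mimics the files described in this issue.
scratch = mktempdir()
for i in 1:4
    touch(joinpath(scratch, "instantaneous_fields_part$i.jld2"))
end
touch(joinpath(scratch, "instantaneous_fields.jld2"))  # ignored: no "_part<N>"

latest_part(scratch, "instantaneous_fields")  # 4
```

If file creation were delayed until the simulation knows whether it is picking up, a lookup like this could run at that point, so the user would not need to pass `part` by hand.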
This is a separate feature from what I was talking about, but I think it's also a great idea! There also may be a clue how to solve a roundoff error issue, where two outputs are written one iteration apart from one another, but at virtually identical times (e.g. distinguished only by machine epsilon). PS: I simplified the example a bit to help me understand it
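As a standalone illustration of the kind of roundoff mentioned above (this is not how Oceananigans schedules are implemented, and `accumulate_time` and `same_time` are hypothetical names), accumulating a time step in floating point can land within a few machine epsilons of a target time without equalling it, so an exact comparison and a tolerant one disagree:

```julia
# Accumulate n steps of size Δt in Float64, as a time-stepping loop would.
function accumulate_time(Δt, n)
    t = 0.0
    for _ in 1:n
        t += Δt
    end
    return t
end

# Hypothetical tolerant comparison: treat two times as identical when they
# differ by only a few machine epsilons (relative tolerance).
same_time(a, b) = isapprox(a, b; rtol = 10 * eps(Float64))

t = accumulate_time(0.1, 10)

t == 1.0            # false: t is 0.9999999999999999
same_time(t, 1.0)   # true
```

A check that compares times with a tolerance like this would treat the two nearly identical times as the same output time.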
Why do we even have the "part" kw for JLD2OutputWriter? I feel this is a weird detail and users should not have to set that.
Below is a minimal working example of the problem:

What I'm doing is creating a directory `test_outputwriter`, and then writing fields into it with a specified file size and starting part number.

After the first `run!(simulation)`, 4 output files were written, the most recent being `instantaneous_fields_part4.jld2`, and a checkpoint file `model_checkpoint_iteration0.jld2` is written.

Let's say I want to keep running this model, so I increase `simulation.stop_iteration`. I pick up the model from the most recent checkpoint, and specify `part=4` (the most recent file written). This creates an `instantaneous_fields.jld2` and keeps writing into it, while throwing a warning

It never actually writes into `instantaneous_fields_part4.jld2`, and it keeps writing and rewriting into `instantaneous_fields.jld2`. If instead I specify `part=10` or any number larger than 4, the same problem occurs.

If I use `part=1` in my 2nd spin up of the simulation, it throws

Not sure what the intended user experience is, but I was imagining that if for some reason the simulation stops and I want to rerun the simulation from a checkpoint, 2 potential options would be available: