write a DiffractionObject serializer/deserializer #195

Open
sbillinge opened this issue Dec 4, 2024 · 3 comments
@sbillinge
Contributor

Problem

It would be great if we could dump DiffractionObjects to a file and load them back as DiffractionObjects.

Proposed solution

Either use JSON or, maybe better, figure out how to do it with an HDF5 file, which would preserve the numpy arrays correctly along with all the metadata.

Maybe save the file with a dfpydo extension or something like that. This would allow DiffractionObjects to be shared more easily, with all their metadata.
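To make the JSON option concrete, here is a minimal sketch of a round trip that keeps the arrays and the metadata together. Everything in it is an assumption for illustration: `SimpleDO` is a hypothetical stand-in and not the real DiffractionObject class, the `xtype` field and the `"dfpydo-sketch"` format tag are invented, and plain lists stand in for numpy arrays (a real implementation would call `.tolist()` on the arrays before dumping).

```python
import json
from dataclasses import dataclass, field


@dataclass
class SimpleDO:
    """Hypothetical stand-in for a DiffractionObject (not the real class)."""

    xarray: list
    yarray: list
    xtype: str = "tth"  # assumed field, for illustration only
    metadata: dict = field(default_factory=dict)


def dumps_do(do):
    """Serialize the stand-in object to a JSON string."""
    payload = {
        "format": "dfpydo-sketch",  # hypothetical format tag
        "xtype": do.xtype,
        "xarray": list(do.xarray),
        "yarray": list(do.yarray),
        "metadata": do.metadata,
    }
    return json.dumps(payload)


def loads_do(text):
    """Rebuild the stand-in object from a JSON string made by dumps_do."""
    payload = json.loads(text)
    if payload.get("format") != "dfpydo-sketch":
        raise ValueError("not a dfpydo-sketch payload")
    return SimpleDO(
        xarray=payload["xarray"],
        yarray=payload["yarray"],
        xtype=payload["xtype"],
        metadata=payload["metadata"],
    )


original = SimpleDO(
    xarray=[10.0, 20.0, 30.0],
    yarray=[1.5, 2.5, 0.5],
    xtype="tth",
    metadata={"wavelength": 1.54, "sample": "Ni"},
)
restored = loads_do(dumps_do(original))
```

JSON keeps the file human-readable but stores arrays as text; an HDF5 backend would store them in binary form at the cost of an extra dependency.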

@bobleesj
Contributor

Good comments by @sbillinge from #222

I think we probably want code that will dump to different file formats and read from different formats. These are dumpers and loaders. If we serialize and deserialize things, I think of it as a kind of (safe) round trip. The unsafe option is just to pickle and then unpickle, but we need to be more careful now. But since we want people using diffraction objects, the serializers that are most interesting to me would be ones that serialize and deserialize complete diffraction objects.

We definitely want the dumpers and loaders and a pure serializer. We can put it off to 3.7 though to keep momentum. The serializer will change the API, though, because I would like to accept a file path as an option, or probably better a stream so we can have different backends, and have it load the DO from there. So the serializer would be nice now, I guess?
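As a rough illustration of the stream idea, the sketch below writes to and reads from any text stream, so the same two functions work with an in-memory buffer and with a file on disk. The names `dump_do` and `load_do` are hypothetical, and a plain dict stands in for a DO; none of this is the actual diffpy API.

```python
import io
import json
import os
import tempfile


def dump_do(do_dict, stream):
    """Write a dict-shaped DO to any text stream that has .write()."""
    json.dump(do_dict, stream)


def load_do(stream):
    """Read a dict-shaped DO from any text stream that has .read()."""
    return json.load(stream)


# A plain dict stands in for a DiffractionObject here (assumption).
do_dict = {
    "xarray": [1.0, 2.0],
    "yarray": [3.0, 4.0],
    "metadata": {"sample": "Ni"},
}

# Backend 1: an in-memory buffer.
buffer = io.StringIO()
dump_do(do_dict, buffer)
buffer.seek(0)  # rewind before reading back
from_memory = load_do(buffer)

# Backend 2: a file on disk, using the same two functions.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.json")
    with open(path, "w") as f:
        dump_do(do_dict, f)
    with open(path) as f:
        from_file = load_do(f)
```

Accepting a stream rather than a path leaves opening and closing to the caller, which is what lets other backends be swapped in without touching the serializer.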

Dumpers and loaders to group-standard file formats, as we have, are a bit specific, but probably worth having since the large community is using these file formats.

We could revisit this once we have the 3.6.0 milestone delivered?

@bobleesj
Contributor

bobleesj commented Dec 17, 2024

Recording today's in-person discussion with @sbillinge

  • Serialization takes a DiffractionObject (DO) and outputs it to a file (.txt, .json, etc.), which can be imported into databases and shared via email with colleagues. There is no loss of information.
  • Deserialization uses the serialized file to instantiate a DO.

How is this different from using a loader/dumper?

  • We want serialized DOs to be usable by other existing software, e.g., GSAS.
  • A loader/dumper saves a file in a format that the existing software can read.
  • There is some information loss, but it's a much better solution than not being able to import a DO into these packages or waiting for these packages to support our format, which may not happen soon.
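The information loss described above can be sketched with a dumper that writes only two columns of numbers. The layout here is a generic whitespace-separated x/y text format chosen for illustration, not the GSAS format or any other specific one, and `dump_xy` and `load_xy` are hypothetical names. The arrays survive the trip, but the metadata does not.

```python
import io


def dump_xy(do_dict, stream):
    """Write only the arrays as two whitespace-separated columns."""
    for x, y in zip(do_dict["xarray"], do_dict["yarray"]):
        stream.write(f"{x!r} {y!r}\n")


def load_xy(stream):
    """Read two columns back; metadata cannot be recovered from this format."""
    xarray, yarray = [], []
    for line in stream:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        x, y = line.split()
        xarray.append(float(x))
        yarray.append(float(y))
    return {"xarray": xarray, "yarray": yarray, "metadata": {}}


# A plain dict stands in for a DiffractionObject here (assumption).
do_dict = {
    "xarray": [10.0, 20.0, 30.0],
    "yarray": [1.5, 2.5, 0.5],
    "metadata": {"wavelength": 1.54, "sample": "Ni"},
}

buffer = io.StringIO()
dump_xy(do_dict, buffer)
buffer.seek(0)  # rewind before reading back
loaded = load_xy(buffer)
```

A serializer, by contrast, would be expected to return an object equal to the one that went in, metadata included.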

@bobleesj
Contributor

Note to myself: two use cases (UC) provided by @sbillinge for using DiffractionObject with empty arrays for xarray and yarray: #195
