Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization overhaul #830

Open
DanRStevens opened this issue Mar 12, 2021 · 8 comments
Open

Serialization overhaul #830

DanRStevens opened this issue Mar 12, 2021 · 8 comments
Labels
complex Task that requires significant understanding of the code base and may have wide reaching changes task Task to be completed.

Comments

@DanRStevens
Copy link
Member

The save/load code has a lot of direct calls to the XML code in NAS2D. There is a plan to update the NAS2D XML code from TinyXML1 (with local modifications and renames) to TinyXML2. Such an upgrade will break a lot of existing code. We should try to minimize the number of direct calls to XML code to ease the transition.

References:


As much of the save/load code needs to save and restore arbitrary data, much of it works with XmlElement objects. Instead, we could potentially have objects return a Dictionary of data that needs to be saved, or to accept a Dictionary of data to process when loading.

Reference: https://github.com/lairworks/nas2d-core/blob/master/NAS2D/Dictionary.h

We could have methods that convert Dictionary objects to/from XmlElement nodes. For loading code, the ParserHelper methods can be used to validate the presence of required fields. It can also be used to detect extra unknown fields.

Reference: https://github.com/lairworks/nas2d-core/blob/master/NAS2D/ParserHelper.h

Using these objects may allow for simplifications in the loading code. It should also allow the XML references to be lifted to a higher level and reduced in number.

@ldicker83
Copy link
Collaborator

I'm game for using a Dictionary type object. Without looking too deep into the Dictionary class, can std::tuple's be used? Because in most cases I can see basic key/value pairs but I'm wondering about the elements that have multiple attributes. Would it make sense to use a std::tuple for this or could it still be done with key/value pairs?

I suppose that's all down to the implementation details but I'm totally in favor of doing it this way. Not having to deal directly with XML processing would simplify so many things.

@DanRStevens
Copy link
Member Author

The underlying code uses stringTo<T> from StringUtils.h. That method doesn't currently support conversion to collection types.

We could potentially upgrade stringTo. Though the other solution is to parse strings outside of that method, with manual calls to split for std::vector type results.

@DanRStevens
Copy link
Member Author

All except one use of the XML attributes method has been replaced with uses of Dictionary. That goes a long way towards decoupling from the specific of the current XML implementation.

The TileMap::serialize method probably needs to be split. Right now it takes in an XmlElement to attach things to, and it generates several to attach. Other methods generate a single new XmlElement object to return, which the caller is required to attach somewhere. Splitting the TileMap::serialize method into individual parts (view parameters, mines, tiles), would allow each section to return a single element, and match the style of the other methods.


Now that attributes have been largely dealt with, some consideration needs to be given to the linkEndChild calls. Perhaps we can abstract away the tree like structure. Indeed, such a structure would apply to other formats too, such as JSON, or YAML. It wouldn't apply so well to an INI file though.

Another approach we could take is to flatten the saved game file structure a bit. That may make a lot of sense in some cases. I don't know that it would apply everywhere though.

One downside of a structural change from flattening the saved game data, is the affects it would have on saved game file compatibility. This can be partially mitigated by providing an upgrade path where certain versions of the game can load old files and re-save them in a new format. Though I'm hesitant to put too much effort into that when we are still reasonably far from a v1 release.

@Brett208
Copy link
Member

I wouldn't be too worried about compatibility yet considering it is tough to play for more than an hour tops right now with the current content.

I had thought splitting tilemap serialization into smaller functions was a good idea as well.

@ldicker83
Copy link
Collaborator

ldicker83 commented Jun 28, 2021

One downside of a structural change from flattening the saved game data, is the affects it would have on saved game file compatibility. This can be partially mitigated by providing an upgrade path where certain versions of the game can load old files and re-save them in a new format. Though I'm hesitant to put too much effort into that when we are still reasonably far from a v1 release.

I'm really not worried about savegame compatibility at this point. As Brett pointed out, there's not yet much to lose other than a bit of time.

Now would be the time to make the changes especially with the research tree coming due for development where users will want to actually save their games and come back to it (actual progress can be made at that point). So yeah, now's the time to make breaking changes.

@DanRStevens
Copy link
Member Author

I've been looking back at this. We've removed most direct uses of XmlAttribute and replaced them with more generic Dictionary objects. That worked out well for decoupling from XmlAttribute, where they both represent flat structures.

We still have a fair bit of code in OPHD that directly uses XmlElement objects. These are non-flat structures, and so don't abstract well to Dictionary objects. I think we need something more, that can represent a hierarchy of data. For a single level we've been able to get away with a std::map of Dictionary, but for deeper data nesting we need a new data type with proper recursive support. Without that it will be difficult to decouple from XmlElement.

In terms of XmlElement specific operations, most of them are navigating to a specific unique sub-element, or to iterate over a collection of similar sub-elements. For any given level, we can get a Dictionary of attributes. We also have a bit of verification code, checking for the existence of sub-elements, or that no unexpected sub-elements exist.

@DanRStevens
Copy link
Member Author

I've noticed rather repetitive error reporting code, that reports a file line number and column that lead to a parse error. Potentially we could abstract away that error reporting too.

Currently most parse code is driven by data format specific functions, which make calls into XML handling objects. This means when that custom code detects an error, it must also include custom error reporting code. Instead, we could potentially invert things, and have a generic parse function that takes a parse method as a parameter, and can call that method with a new data type roughly representing an XmlElement object. The parse method can either return parsed data, or throw, in which case the generic parse method can attach file position information to the exception. This removes the burden of duplicating error reporting code everywhere.


Example:

const auto parsedData = parseFile("filename.xml", parseFunction);
auto parseFunction(const DataStore& genericRecursiveDataStoreObject)
{
  // ...

  if (isError)
  {
    throw std::runtime_error("Something very specific and bad happened");
  }

  return data;
}

Error parsing "filename.xml": Something very specific and bad happened (line 7, column 4)


Or maybe an error occurs while parsing a sub-object instead. Example:

auto parseFunction(const DataStore& genericRecursiveDataStoreObject)
{
  // ...
  const auto subData = genericRecursiveDataStoreObject["subObjectSectionName"].parse(subObjectParseFunction);
  // ...
}

auto subObjectParseFunction(const DataStore* genericRecursiveDataStoreObject)
{
  // ...
  if (isError)
  {
    throw std::runtime_error("Problem parsing SubObject");
  }
  // ...
}

Error parsing "filename.xml": Problem parsing SubObject (line 12, column 8)

@DanRStevens
Copy link
Member Author

Continuing the above idea, to do automatic error reporting, the parsing needs to be wrapped in generic library methods that handle the actual XML data. That means passing callbacks into this methods so they can parse the data into arbitrary data structures.

To implement such generic library methods, they will need a templated return type, as they won't know what specific data structure will be parsed. They will also need a parsing function parameter, with that return type, which could be a templated callable object.

It may be tempting to use std::function for the parsing function, as it will have a fixed parameter list, but then there would be no way to handle the arbitrary return type.

One downside to using a templated method, is the implementation would need to be placed in the header file, and any code used by the implementation would need corresponding #include lines, also placed in the header file.


Alternatively, we could design an API with no return type, and leave extracting the data to some other means. Presumably a lambda could capture an object it needs to feed the data to, and writing that data could become part of the parse function. That seems like a little bit less clean of a concept though.

const auto parseFunction = [&object](const DataStore& genericRecursiveDataStoreObject) {
  object.data = ... // Parse and assign some data
};

parseFile("filename.xml", parseFunction);

As another possible alternative, maybe the throw can be a library method that's part of the API. Pass it an error message, and it will attach appropriate file location information, suitable for the current section of the file.

const auto rootSection = parseXml("filename.xml");

const auto version = VersionString{rootSection.get("version")};
if (version < "1.0.0") {
  rootSection.throw("Expected version 1.0.0 or greater. File version: " + version);
}

const auto dataSection = rootSection["data"];
// ...
if (...) {
  dataSection.throw("Missing required attribute: size");
}

Possible error outputs:

Error parsing "filename.xml": Expected version 1.0.0 or greater. File version: 0.9.9 (line 1, column 1)

Error parsing "filename.xml": Missing required attribute: size (line 2, column 4)

I think I may want to explore this last possibility, as it may be the most natural to many programmers, avoids putting implementation in templated methods in header files, and still provides support for generic parsing error messages.


Reported file positions in error messages might have limited granularity, perhaps indicating the section, rather than a specific attribute or sub element. For instance, there might be a "root" XmlElement, and all errors directly in that section would be reported with the line and column number of the start of the "root" XmlElement, rather than say, the position of an extra unexpected attribute on the "root" XmlElement.

@ldicker83 ldicker83 added complex Task that requires significant understanding of the code base and may have wide reaching changes task Task to be completed. labels Feb 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
complex Task that requires significant understanding of the code base and may have wide reaching changes task Task to be completed.
Projects
None yet
Development

No branches or pull requests

3 participants