moar docs!
halgari committed Jan 19, 2024
1 parent e6253e4 commit 43eca56
Showing 5 changed files with 88 additions and 53 deletions.
1 change: 0 additions & 1 deletion NexusMods.EventSourcing.sln
@@ -11,7 +11,6 @@ Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = ".solutionItems", ".solution
Directory.Build.targets = Directory.Build.targets
NuGet.Build.props = NuGet.Build.props
icon.png = icon.png
mkdocs.yml = mkdocs.yml
EndProjectSection
EndProject
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "src", "src", "{0377EBE6-F147-4233-86AD-32C821B9567E}"
43 changes: 43 additions & 0 deletions docs/AdaptingToChanges.md
@@ -0,0 +1,43 @@
---
hide:
- toc
---

## Adapting to Changes
The EventSourcing framework is designed to be flexible and to allow changes to the data model. However, some changes
are not easily supported.

### Event Replaying
A key component of the EventSourcing framework is the ability to replay events. Since the logic for these events is defined
in code, this code can change at any time, redefining how the events are processed. The arguments to the event (the event data)
cannot change over time, as the datastore is immutable, but the interpretation of that data can change. The attributes emitted
by an event can change completely, and as long as the events are updated to match, the system will continue to work.

### Snapshotting
Entities are routinely snapshotted when read, to improve performance. The problem with this approach is that snapshots essentially
"bake in" the logic of the events at the time they are taken. This means that if the logic of the events changes, the snapshot
must be recreated. For this reason, each snapshot records not only the entity's type id but also its revision. To invalidate
the snapshots of an entity, simply increment the revision number on the entity: this invalidates all snapshots on the next read of the entity,
and the events for that entity will be replayed. In small batches this should not be a performance problem, as reading events is
fairly inexpensive. Once the entity is re-read, a new snapshot with the new revision is created.
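The read path described above can be pictured as the following sketch. All of the type and member names here (`EntityDefinition`, `TryGetSnapshot`, `WriteSnapshot`, and so on) are illustrative assumptions, not the framework's actual API:

```csharp
// Hypothetical sketch of the snapshot-invalidation check on entity read.
var definition = EntityDefinition.For<File>(); // carries the type id and revision
var snapshot = store.TryGetSnapshot(entityId);

if (snapshot is null || snapshot.Revision != definition.Revision)
{
    // Stale or missing snapshot: replay this entity's events from the
    // primary event stream, then persist a fresh snapshot tagged with
    // the current revision.
    var entity = Replay(store.EventsFor(entityId));
    store.WriteSnapshot(entityId, entity, definition.Revision);
}
```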


## Example problems and solutions

### Example 1: Changing an attribute type
Let's say you have a `File` entity with a `.Size` attribute. You realize that someone typed that size as `uint`, and now the
system breaks because a file is over 4GB. The event that creates this file, `CreateFile`, has a `uint` parameter on it. As mentioned
above, event data cannot change, so you cannot modify the `uint` on the event and turn it into a `ulong`.

Instead, first update the `File` entity to use a `ulong` for the `.Size` attribute, and increment the revision number.
Then create a new event, `CreateFileV2`, that has a `ulong` parameter for the size. Now go to the definition of `CreateFile`
and modify the `Apply` method to convert the `uint` parameter to a `ulong` when emitting the size.

Now the old events will replay correctly (as the `.Apply` method converts the `uint` to a `ulong`), and any new files
will be created via the `CreateFileV2` event with the correct `ulong` size. Incrementing the revision number on the `File` entity
causes all snapshots to be invalidated, and the events will be replayed during the next load.
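The two events might look like this sketch. The `IEvent`/`IEventContext` shapes and the `Emit` signature are assumptions based on the description above, not the framework's exact API:

```csharp
// Old event: the stored uint payload cannot change, but its Apply logic can.
public record CreateFile(EntityId Id, string Path, uint Size) : IEvent
{
    public void Apply(IEventContext ctx)
    {
        ctx.Emit(Id, File.Path, Path);
        ctx.Emit(Id, File.Size, (ulong)Size); // widen uint -> ulong on replay
    }
}

// New event: emitted for all files created from now on.
public record CreateFileV2(EntityId Id, string Path, ulong Size) : IEvent
{
    public void Apply(IEventContext ctx)
    {
        ctx.Emit(Id, File.Path, Path);
        ctx.Emit(Id, File.Size, Size);
    }
}
```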

### Example 2: Renaming an Entity
You have an entity named `File`, but now need to call it `ArchiveFile`. All entities have an `Entity` attribute that provides
a unique identifier for the entity, so all you need to do is change the C# name of the entity; nothing else needs to change. The
same applies to moving an entity to a different namespace.
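As a sketch (the attribute spelling and the GUID below are made up for illustration, not taken from the framework):

```csharp
// The stable identifier lives in the attribute, not in the class name, so
// renaming the class (or moving its namespace) leaves stored events untouched.
[Entity("16E1A577-8B5C-4B9E-A2D1-3F0C9D4E7A21")] // same id before and after
public class ArchiveFile : AEntity               // was: public class File : AEntity
{
    // attributes unchanged
}
```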
42 changes: 42 additions & 0 deletions docs/SecondaryIndexes.md
@@ -0,0 +1,42 @@
---
hide:
- toc
---

## Secondary Indexes

Secondary indexes are a way to query an entity based on the most recent value of a given attribute. The "most recent value"
part of this creates a rather interesting problem: since the system can view the data at any transaction time, the indexes
must support the changing of values over time.

!!! info
    Indexing entities based on their EntityId is implemented with the same logic as secondary indexes.

The primary index for information in the system is a sorted list of all transactions. Since each transaction has a monotonically
increasing transaction id, any transaction can be found with a single lookup in the event store. This stream
of data is then indexed by the secondary indexes to provide a quick way to look up the transactions that influence a given
entity. Since all transactions are sorted by their transaction id, we can simply replay any matching events in order to get the
current value of any attribute of an entity.

There are two primary ways to index attribute values:

### Collection Attributes
Attributes on an entity can be collections: a Loadout may include a list of mods, a mod may contain a list of files, and so on. These
attributes can be maintained to track a certain value on the child entity, so a Loadout may have a dictionary lookup from `Mod.Name` to the mod entity,
or a mod may have a lookup from a path to a file. This allows for a quick lookup of the child entity based on the parent entity.
Since the system also supports graph data structures, a mod could also link back to its loadout. While this is a very simple approach,
it does require all of the keys (and matching EntityIds) to be loaded into memory. This is not a problem for small collections, but
for something like the files in an archive it could be. So collection attributes should be preferred for small collections,
or when the whole dataset will likely be loaded into memory anyway.
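Querying such a collection attribute might look like this sketch (the member names and the `"SkyUI"` key are illustrative assumptions):

```csharp
// Collection attribute as an in-memory dictionary lookup: cheap to query,
// but every key and EntityId in the collection must be resident in memory.
var modId = loadout.Mods["SkyUI"];   // Mod.Name -> EntityId lookup
var mod = context.Get<Mod>(modId);   // load the child entity by id
```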

### Indexed Attributes
Some attributes are marked with `IIndexableAttribute`. This allows the attribute to be indexed by the system. These attributes
must be scalar values, and their matching lookup properties must be marked with `IndexedAttribute`. When this occurs, the system
creates a secondary index on that attribute. This index is a sorted composite index of `[attribute, value, transactionId]`, which allows
all the events that contain a given attribute and value combination to be replayed in order. From there, the transactions can be
replayed to find entities with a matching entity type; those entities can then be loaded, and the attribute checked to confirm
it is in fact the correct value.
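The composite key can be pictured as follows. This record layout is an illustration of the ordering only, not the framework's actual storage format:

```csharp
// Keys sort first by attribute, then by value, then by transaction id, so a
// range scan over one (attribute, value) prefix yields the matching events
// in transaction order, ready to be replayed.
public readonly record struct IndexKey(ushort AttributeId, byte[] Value, ulong TxId)
    : IComparable<IndexKey>
{
    public int CompareTo(IndexKey other)
    {
        var c = AttributeId.CompareTo(other.AttributeId);
        if (c != 0) return c;
        c = Value.AsSpan().SequenceCompareTo(other.Value);
        if (c != 0) return c;
        return TxId.CompareTo(other.TxId);
    }
}
```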

!!! info
    Performance will degrade if an indexed attribute's value changes often, so try to keep indexes on attributes that are unlikely to change: index a file's hash, for example, but not the mod count in a loadout.

!!! warning
    Indexed attributes are not versioned and cannot be recreated incrementally. If the data model radically changes, the index will have to be deleted and *all* events in the store replayed to recreate it.

4 changes: 3 additions & 1 deletion mkdocs.yaml
@@ -48,4 +48,6 @@ theme:

nav:
- Home: index.md
- Usage: Usage.md
- Adapting to Changes: AdaptingToChanges.md
- Secondary Indexes: SecondaryIndexes.md

51 changes: 0 additions & 51 deletions mkdocs.yml

This file was deleted.
