Start updating docs
halgari committed Mar 4, 2024
1 parent 966fe1c commit 7164834
Showing 2 changed files with 40 additions and 233 deletions.
35 changes: 35 additions & 0 deletions docs/IndexFormat.md
---
hide:
- toc
---

## Index Format

The index format of the framework follows fairly closely the one found in Datomic, although differences exist due to it having
a different set of constraints and requirements. The base format of the system is a sorted set of tuples. However, each tuple
exists in multiple indexes, with a different sorting and conflict resolution strategy for each.

In general data flows through 4 core indexes:

* TxIndex - A sorted list of all transactions. This isn't an index per se, but it is the primary source of truth. It's a key-value lookup where
the key is the transaction id and the value is a block of all the tuples in that transaction. The storage system is expected to be able to
store new blocks, get a block by id, and get the highest id in the store. From this the system can build the rest of the indexes.
* InMemory - A single block of all the tuples that have been added to the TxLog but not yet been merged with the other indexes.
* Historical index - A large index of every tuple that has ever been added to the system. Naturally this means that a search for a value
as of a given T can be done by searching for a matching tuple with a tx value equal to or less than that T. This also means that
at times finding the most recent value may be O(n), where n is the number of transactions for the given attribute.
* Current index - A sorted list of the current value of every attribute for every entity. There is no historical data here,
but the Tx value is recorded for each tuple so that queries can filter out values that are newer than the query time. This index is built
under the assumption that the vast majority of queries will be on a relatively recent basis time.
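The as-of rule for the Historical index can be sketched as follows. This is a simplified illustration, not the framework's actual code: a sorted list of `(Tx, Value)` pairs for a single entity/attribute stands in for the index.

```csharp
// Sketch only: illustrates the "as of T" rule on the Historical index.
// The real index stores full tuples in RocksDB; a sorted list stands in here.
using System;
using System.Collections.Generic;

List<(ulong Tx, string Value)> history = new()
{
    (5, "v1"), (9, "v2"), (12, "v3") // sorted by Tx ascending
};

string? ValueAsOf(ulong asOfTx)
{
    // The newest tuple with Tx <= asOfTx wins; worst case this is O(n)
    // in the number of transactions for the attribute, as noted above.
    for (var i = history.Count - 1; i >= 0; i--)
        if (history[i].Tx <= asOfTx)
            return history[i].Value;
    return null; // no value existed at that basis time
}

Console.WriteLine(ValueAsOf(10)); // prints "v2"
```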

When we say the tuples are "sorted", the next question is "sorted by what?" The answer is that the InMemory, Historical, and Current
indexes are each sorted in several ways, so that queries can efficiently find the data they need.

* EATV - This index is used to find all the tuples for a given entity. Most often used for questions like "What is the state of this entity?"
* AETV - This index is used to find all the tuples for a given attribute. Most often used for questions like "What entities have this attribute?"
* AVTE - This index is used to find all the entities for a given attribute and value. Most often used for questions like "What entities have this attribute with this value?" This index
can be expensive to maintain, so it is opt-in, but it is enabled by default for attributes whose type is a reference to another entity.
* VATE - This index is used to find all the attributes or entities that have a given value. Most often used for backreference queries such as "What entities point to this entity?" To save space
this index is also opt-in, but it is enabled by default for attributes whose type is a reference to another entity.
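As a rough illustration of how the same tuples land in different positions under different sort orders, here is a sketch using only entity and attribute ids (values and Tx elided); only the comparator differs between the two lists:

```csharp
// Sketch: the same tuples under two sort orders (EATV-style vs AETV-style).
using System;
using System.Collections.Generic;

var datoms = new List<(ulong E, ulong A)> { (2, 1), (1, 2), (1, 1) };

// EATV-style: entity first, then attribute.
var eatv = new List<(ulong E, ulong A)>(datoms);
eatv.Sort((x, y) => x.E != y.E ? x.E.CompareTo(y.E) : x.A.CompareTo(y.A));

// AETV-style: attribute first, then entity.
var aetv = new List<(ulong E, ulong A)>(datoms);
aetv.Sort((x, y) => x.A != y.A ? x.A.CompareTo(y.A) : x.E.CompareTo(y.E));

Console.WriteLine(string.Join(" ", eatv)); // (1, 1) (1, 2) (2, 1)
Console.WriteLine(string.Join(" ", aetv)); // (1, 1) (2, 1) (1, 2)
```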

## Data Structure
238 changes: 5 additions & 233 deletions docs/index.md
… to track modifications to data and provide ways of auditing or undoing changes.

The term "Event Sourcing" was coined by [Martin Fowler in 2005](https://martinfowler.com/eaaDev/EventSourcing.html), and is described as:

!!! info "Event Sourcing ensures that all changes to application state are stored as a sequence of events. Not just can we query these events, we can also use the event log to reconstruct past states, and as a foundation to automatically adjust the state to cope with retroactive changes."

These features solve several problems we experience in the Nexus Mods App, namely:

Data is stored as a series of tuples, in the format of `[entity, attribute, value, transaction]`.
It is interesting to note that the `transaction` id is a 64-bit long used to order the tuples in the event log; transactions are monotonic and always increasing.
This is a key feature of the system, as it allows us to order the events in the event log, and also to replay them in order to rebuild the read data model.

Attributes and transactions are also entities, and are put into a separate `partition` via a prefix on their ids. The top byte of an ID is the partition. The actual values of these partition
prefixes don't matter much, but it should be noted that the first partition is the partition for attributes, so at any time a quick check of the first byte of an entity id can tell us whether or not it is an attribute id. By default the following partitions are defined:

* `0x00` - The attribute partition
* `0x01` - The transaction partition
* `0x03` - The tempid partition (used for assigning temporary ids to entities that will be resolved to actual IDs when a transaction is committed)
* `0x04` - The entity partition
* `0x05`+ - Unused, could be used for user defined partitions
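A minimal sketch of what the partition check might look like. The `PartitionOf` and `IsAttribute` helpers are hypothetical, not part of the framework; they only illustrate the top-byte rule above.

```csharp
// Sketch: reading the partition from the top byte of a 64-bit id.
// Partition values follow the list above (0x00 = attributes, 0x01 = transactions, ...).
using System;

static byte PartitionOf(ulong id) => (byte)(id >> 56);
static bool IsAttribute(ulong id) => PartitionOf(id) == 0x00;

var txId = (0x01UL << 56) | 42;       // transaction partition, local id 42
Console.WriteLine(PartitionOf(txId)); // prints 1
Console.WriteLine(IsAttribute(txId)); // prints False
Console.WriteLine(IsAttribute(42));   // prints True (top byte is 0x00)
```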

Data is stored in RocksDB, with the value of each entry being mostly unused. Instead, the same key is inserted into several column families, each with a separate comparator. This same comparator
is used during query to perform a binary search to a specific value. From that point the iterator can be used to move forwards and backwards in the column family to find the specific tuples.

### Key format
Keys in RocksDB are slightly bitpacked to save space. More advanced packing is possible, but the more packing that is performed, the more complex the comparison code becomes,
so it's a tradeoff between space and speed. The format of the key is as follows:

`[Entity(ulong), Transaction(ulong), Op + Attribute(ushort)]`

The top bit of the `ushort` value is the op (1 = Assert, 0 = Retract), and the remaining 15 bits can be cast to a ulong to get the attribute entity ID. This means that the entity and transaction spaces are limited to 2^58,
and the attribute space is limited to 2^15. It is assumed that 32K distinct attributes is enough for anyone.
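The op/attribute packing can be sketched as follows. The `Pack` and `Unpack` helpers are illustrative, not the framework's actual code:

```csharp
// Sketch: packing the op flag and a 15-bit attribute id into the trailing ushort.
using System;

static ushort Pack(bool assert, ushort attr)
{
    if (attr >= 1 << 15)
        throw new ArgumentOutOfRangeException(nameof(attr), "attribute ids are limited to 15 bits");
    // Top bit is the op (1 = Assert, 0 = Retract); low 15 bits are the attribute id.
    return (ushort)((assert ? 0x8000 : 0) | attr);
}

static (bool Assert, ulong Attr) Unpack(ushort packed) =>
    ((packed & 0x8000) != 0, (ulong)(packed & 0x7FFF));

var key = Pack(assert: true, attr: 42);
Console.WriteLine($"{key:X4}");  // prints 802A
Console.WriteLine(Unpack(key));  // prints (True, 42)
```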

!!!info
    A prototype was built where the entity and transaction values were 34-bit unsigned integers and the attribute was 11 bits, with a 1-bit op. This resulted in 2^11 distinct attributes and a 10-byte key, instead of the 18-byte key in the final design.
    However, looking at the generated machine code, it was clear that the key comparison code involved a lot more bit shifting and unaligned memory access, so the complexity was considered not worth the 8-byte saving.

### Indexes
There are several indexes that are required for the system to function properly:

But first let's define some common abbreviations:

* `E` - Entity id
* `A` - Attribute id
* `V` - Value blob
* `T` - Transaction id
* `O` - Operation flag

* TxLog - this is simply all the datoms in the system, ordered by transaction id. This can be used to ask questions like "What was inserted in the last transaction?"
* EATOV - this is the primary index, used to find all the tuples associated with a specific entity. This can be used to ask questions like "Find all the attributes set on this mod."
* AVTEO - this is a reverse index, used to find all the entities that have a given value for a given attribute. This can be used to ask questions like "Find all mods that are enabled."
* VAETO - this is a reverse index, used to find all the entities that have a specific value. Normally this is only used on attributes whose value is of type `entity reference`. This can be used to ask questions like "Find all mods that point to this loadout."

### Sorting
Since all the sorting is performed by C# code injected into RocksDB via the `Comparator` interface (via native function calling), the sorting follows C#'s rules. This means
we can use C#'s `IComparable` interfaces and define per-value-type sorting rules, which is useful for things like string and filepath sorting. The data stored in each key in RocksDB is the same across all column families;
it's only the comparator that changes, which results in a different sorting order.

### Querying
Internally, RocksDB offers a "SeekPrev" operator which says: "Find the first key that is less than or equal to this key." This allows us to construct a tuple with each value set to either a minimum or maximum value, and then use SeekPrev to find the
next matching tuple.

For example, if we want to load all the attributes for Entity Id 42 as of transaction 100, we can construct a tuple with the following values:

`[42, ulong.MaxValue, null, 100, 1]`

Then we would `SeekPrev` to find the first matching tuple and return it. Then we would take the attribute id from that tuple, update the tuple's attribute to that attribute id minus 1, and `SeekPrev` again.
We continue this process until we find a tuple that doesn't match the entity id, or we run out of tuples.

Since 'Retract' has a flag of 0, and we don't allow an assert and a retract in the same transaction, we will always get the correct value for the attribute at the given transaction. Also, since the transaction id is monotonic, we can stop the search when we find the first transaction id
that is less than the transaction id we are looking for.
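The walk described above can be simulated against a plain sorted list. This is a sketch only: `SeekPrev` here is a linear scan standing in for RocksDB's seek, and the key shape is simplified to `(E, A, Tx)` with the value carried alongside.

```csharp
// Sketch: the SeekPrev walk over an entity/attribute/tx-ordered index,
// simulated with a sorted in-memory list instead of a RocksDB column family.
using System;
using System.Collections.Generic;

// (E, A, Tx) keys with their values, sorted by E, then A, then Tx.
var index = new List<((ulong E, ulong A, ulong Tx) Key, string Value)>
{
    ((42, 1, 50), "old name"),
    ((42, 1, 90), "new name"),
    ((42, 2, 60), "enabled"),
    ((43, 1, 10), "other entity"),
};

// Greatest key <= target, or -1 when none exists.
int SeekPrev((ulong, ulong, ulong) target)
{
    for (var i = index.Count - 1; i >= 0; i--)
        if (Comparer<(ulong, ulong, ulong)>.Default.Compare(index[i].Key, target) <= 0)
            return i;
    return -1;
}

// Load all attributes of entity 42 as of Tx 100.
var results = new Dictionary<ulong, string>();
var attr = ulong.MaxValue;
while (true)
{
    var i = SeekPrev((42, attr, 100));
    if (i < 0 || index[i].Key.E != 42) break;   // walked off the entity
    results[index[i].Key.A] = index[i].Value;   // newest value with Tx <= 100
    if (index[i].Key.A == 0) break;
    attr = index[i].Key.A - 1;                  // jump past this attribute
}

Console.WriteLine($"attr 1 => {results[1]}, attr 2 => {results[2]}");
// prints: attr 1 => new name, attr 2 => enabled
```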

!!!info
    Observant readers will realize this means that value thrashing (values that change a lot) will result in lots of seeks for a single entity. If this becomes a performance problem we can borrow a concept from Datomic and add a `Current` index:
    a separate index that contains only the most recent assertions for each entity. This can be filtered by Tx first, and any tuples that are more recent than the Tx we are looking for can be looked up in the `History` index. This would result
    in a single seek for each entity, followed by a linear scan of the attributes for that entity, with a fallback to the `History` index if the `Current` index doesn't contain the value we are looking for.

### Schema
As mentioned, attributes and transactions are entities. This means we can attach additional data to these entities, such as `TransactionTime` on a transaction, or `NativeType` on attributes. Not much more to say about this feature here, except that the schema
from the C# code is injected into the database itself so that we can re-query it on application startup to validate that code changes are compatible with the database schema.


## Core Interfaces

### IDatomStore
This interface isn't often encountered by users, but it's the core interface for inserting and reading datoms. It calls RocksDB and ensures that the data is properly indexed. It also offers basic iterators for reading each index. If we ever implement
a `Current` index, it will be implemented here.

### IConnection
Represents a writable connection to a database. This is named `Connection` mostly because that's what Datomic calls it, and it's the closest thing to a database connection in the system. This is all in-process, so there's no network involved.
This class can create transactions and dereference to a database. It has an `IDb Current { get; }` property that returns the most recent copy of the database, which is really just an `IDb` with the `Tx` value set to the most recent transaction id.

### IDb
An immutable database. This is the main interface for querying data from the database; queries from this interface will always return the same data. If more recent data is required, a new `IDb` must be created from an `IConnection` with a new transaction id.

### ITransaction
Created when a user calls `IConnection.BeginTransaction()`. This is a mutable object that can be used to insert datoms. When the transaction is committed, it will return a new `IDb` with the new transaction id, and a resolution function for resolving tempids to real ids.

Example:

Let's say we have an entity structure named `Mod` and we're creating a new mod. We would construct this entity, passing in the transaction. Internally the entity is assigned a tempid,
and that value is returned by the `Insert` method. When the transaction is committed, the tempid can be resolved to a real id:

```csharp
using var tx = conn.BeginTransaction();
var mod = new Mod(tx)
{
    Name = "My Mod",
    Enabled = true
};

var file = new ModFile(tx)
{
    Path = "C:\\",
    Mod = mod.Id,
};
var resultSet = tx.Commit();

var db = resultSet.NewDb; // Could also call conn.Current
var name = db[resultSet[mod.Id]].Name; // "My Mod", as resultSet has resolved the tempid to a real id assigned during the commit
```

## Entity Models
There are several types of models in this framework, to handle the 3 main uses of grouping attributes into entities:

* Read Model - A collection of attributes in a readonly object. For example, a `Mod` entity might have a `Name` and an `Enabled` attribute, so the `Mod` entity would be a read model.
* Write Model - An init-only collection of attributes in a writeable object. This is used to create new entities, and is used in the `ITransaction` interface.
* Active Read Model - A collection of attributes in a readonly object that retains a reference to the connection that created it. As new data is written to the database, the active read model filters
this data and emits `INotifyPropertyChanged` events to any listeners. This is used to create a live view of the database.

## Defining Code Models

The code model system in this framework uses code generation to create the attributes and read models.

### Attributes
Attributes are defined by creating a static class and giving it a name that helps define the namespace for the attributes:

```csharp
namespace NexusMods.Model;

public static partial class Mod
{
    public static readonly AttributeDefinitions AttributeDefinitions =
        new AttributeDefinitionsBuilder()
            .Define("Name", NativeType.String, "A name for the mod")
            .Define("Enabled", NativeType.Bool, "True if the mod is enabled")
            .Build();
}
```

The code generator will find this class and generate attribute classes for the specified definitions. These will be named `NexusMods.Model.Mod/Name` and `NexusMods.Model.Mod/Enabled` respectively.

!!!info
    The code generator will create a matching static DI method called `AddMod` that will add DI entries for the attribute definitions and the attribute classes.


### Read Models
Read models are defined much in the same way as attributes, as a collection of attributes:

```csharp
namespace NexusMods.Model;

public static partial class Mod
{
    public static readonly AttributeDefinitions AttributeDefinitions =
        new AttributeDefinitionsBuilder()
            .Define<string>("Name", "A name for the mod")
            .Define<bool>("Enabled", "True if the mod is enabled")
            .Build();

    public static readonly ModelDefinition ModelDefinition =
        new ModelDefinitionBuilder()
            .Include<Enabled, Name>()
            .Build();
}
```

Internally this will generate the `ReadModel`, `WriteModel`, and `ActiveReadModel` variants. The write model isn't actually a class, but a method that takes an `ITransaction` and returns a `ReadModel` interface. The generator also produces several static extension methods for the `IDb` and `ITransaction` interfaces to make it easier to work with these models.

```csharp
using var tx = conn.BeginTransaction();
var mod = tx.NewMod(
    Name: "My Mod",
    Enabled: true
);

var file = tx.NewModFile(
    Path: "C:\\",
    Mod: mod.Id
);
```

For querying, several extension methods are generated for the `IDb` and `IConnection` interfaces:

```csharp

var results = db.GetMod(id);
var activeResults = conn.GetActiveMod(id);

var mods = from mod in db.GetMods()
where mod.Enabled
select mod;

foreach (var mod in mods)
{
Console.WriteLine(mod.Name);
}
```

Read models can also include other read models:

```csharp

namespace NexusMods.Model;

public static partial class NexusMod
{
    public static readonly AttributeDefinitions AttributeDefinitions =
        new AttributeDefinitionsBuilder()
            .Define<ulong>("FileId", "Nexus File Id")
            .Define<ulong>("ModId", "Nexus Mod Id")
            .Build();

    public static readonly ModelDefinition ModelDefinition =
        new ModelDefinitionBuilder()
            .Inherits<Mod>()
            .Include<FileId, ModId>()
            .Build();
}
```

### Updating
For a single update to a specific attribute on an entity, the `ITransaction` interface has an emit method to update a specific datom:

```csharp

var someMod = db.GetMod(id);

using var tx = conn.BeginTransaction();
// this seems too verbose, we'll think about this one
Mod.Enabled.Emit(tx, someMod.Id, false);
```
Data is stored in an abstract "block store". This store does not need to be highly optimized as the framework performs heavy amounts of caching and only rarely writes to the store.
