Become a sponsor to SirixDB

SirixDB started as a fork of Treetank, a university project at the University of Konstanz. Dr. Marc Kramis began building the first version in 2006 with a few students for his Ph.D. thesis, supervised by Dr. Marcel Waldvogel in the Distributed Systems group at the University of Konstanz.

As the project was nearing its end, I (Johannes Lichtenberger, who began working on Treetank early on) forked it and kept working on SirixDB in my spare time. At first, I introduced diffing capabilities between revisions. For my master's thesis, I then created interactive visualizations of the differences between revisions. Next, I began to work on a binding for Brackit(.org), a query engine based on XQuery and JSONiq that supports different storage engines and ships with common query optimizations. Custom optimizations for a storage backend can be added quickly through AST rewrites. After that, I built several JSON layers to store binary JSON data and used ideas from the Adaptive Radix Tree (ART) and Hash Array Mapped Tries (HAMT) to reduce the storage cost of pages with many null references.
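The HAMT idea mentioned above can be illustrated with a minimal sketch: instead of a fixed-size array full of null references, a page keeps a bitmap of occupied slots plus a dense list of only the references that exist. The class name `BitmapRefs`, the fanout of 64, and the method names are hypothetical and chosen for illustration; SirixDB's actual page layout may differ.

```python
class BitmapRefs:
    """Sparse reference array: a bitmap marks occupied slots, refs holds only non-null entries."""

    def __init__(self, fanout: int = 64):
        self.fanout = fanout  # assumed number of logical slots per page
        self.bitmap = 0       # bit i is set if slot i holds a reference
        self.refs = []        # dense list, ordered by slot number

    def _check(self, slot: int) -> None:
        if not 0 <= slot < self.fanout:
            raise IndexError(f"slot {slot} out of range")

    def _rank(self, slot: int) -> int:
        # Number of occupied slots below `slot` = index into the dense list.
        return bin(self.bitmap & ((1 << slot) - 1)).count("1")

    def set(self, slot: int, ref) -> None:
        self._check(slot)
        index = self._rank(slot)
        if (self.bitmap >> slot) & 1:
            self.refs[index] = ref         # overwrite an existing reference
        else:
            self.refs.insert(index, ref)   # keep the dense list ordered by slot
            self.bitmap |= 1 << slot

    def get(self, slot: int):
        self._check(slot)
        if not (self.bitmap >> slot) & 1:
            return None                    # empty slot costs one bit, not one reference
        return self.refs[self._rank(slot)]
```

With three occupied slots out of 64, the sketch stores three references and one integer bitmap instead of 64 mostly empty entries.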

Furthermore, I've built a path summary and several custom index structures (name/field indexes, path indexes, content-and-structure indexes, and value indexes). SirixDB stores these indexes in the leaf nodes of subtrees of the RevisionRootPage. Database pages that do not change between revisions are referenced rather than copied.
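Sharing unchanged pages between revisions can be sketched as path copying in a tree: a new revision copies only the pages on the path to the change and references everything else. This is a simplified in-memory illustration, not SirixDB's on-disk format; `Page` and `with_updated_leaf` are hypothetical names.

```python
class Page:
    """Immutable tree page: either an inner page with children or a leaf with data."""

    def __init__(self, children=(), data=None):
        self.children = tuple(children)
        self.data = data


def with_updated_leaf(root: Page, path, new_data) -> Page:
    """Return a new root whose leaf at `path` holds `new_data`.

    Only pages along `path` are copied; all other subtrees are shared
    by reference with the previous revision.
    """
    if not path:
        return Page(root.children, new_data)
    index = path[0]
    children = list(root.children)
    children[index] = with_updated_leaf(children[index], path[1:], new_data)
    return Page(children, root.data)


# Revision 1: two subtrees with one leaf each.
rev1 = Page([Page([Page(data="a")]), Page([Page(data="b")])])
# Revision 2 changes only the leaf under the first subtree.
rev2 = with_updated_leaf(rev1, (0, 0), "a2")
```

Here `rev2.children[1]` is the very same object as `rev1.children[1]`, while the first revision still sees its original leaf data.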

SirixDB allows the creation of several resources in a database (a collection of resources). It currently supports importing XML and JSON files and stores them in a huge persistent, durable tree. An UberPage is the root of the resource tree and thus the main entry point. Underneath, SirixDB indexes revisions in a trie. A RevisionRootPage is the entry point into a specific revision. A second file, the offset file, stores every revision timestamp together with the offset of that revision in the main log file, so that an in-memory binary search on the timestamps can locate the matching RevisionRootPage in the log file. SirixDB only ever appends data. A commit triggers a postorder traversal of the in-memory transaction log and writes page fragments to durable storage. The last inner-node level of each trie references at most a predefined number of page fragments per leaf data page. A page fragment consists of the nodes changed in the current revision plus the nodes that fall out of a sliding window. Thus, SirixDB does not simply copy a whole page on write (COW) but stores batched, fine-grained writes. In the future, modern byte-addressable, durable memory may be the best fit for reading the small page fragments in parallel with random access.
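The timestamp lookup described above can be sketched as a binary search over an in-memory copy of the offset file. The record layout (`RevisionEntry` with a millisecond timestamp and a byte offset), the function name, and the choice to return the latest revision committed at or before the requested time are assumptions for illustration, not SirixDB's actual file format or API.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence, Tuple


class RevisionEntry(NamedTuple):
    timestamp_ms: int  # commit time of the revision (assumed unit)
    offset: int        # byte offset of its RevisionRootPage in the log file


def find_revision(
    entries: Sequence[RevisionEntry], point_in_time_ms: int
) -> Optional[Tuple[int, RevisionEntry]]:
    """Return (index, entry) of the latest revision committed at or before the given time.

    `entries` must be sorted by timestamp, which holds for an append-only log.
    Returns None if the time lies before the first commit.
    """
    timestamps = [entry.timestamp_ms for entry in entries]
    index = bisect_right(timestamps, point_in_time_ms) - 1
    if index < 0:
        return None
    return index, entries[index]


entries = [
    RevisionEntry(1000, 0),
    RevisionEntry(2000, 4096),
    RevisionEntry(3000, 9000),
]
hit = find_revision(entries, 2500)
```

In this example `hit` refers to the second entry, whose offset would then be used to seek into the log file and read the RevisionRootPage.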

Since Hacktoberfest 2019, Moshe Uminer has worked mainly on REST API clients and a GUI web frontend for SirixDB and quickly became a core team member :-)

1 sponsor has funded sirixdb’s work.

It would mean the world to us if we had 10 sponsors. 💖
