A database made of sugar cubes
Sucredb is a multi-master, distributed key-value database. It provides Dynamo-style tunable consistency and causality tracking.
Any node that owns a partition (replicas) can serve both reads and writes. The database tracks causality using vector-clocks and will NOT drop any conflicting writes unlike LWW (last write wins) and other strategies. Conflicts can and do happen due to races between clients and network partitions.
Status: Alpha quality with missing pieces.
Theoretically you can use Sucredb with any Redis Cluster clients.
It implements a tiny subset of Redis commands. Only basic Key-Value/Sets/Hashes operations are supported at this point.
GET results are returned as an array containing the values (zero, one, or more if there are conflicting versions) plus the causal context. The context is a binary string and is always returned as the last item of the array, even if no values are present.
> GET key {consistency}
< [{value1}, {value2}, .., context]
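As a minimal client-side sketch, the reply can be split into its values and the trailing context as shown here. The helper name `split_get_reply` is hypothetical, and the sketch assumes the client library hands the reply back as a list of byte strings.

```python
def split_get_reply(reply):
    """Split a GET reply into (values, context).

    Assumes `reply` is a non-empty list whose last item is the causal
    context, as described above. The helper itself is hypothetical.
    """
    if not reply:
        raise ValueError("expected at least the context in the reply")
    *values, context = reply
    return values, context


# A key that was never written: no values, only a context.
values, context = split_get_reply([b"\x00" * 8])

# Two conflicting versions plus the context (placeholder bytes).
siblings, ctx = split_get_reply([b"1", b"2", b"opaque-context"])
```

The context should be treated as opaque and passed back unchanged on the next write.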
MGET takes the number of keys (N) followed by N keys. Results are returned as an array.
> MGET key_count {key1} {key2} {..} {consistency}
< [[{value1_1}, {value1_2}, .., context], [{value2_1}, {value2_2}, .., context]]
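The argument layout can be sketched from the client side as follows. Both helper names are hypothetical, and the sketch assumes that the per-key results come back in the same order as the requested keys.

```python
def mget_args(keys, consistency=None):
    """Build the MGET argument list: key count, the keys, optional consistency."""
    args = ["MGET", str(len(keys)), *keys]
    if consistency is not None:
        args.append(consistency)
    return args


def split_mget_reply(keys, reply):
    """Map each key to (values, context).

    Assumption: reply[i] belongs to keys[i], and each inner array ends
    with that key's causal context.
    """
    return {key: (list(item[:-1]), item[-1]) for key, item in zip(keys, reply)}


args = mget_args(["a", "b"], consistency="q")
parsed = split_mget_reply(["a", "b"], [[b"1", b"ctx-a"], [b"ctx-b"]])
```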
SET, in addition to the key and value, also takes the causal context. If you're sure the key doesn't exist you can omit the context; if you're wrong, it'll create a conflicting version.
> SET key value {context} {consistency}
< OK
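A typical read-resolve-write cycle might look like the sketch below. Here `send` is a hypothetical callable that sends one command and returns the reply (for instance a thin wrapper around a Redis client's raw command interface), and `merge` is application logic that picks or combines conflicting values. Neither is part of Sucredb.

```python
def resolve_and_set(send, key, merge):
    """Read a key, resolve any conflicting versions, and write the result back.

    `send` and `merge` are hypothetical: `send(*args)` issues one command,
    and `merge(values)` returns the single value to keep. `merge` must
    also cope with an empty list (key not found).
    """
    reply = send("GET", key)
    *values, context = reply          # the context is always the last item
    new_value = merge(values)
    # Passing the context back tells the server which versions this
    # write has seen and therefore supersedes.
    send("SET", key, new_value, context)
    return new_value


# Stub standing in for a real connection, for illustration only.
calls = []

def fake_send(*args):
    calls.append(args)
    if args[0] == "GET":
        return [b"1", b"2", b"ctx"]   # two conflicting versions + context
    return "OK"

winner = resolve_and_set(fake_send, "there", merge=max)
```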
GETSET is similar to SET, but returns the updated value(s) and a new context. Despite the name and the semantics in Redis, the get is always done after the set.
> GETSET key value context {consistency}
< [{value1}, {value2}, .., context]
DEL is like SET and also requires a context when dealing with basic values. Following the Redis API, DEL works for keys holding any data structure; in those cases the context is ignored (you can use an empty string instead).
> DEL key context {consistency}
< 1 OR 0 (if not found)
Sucredb also supports a tiny subset of commands for Hash and Set datatypes in addition to a dedicated Counter type. These types are CRDTs and don't require a context to be sent along with the operation. Mutations depend on the coordinator's version of the value, and conflicts are handled as follows:
- Hash: On value conflicts, the latest write wins.
- Set: On value conflicts, add wins.
- Counter: Deletes may erase unobserved increments.
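To make "add wins" concrete, here is a generic toy model of an add-wins (observed-remove) set. It illustrates the general CRDT idea only and is not Sucredb's internal representation; the class and its fields are hypothetical.

```python
import itertools


class AddWinsSet:
    """Toy add-wins set: a remove only cancels the adds it has observed."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self._counter = itertools.count(1)
        self.entries = set()      # (value, tag) pairs currently present
        self.tombstones = set()   # tags of adds that were removed

    def add(self, value):
        # Every add gets a unique tag so later removes can target it.
        tag = (self.replica_id, next(self._counter))
        self.entries.add((value, tag))

    def remove(self, value):
        observed = {tag for v, tag in self.entries if v == value}
        self.tombstones |= observed
        self.entries = {(v, t) for v, t in self.entries if t not in observed}

    def merge(self, other):
        self.tombstones |= other.tombstones
        combined = self.entries | other.entries
        self.entries = {(v, t) for v, t in combined if t not in self.tombstones}

    def members(self):
        return {v for v, _ in self.entries}


a, b = AddWinsSet("a"), AddWinsSet("b")
a.add("x")
b.merge(a)        # both replicas now contain "x"
a.remove("x")     # concurrently: replica a removes "x" ...
b.add("x")        # ... while replica b adds it again
a.merge(b)
b.merge(a)
```

After both merges the two replicas agree, and the concurrent add survives the remove because the remove never observed it.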
Returns the value for a counter or Nil if none is found.
> GET key {consistency}
< 1011
Sets the value for a counter.
> SET key int_value {consistency}
< OK
Increments the value for a counter, the delta can be either positive or negative.
> INCRBY key delta_value {consistency}
< resulting_int_value
Gets all key value pairs from a hash.
> HGETALL key {consistency}
< [{KA, VA}, {KB, VB}, ...]
Sets a key-value pair in a hash.
> HSET key hash_key value {consistency}
< 1 OR 0 (if hash_key already existed)
Deletes a key from a hash.
> HDEL key hash_key {consistency}
< 1 OR 0 (if hash_key didn't exist)
Gets all values from a set.
> SMEMBERS key {consistency}
< [{KA}, {KB}, ...]
Adds a value to the set.
> SADD key value {consistency}
< 1 OR 0 (if value already existed)
Removes a value from the set.
> SREM key value {consistency}
< 1 OR 0 (if value didn't exist)
todo
If you don't have a context (from a previous get or getset) you can send an empty string.
{consistency} follows the Dynamo/Cassandra/Riak style:
- 1, o, O: One
- q, Q: Quorum
- a, A: All
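As a rough sketch of what the consistency tokens imply, the function below maps a token to the number of replicas that would need to respond. The function name is hypothetical, and treating Quorum as a simple majority of the replication factor is the usual Dynamo-style definition, assumed here rather than taken from Sucredb's source.

```python
def required_acks(consistency, replication_factor):
    """Translate a consistency token into a required replica count.

    Assumption: Quorum means a simple majority of the replicas.
    """
    level = consistency.lower()
    if level in ("1", "o"):
        return 1
    if level == "q":
        return replication_factor // 2 + 1
    if level == "a":
        return replication_factor
    raise ValueError(f"unknown consistency level: {consistency!r}")


one = required_acks("O", 3)
quorum = required_acks("q", 3)
everyone = required_acks("A", 3)
```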
Requirements
- Needs a reasonably recent Rust (nightly[2])
- C++ compiler (for Rocksdb).
Running
- The following setup will use the default settings.
- Clone the repo and enter repository root
cargo install . [3]
sucredb --help
Single/First instance
sucredb -d datadir1 -l 127.0.0.1:6379 -f 127.0.0.1:16379 init
The command above will initialize a new cluster containing this node. The cluster will have the default name, partition count and replication factor.
Second instance
sucredb -d datadir2 -l 127.0.0.1:6378 -f 127.0.0.1:16378 -s 127.0.0.1:16379
The second instance joins the cluster using the first instance as a seed.
Quick test
redis-cli CLUSTER SLOTS
Quick example using redis-cli
➜ ~ redis-cli
127.0.0.1:6379> GET there
1) "\x00\x00\x00\x00\x00\x00\x00\x00"
127.0.0.1:6379> SET there 1 "\x00\x00\x00\x00\x00\x00\x00\x00"
OK
127.0.0.1:6379> GET there
1) "1"
2) "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x01\x00\x00\x00\x00\x00\x00\x00"
127.0.0.1:6379> SET there 2
OK
127.0.0.1:6379> GET there 1
1) "1"
2) "2"
3) "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x02\x00\x00\x00\x00\x00\x00\x00"
127.0.0.1:6379> SET there 3 "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x02\x00\x00\x00\x00\x00\x00\x00"
OK
127.0.0.1:6379> GET there
1) "3"
2) "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x03\x00\x00\x00\x00\x00\x00\x00"
127.0.0.1:6379> GETSET there 4 "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x03\x00\x00\x00\x00\x00\x00\x00"
1) "4"
2) "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x04\x00\x00\x00\x00\x00\x00\x00"
127.0.0.1:6379> DEL there "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x04\x00\x00\x00\x00\x00\x00\x00"
1
127.0.0.1:6379> GET there q
1) "\x01\x00\x00\x00\x00\x00\x00\x00P\xb0n\x83g\xef`\n\x05\x00\x00\x00\x00\x00\x00\x00"
See sucredb.yaml
To use a configuration file, run: sucredb -c sucredb.yaml
It behaves mostly like an AP system but not exactly.
Sucredb doesn't use sloppy quorum or hinted handoff so it can't serve requests that don't satisfy the requested/default consistency level.
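The difference can be sketched with a simplified model in which only the nodes that own a partition count toward the requested consistency level. The function names and node names are hypothetical, and the sloppy variant is reduced to its essence purely for contrast.

```python
def strict_quorum_ok(owners, reachable, required):
    """Strict quorum: only reachable owners of the partition count."""
    return len(set(owners) & set(reachable)) >= required


def sloppy_quorum_ok(reachable, required):
    """For contrast only: a sloppy quorum lets other reachable nodes stand in."""
    return len(set(reachable)) >= required


owners = ["n1", "n2", "n3"]        # replicas of the partition
reachable = ["n1", "n4", "n5"]     # n2 and n3 are cut off by a partition

strict = strict_quorum_ok(owners, reachable, required=2)
sloppy = sloppy_quorum_ok(reachable, required=2)
```

In this scenario a quorum request cannot be served under the strict rule, while a request needing only one replica still can.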
Almost every single new thing claims to be fast or blazing fast. Sucredb makes no claims at this point, but it's probably fast.
The data structure operations move the entire collection around the cluster so it's not suitable for large values/collections.
- Improve the data model with a range/clustering key.
Storage takes advantage of RocksDB.
It uses a variant of version clocks to track causality. The actual algorithm is heavily inspired by [1].
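For readers new to causality tracking, the sketch below compares two plain vector clocks. This is a deliberate simplification and not the server-wide scheme from [1]; the function and the node names are hypothetical.

```python
def compare(a, b):
    """Compare two vector clocks given as {node_id: counter} dicts.

    Returns "before", "after", "equal", or "concurrent".
    """
    nodes = a.keys() | b.keys()
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"   # neither write has seen the other: a conflict
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


base = {"n1": 1}
left = {"n1": 2}              # update coordinated by n1
right = {"n1": 1, "n2": 1}    # update coordinated by n2, unaware of `left`

relation = compare(left, right)
```

Writes whose clocks compare as concurrent are the ones kept side by side as conflicting versions instead of being silently overwritten.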
[1] Gonçalves, Ricardo, et al. "Concise server-wide causality management for eventually consistent data stores."
[2] Mostly due to the try_from and impl trait features that should be stable soon.
[3] Be patient.