Merge pull request #723 from nygard89/master
[riak] Added a workaround to allow strong-consistent scan transactions.
busbey committed May 3, 2016
2 parents 6834e6b + f593bad commit a372340
Showing 4 changed files with 186 additions and 93 deletions.
30 changes: 25 additions & 5 deletions riak/README.md
@@ -21,7 +21,7 @@ Riak KV Client for Yahoo! Cloud System Benchmark (YCSB)

The Riak KV YCSB client is designed to work with the Yahoo! Cloud System Benchmark (YCSB) project (https://github.com/brianfrankcooper/YCSB) to support performance testing for the 2.x.y line of the Riak KV database.

Creating a <i>bucket type</i> to use with YCSB
Creating a <i>bucket-type</i> to use with YCSB
----------------------------

Perform the following operations on your Riak cluster to configure it for the benchmarks.
@@ -31,8 +31,11 @@ Set the default backend for Riak to <i>LevelDB</i> in the `riak.conf` file of ev
```
storage_backend = leveldb
```
After this, create a bucket type named "ycsb"<sup id="a1">[1](#f1)</sup> by logging into one of the nodes in your cluster. You are then ready to set up the cluster to operate under either the strong or the eventual consistency model, as shown in the next two subsections.

Now, create a bucket type named "ycsb"<sup id="a1">[1](#f1)</sup> by logging into one of the nodes in your cluster. Then, to use the <i>strong consistency model</i><sup id="a2">[2](#f2)</sup> (default), you need to follow the next two steps.
### Strong consistency model

To use the <i>strong consistency model</i> (default), you need to follow the next two steps.

1. In every `riak.conf` file, search for the `##strong_consistency=on` line and uncomment it. It's important that you do this <b>before you start your cluster</b>!
2. Run the following `riak-admin` commands:
@@ -42,9 +45,24 @@ Now, create a bucket type named "ycsb"<sup id="a1">[1](#f1)</sup> by logging int
riak-admin bucket-type activate ycsb
```

Note that when using the strong consistency model, you **may have to specify the number of replicas to create for each object**. The *R* and *W* parameters (see next section) will in fact be ignored. The only information needed by this consistency model is how many nodes the system has to successfully query to consider a transaction completed. To set this parameter, you can add `"n_val":N` to the list of properties shown above (by default `N` is set to 3).
When using this model, you **may want to specify the number of replicas to create for each object**<sup id="a2">[2](#f2)</sup>: the *R* and *W* parameters (see next section) will in fact be ignored. The only information needed by this consistency model is how many nodes the system has to successfully query to consider a transaction completed. To set this parameter, you can add `"n_val":N` to the list of properties shown above (by default `N` is set to 3).
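
As a rough sketch of the `n_val` setting described above, the snippet below builds a properties string with a custom replica count. The value `5` is an arbitrary illustration, and the property list is an assumption: it contains only `"consistent":true` (the Riak property that marks a bucket type as strongly consistent) plus `n_val`, so merge it with whatever properties your own create command already uses.

```shell
# Hypothetical example: N_VAL=5 is an arbitrary choice, not a recommendation.
N_VAL=5

# Build the JSON properties. "consistent":true is assumed to be the property
# that makes the bucket type strongly consistent; add your other props as needed.
PROPS="{\"props\":{\"consistent\":true,\"n_val\":${N_VAL}}}"
echo "${PROPS}"

# Then, on one of the cluster nodes (not executed here):
#   riak-admin bucket-type create ycsb "${PROPS}"
#   riak-admin bucket-type activate ycsb
```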

#### A note on the scan transactions
Currently, `scan` transactions are not _directly_ supported, as there is no suitable means to perform them properly. This will not cause the benchmark to fail; it simply won't perform any scan transactions (these will immediately return with a `Status.NOT_IMPLEMENTED` code).

However, a possible workaround has been provided. Since Riak doesn't allow strongly consistent bucket-types to use secondary indexes, we can create an eventually consistent one just to store (*key*, *2i indexes*) pairs. A scan first queries this bucket-type to obtain the keys under which the objects are stored, then uses those keys to retrieve the actual objects from the strongly consistent bucket. If you want to use this workaround, you have to create and activate a "_fake bucket-type_" using the following commands:
```
riak-admin bucket-type create fakeBucketType '{"props":{"allow_mult":"false","n_val":1,"dvv_enabled":false,"last_write_wins":true}}'
riak-admin bucket-type activate fakeBucketType
```
A bucket-type defined this way isn't allowed to _create siblings_ (`"allow_mult":"false"`), has just _one replica_ (`"n_val":1`) which stores the _last value provided_ (`"last_write_wins":true`), and uses _vector clocks_ instead of _dotted version vectors_ (`"dvv_enabled":false`). Note that setting `"n_val":1` means that `scan` transactions won't be very *fault-tolerant*: if a node fails, many of them could fail too. You may increase this value, but doing so will load the cluster with more work, so the choice is yours to make.
Then you have to set the `riak.strong_consistent_scans_bucket_type` property (see next section) to the name you gave to the aforementioned "fake bucket-type" (e.g. `fakeBucketType` in this case).

Please note that this workaround involves a **double store operation for each insert transaction**: one to store the actual object and another to save the corresponding 2i index. In practice, the client won't notice any difference, as the latter operation is performed asynchronously. However, the cluster will carry a heavier load, which is why the proposed "fake bucket-type" is kept as lightweight as possible.
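
The idea behind the workaround can be sketched with a toy model that involves no Riak at all. In the snippet below, a directory of files stands in for the strongly consistent bucket and a plain key list stands in for the "fake" index bucket. All names (`insert`, `scan`, `STORE_DIR`, `INDEX`) are hypothetical, and the second store is done synchronously here, whereas the client is described above as doing it asynchronously.

```shell
# Toy stand-in, no Riak involved: files play the role of the two bucket types.
STORE_DIR="$(mktemp -d)"
mkdir "${STORE_DIR}/strong"        # stands in for the strongly consistent bucket
INDEX="${STORE_DIR}/index_keys"    # stands in for the "fake" index bucket
: > "${INDEX}"

insert() {  # insert KEY VALUE: the double store described above
  printf '%s' "$2" > "${STORE_DIR}/strong/$1"   # 1) store the actual object
  printf '%s\n' "$1" >> "${INDEX}"              # 2) record the key for range lookups
}

scan() {  # scan START_KEY COUNT: resolve keys via the index, then read the objects
  LC_ALL=C sort "${INDEX}" | LC_ALL=C awk -v start="$1" '$0 >= start' | head -n "$2" |
  while IFS= read -r key; do
    printf '%s=%s\n' "${key}" "$(cat "${STORE_DIR}/strong/${key}")"
  done
}

insert user3 c
insert user1 a
insert user2 b

# Ask for up to 2 records starting at key "user2".
SCAN_RESULT="$(scan user2 2)"
echo "${SCAN_RESULT}"
```

The point of the sketch is only the two-step read path: the range query never touches the object store directly, it goes through the key list first.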

If instead you want to use the <i>eventual consistency model</i> implemented in Riak, then type:
### Eventual consistency model

If you want to use the <i>eventual consistency model</i> implemented in Riak, you just have to type:
```
riak-admin bucket-type create ycsb '{"props":{"allow_mult":"false"}}'
riak-admin bucket-type activate ycsb
@@ -63,10 +81,12 @@ You can either specify these configuration parameters via command line or set th
* `riak.wait_time_before_retry` - <b>int</b>, the time (in milliseconds) before the client attempts to perform another read if the previous one failed.
* `riak.transaction_time_limit` - <b>int</b>, the time (in seconds) the client waits before aborting the current transaction.
* `riak.strong_consistency` - <b>boolean</b>, indicates whether to use *strong consistency* (true) or *eventual consistency* (false).
* `riak.strong_consistent_scans_bucket_type` - **string**, indicates the bucket-type to use to allow scan transactions when using strong consistency mode.
* `riak.debug` - <b>boolean</b>, enables debug mode. This displays all the properties (specified or defaults) when a benchmark is started. Moreover, it shows error causes whenever these occur.
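
For instance, the scan-related properties could be passed on the command line roughly as follows. This is a sketch under assumptions: it presumes you run from the YCSB root directory, that the binding is registered under the name `riak`, and that `workloads/workloade` is the scan-oriented workload you want. The `fakeBucketType` name is the example bucket-type created earlier.

```shell
# Assumed layout: run from the YCSB root; binding name "riak" and workloade are assumptions.
YCSB_ARGS="-P workloads/workloade \
-p riak.strong_consistency=true \
-p riak.strong_consistent_scans_bucket_type=fakeBucketType \
-p riak.debug=true"

# Print the command that would be run; remove the echo to actually execute it.
echo "bin/ycsb run riak ${YCSB_ARGS}"
```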

<b>Note</b>: For more information on workloads and how to run them please see: https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload

<b id="f1">1</b> As specified in the `riak.properties` file. See the parameter configuration section for further info. [↩](#a1)

<b id="f2">2</b> <b>IMPORTANT NOTE:</b> Currently the `scan` transactions are <b>NOT SUPPORTED</b> for the benchmarks which use the strong consistency model! However this will not cause the benchmark to fail, it simply won't perform any scan transaction at all. These latter will immediately return with a `Status.NOT_IMPLEMENTED` code. [](#a2)
<b id="f2">2</b> More info about properly setting up a fault-tolerant cluster can be found at http://docs.basho.com/riak/kv/2.1.4/configuring/strong-consistency/#enabling-strong-consistency.[↩](#a2)
