[QUESTION]: <Snapshot Isolation Testing> #8952
Comments
Hi @seedoilz, could you elaborate on how you identified that snapshot isolation is violated?
It is a timestamp-based check. All transactions are sorted in ascending order by commit timestamp. The checker then iterates through each transaction and, based on its start timestamp, determines which committed transactions it should have read from, and checks that this is consistent with the data it actually read. It also checks whether concurrent transactions have write conflicts.
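A minimal sketch of the check described above, to make the two rules concrete. It is not the checker's actual code: the `Txn` record and the `check_snapshot_isolation` function are hypothetical names, and it assumes every transaction has a start timestamp, a commit timestamp, the values it read before doing its own writes, and the values it wrote. It also assumes a key that was never written reads as `None`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Txn:
    """Hypothetical history record for one committed transaction."""
    txn_id: str
    start_ts: int
    commit_ts: int
    # key -> value observed (assumed to be read before the txn's own writes)
    reads: Dict[str, Optional[int]] = field(default_factory=dict)
    # key -> value written
    writes: Dict[str, int] = field(default_factory=dict)


def check_snapshot_isolation(txns: List[Txn]) -> List[str]:
    """Return human-readable violations; an empty list means none were found."""
    violations: List[str] = []
    by_commit = sorted(txns, key=lambda t: t.commit_ts)

    # Rule 1: each read must match the snapshot taken at the reader's start_ts,
    # i.e. the latest write among transactions committed before that point.
    for txn in by_commit:
        snapshot: Dict[str, int] = {}
        for other in by_commit:
            if other.commit_ts >= txn.start_ts:
                break  # sorted by commit_ts, so nothing later is visible
            snapshot.update(other.writes)
        for key, seen in txn.reads.items():
            expected = snapshot.get(key)
            if seen != expected:
                violations.append(
                    f"{txn.txn_id}: read {key}={seen}, expected {expected}"
                )

    # Rule 2: two concurrent transactions must not write the same key.
    # `a` commits no later than `b`, so they overlap if `b` started
    # before `a` committed.
    for i, a in enumerate(by_commit):
        for b in by_commit[i + 1:]:
            if b.start_ts < a.commit_ts:
                shared = sorted(a.writes.keys() & b.writes.keys())
                if shared:
                    violations.append(
                        f"{a.txn_id} and {b.txn_id}: concurrent writes to {shared}"
                    )
    return violations
```

This version is quadratic in the number of transactions, which is fine for illustration but would be too slow for a history of 120000 transactions; a real checker would maintain the snapshot incrementally while sweeping over the timestamps.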
If you could send across a way to reproduce the issue, we would be happy to look into it. We saw something like this before in #8146, but later found out that the issue was in the application code.
Download the [json file](https://box.nju.edu.cn/f/64a49141b4e44368bf41/) and clone this [code](https://github.com/Tsunaou/dbcdc-runner). Then set up the environment, including Leiningen and Java (what Jepsen needs). After that, run the following command in the root directory of the repository, replacing ${dbcop-workload-path} with the path to that json file:

    lein run test-all -w rw \
      --txn-num 120000 \
      --time-limit 43200 \
      -r 10000 \
      --node dummy-node \
      --isolation snapshot-isolation \
      --expected-consistency-model snapshot-isolation \
      --nemesis none \
      --existing-postgres \
      --no-ssh \
      --database dgraph \
      --dbcop-workload-path ${dbcop-workload-path} \
      --dbcop-workload
Sorry, this compose file is actually not the one I used. I accidentally deleted my own compose file, but it was based on the one I gave you, so I don't think it is a big deal.
This compose file is the one I was using. You can change it a bit (the node names) and use it.
@mangalaman93 Hello, could you help me?
Sorry about the delay. The compose file still doesn't look right, because the same volume is mounted in all the alphas.
After removing the volume, I get this null pointer exception. Am I running it right?
If you mount the same volume inside all the zeros and alphas, they will end up using the same p directory, which is a problem. And why is the test trying to read a file that dgraph has written? I'm not sure how this compose file is working for you. Am I missing something?
I know why you got the null pointer exception. I forgot to tell you that you need to put this file in dbcdc-runner/resources/
It is running now, but I do not see any new predicate in the cluster. Is the code writing data into dgraph?
Since we cannot access dgraph at 127.0.0.1, we use the public IP address to operate on the database.
It is still somehow hitting the 175.27.241.31 IP even after I changed it everywhere and ran lein clean.
Maybe you forgot to change the IP address in the .edn file that I gave you recently.
That was it. I do see 1000 values for
I am able to do a complete run of the test now, though it fails in the analysis step due to limited memory on my laptop. I am thinking of running it on a bigger machine, but before that, could you share the results of the run from which you concluded that it failed? And how did you reach that conclusion?
We concluded that by using this jar file. In addition, what I quoted is how we test snapshot isolation. The code is here: https://github.com/FertileFragrance/TimeKiller
Question.
Snapshot Isolation Bug
Environment
There are 3 nodes in the cluster. The cluster is set up on three servers, following the documentation on the official website.
Docker version: 20.10.21
Dgraph version: 23.1.0 (latest)
docker-compose.yml
How to send request
Code
How to generate the test cases
We use dbcop to generate test cases to do tests on Dgraph.
Bug
When we use our own algorithm to test snapshot isolation, we find that most of the transactions violate it.
I believe our algorithm itself is correct.
So my question is: which of the steps I have taken is wrong? Thank you in advance.