

Avoid deadlock on write transaction #942

Open
goshaQ opened this issue Feb 6, 2020 · 1 comment


goshaQ commented Feb 6, 2020

I've noticed that writing a relationship table (here) containing tens of millions of rows with large batches (the default value in the configuration, 100000) generates a lot of warnings like this:

00:26:41 WARN RetryLogic: Transaction failed and will be retried in 1166ms
org.neo4j.driver.exceptions.TransientException: LockClient[x] can't wait on
resource RWLock[NODE(x), hash=x] since => LockClient[x] <-[:HELD_BY]- 
RWLock[NODE(x), hash=x] <-[:WAITING_FOR]- LockClient[x] <-[:HELD_BY]-
RWLock[NODE(x), hash=x]

I wonder whether this can be minimized by re-partitioning the Spark DataFrame. Given how many transactions end up being retried, is it actually worth writing relationships in parallel at all?
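
For illustration, a minimal sketch of the reduced-concurrency option on the Spark side, assuming a plain relationship DataFrame; the write call itself depends on the connector in use and is not shown:

import org.apache.spark.sql.DataFrame

// Each partition is written by its own task, i.e. its own stream of
// write transactions against Neo4j. Shrinking the partition count
// therefore caps the number of concurrent write transactions.
def throttleWrites(rels: DataFrame, parallelism: Int): DataFrame =
  rels.coalesce(parallelism) // coalesce(1) makes the write fully serial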

s1ck (Contributor) commented Jun 15, 2020

What you see there is essentially back-pressure from Neo4j, which cannot handle that many concurrent relationship writes. Creating a relationship takes locks on both of its endpoint nodes, so concurrent transactions that touch overlapping sets of nodes can deadlock and get retried. One option would be to reduce concurrency; another would be to re-partition the relationship DF on (source, target). However, depending on the degree distribution, this could still leave multiple partitions / threads writing relationships that attach to the same nodes, though probably less often than with purely random partitioning.
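
A minimal sketch of the re-partitioning idea, assuming hypothetical node id columns named source and target on the relationship DataFrame:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hash-partition on the endpoint pair so that all relationships with
// the same (source, target) land in the same partition and are written
// by a single task. Relationships that merely share one endpoint can
// still fall into different partitions, which is why a skewed degree
// distribution can still produce lock contention.
def partitionByEndpoints(rels: DataFrame, numPartitions: Int): DataFrame =
  rels.repartition(numPartitions, col("source"), col("target"))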
