-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle transitions differently for MySQL #399
Conversation
StatesMan will update the old transitions to `most_recent: nil` and then insert the new transition as `most_recent: true`. However that can cause deadlocks on MySQL when running concurrent state transitions. On the update, MySQL will hold a gap lock to the unique indexes there are, so other transactions cannot insert. After two updates on the most_recent and two inserts, the second insert will fail with a deadlock. For more explanation see the PR description that includes this commit.
#350 @arthurnn opened #350 to reduce the impact of Statesman taking gap locks when using MySQL. The crux of the issue is that MySQL's implementation of REPEATABLE READ can take wide locks when an update touches no rows, which happens frequently on the first transition of Statesman. By first creating the new transition, we can avoid issuing an update that will take the large gap lock. This order of queries meant we added an additional query to the transition step which could impact people who rely on low-latency Statesman transitions. This commit is another take on the same approach that collapses two queries into one, by taking the update of the old and new transition's most_recent column and updating them together. It's slightly janky but if robust, would be a good alternative to avoid additional latency.
This commit first refactors the construction of transition SQL statements to use Arel. This is safer and hopefully more readable than constructing large SQL statements using strings. It also fixes a bug with transition updates where MySQL would throw index violations. This was caused by MySQL validating index constraints across a partially applied update, where Statesman would set the latest transition's most_recent = 't' before unsetting the previous.
rails/rails@7508284 When merged with Rails core, Arel removed an initialisation parameter that was unused in all but a few of the Arel features. For now, provide a helper local to the ActiveRecord adapter that can construct Manager classes independent of the API change.
To avoid a gap lock with MySQL, first insert the new transition with most_recent=false and then flip the values for the latest two transitions in order.
f3ef679
to
eedc7a7
Compare
end | ||
|
||
def db_mysql? | ||
::ActiveRecord::Base.connection.adapter_name.downcase.starts_with?("mysql") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way we could avoid special casing this for mysql?
If that's not possible could we make this a configuration option instead? Selfish reasoning: we use an internal adapter that is not called mysql
. I can monkey patch this method internally, but if there is any way to avoid it that would be sweet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a much better approach actually, makes this change opt-in which is great. Will try find time later today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the not making it a special case, I did try implementing a platform agnostic strategy but the requirements of Postgres and MySQL were too difficult to get them to marry. Where MySQL could do X, Postgres would suffer, and vise versa.
Agree with what you're saying though, providing a configuration option for this would be best. We can even default it to true if you're MySQL, while giving you everything you need.
Thanks for the prompt!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that sounds great, and the explanation makes sense 👍
Scope creep: at this point, should we consider creating separate Statesman::Adapter
s (one for postgres, one for mysql) for the different AR adapters? That way we can avoid some of the branching that we see creeping in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be a good idea, though if we're allowing this to be configurable, I guess you'd want something more like a mixin. adapter.include(Statesman::Adapters::MySQLGapLock
or something.
That said, let's leave that until after we've got this in. I'm conscious of how long this has been hanging, very keen to deliver the value then clean-up!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Managed to get this passing this morning, took way longer than it should have as you can see with all the force pushes 🙈 lemme know what you think. I think config may need a small refactor next time we add one of these configuration options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great, thanks so much 🎉 ❤️
9283d76
to
0fd1ce8
Compare
lib/statesman/config.rb
Outdated
|
||
def initialize(block = nil) | ||
@gaplock_protection_enabled = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've mixed calling this gaplock_protection and mysql_gaplock_protection. Can we change things so we have one uniform reference, preferably mysql_gaplock_protection
in all places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should set this parameter to be true
when the adapter looks like mysql. Otherwise all the mysql statesman users end up with a crucial performance optimisation not enabled, which sucks.
Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts on this were that we don't want people with postgres enabling this, so the API from the config perspective should make that clear. We do something similar in storage_adapter
. No strong opinions though, did that make sense?
I agree we should enable this by default, I can't remember why I didn't.
Statesman.configure do | ||
storage_adapter(Statesman::Adapters::ActiveRecord) | ||
# These ENV vars are only set on the MySQL builds | ||
mysql_gaplock_protection ENV.fetch("GAPLOCK_PROTECTION", false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could avoid this if we default to true when we detect mysql
0fd1ce8
to
74a2aa3
Compare
74a2aa3
to
2aa0c22
Compare
899c02f
to
03fa2c4
Compare
03fa2c4
to
55be707
Compare
55be707
to
23f8819
Compare
23f8819
to
c4e1893
Compare
c4e1893
to
ad7eb4d
Compare
ad7eb4d
to
532609c
Compare
532609c
to
99f2976
Compare
99f2976
to
4b30bc6
Compare
b996a1f
to
e1c2c87
Compare
hey @jurre sorry for the further delays on this, still struggling to find the time so I thought I'd send an update haha. Seems the only remaining failures are with Rails master and postgres, some gem compatibility issues. Hoping it doens't take too long, but apologies again and thanks for your patience 🙏 |
This reflects a recent change in Rails (rails/rails@592358e)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks ready to go. Let's get this merged to master, then we (GC) can upgrade our apps to point at the master version before we cut a new release, just to make sure everything looks rosy.
Thanks @lawrencejones and @thom-oman! |
🎉 thanks so much @lawrencejones, @thom-oman, @arthurnn and @greysteil |
Prior to 7.2.0 the code to unset the most_recent column of old transitions when a new transition was created relied on transitions_for_parent, which would build the correct query in the case that the transition class was using single table inheritance. However, after #399, the code no longer uses the parent association to create the query and the current custom query lacks this functionality that existed prior to 7.2.0. This PR addresses this issue by adding an AND clause to the query to filter by transition type if the transition class contains an inheritance column.
* Inverse the order of writes on statesman transitions StatesMan will update the old transitions to `most_recent: nil` and then insert the new transition as `most_recent: true`. However that can cause deadlocks on MySQL when running concurrent state transitions. On the update, MySQL will hold a gap lock to the unique indexes there are, so other transactions cannot insert. After two updates on the most_recent and two inserts, the second insert will fail with a deadlock. For more explanation see the PR description that includes this commit. * Make sure the initial most_recent state is false/nil * Lock-lesser transitions without additional queries gocardless#350 @arthurnn opened gocardless#350 to reduce the impact of Statesman taking gap locks when using MySQL. The crux of the issue is that MySQL's implementation of REPEATABLE READ can take wide locks when an update touches no rows, which happens frequently on the first transition of Statesman. By first creating the new transition, we can avoid issuing an update that will take the large gap lock. This order of queries meant we added an additional query to the transition step which could impact people who rely on low-latency Statesman transitions. This commit is another take on the same approach that collapses two queries into one, by taking the update of the old and new transition's most_recent column and updating them together. It's slightly janky but if robust, would be a good alternative to avoid additional latency. * fmt * Replace Arel#or with raw SQL * Attempt to fix typecasting * Fix MySQL issue with ordered updates This commit first refactors the construction of transition SQL statements to use Arel. This is safer and hopefully more readable than constructing large SQL statements using strings. It also fixes a bug with transition updates where MySQL would throw index violations. This was caused by MySQL validating index constraints across a partially applied update, where Statesman would set the latest transition's most_recent = 't' before unsetting the previous. * Smooth-over breaking API change in Arel rails/rails@7508284 When merged with Rails core, Arel removed an initialisation parameter that was unused in all but a few of the Arel features. For now, provide a helper local to the ActiveRecord adapter that can construct Manager classes independent of the API change. * Handle transitions differently for MySQL To avoid a gap lock with MySQL, first insert the new transition with most_recent=false and then flip the values for the latest two transitions in order. * MySQL specific gaplock protection is configurable Instead of depending on the name of the ActiveRecord adapter to identify if the database is MySQL, gaplock protection is configurable, as suggested by @jurre on gocardless#399 * Loosen dependency on pg gem This reflects a recent change in Rails (rails/rails@592358e) Co-authored-by: Arthur Neves <[email protected]> Co-authored-by: Lawrence Jones <[email protected]> Co-authored-by: Grey Baker <[email protected]>
Prior to 7.2.0 the code to unset the most_recent column of old transitions when a new transition was created relied on transitions_for_parent, which would build the correct query in the case that the transition class was using single table inheritance. However, after gocardless#399, the code no longer uses the parent association to create the query and the current custom query lacks this functionality that existed prior to 7.2.0. This PR addresses this issue by adding an AND clause to the query to filter by transition type if the transition class contains an inheritance column.
To avoid a gap lock with MySQL, first insert the new transition with most_recent=false and then flip the values for the latest two transitions in order.
Includes #350 and #353