
feat(transaction): Use single hop in squashing when possible #2376

Open

wants to merge 3 commits into base: main
Conversation

dranikpg
Contributor

@dranikpg dranikpg commented Jan 4, 2024

Idea:

If we determine that squashed commands form only a single batch, we can use ScheduleSingleHop() for that batch, which:

  • Reduces hops from 3 to 2 (no unlock)
  • Enables "quick" runs for single shard cases
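The dispatch decision described above can be sketched as a small toy model. All names here are hypothetical stand-ins, not Dragonfly's actual transaction API:

```cpp
#include <cassert>

// Toy model of the optimization: if every squashed command fits into one
// batch on an atomic, not-yet-scheduled transaction, dispatch it as a
// single hop; otherwise fall back to the regular multi flow with separate
// schedule, execute, and unlock hops.
enum class Dispatch { kSingleHop, kMultiHop };

// Hypothetical helper; the real code checks these conditions inline.
Dispatch ChooseDispatch(bool is_atomic, bool already_scheduled,
                        unsigned pending_batches) {
  if (is_atomic && !already_scheduled && pending_batches == 1)
    return Dispatch::kSingleHop;
  return Dispatch::kMultiHop;
}
```

Under this model, a transaction that was already scheduled, or whose commands span more than one batch, always takes the multi-hop path.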

Without contended keys it gives no improvement 😞, but with contended keys it's potentially up to two times faster 🙂

@dranikpg dranikpg force-pushed the tx-opt branch 2 times, most recently from d3374b9 to e704c04 Compare January 4, 2024 18:53
@dranikpg dranikpg requested a review from chakaz January 5, 2024 05:50
@romange romange self-requested a review January 5, 2024 06:37
// If all commands fit into a single batch, run them as a real single hop without multi overhead.
// Doesn't work with replication, so disallow if inline scheduling is not allowed.
bool singlehop_possible = IsAtomic() && !tx->IsScheduled() && cmds_.empty();
if (singlehop_possible && ServerState::tlocal()->AllowInlineScheduling()) {
Contributor Author

This is not safe, obviously... making this work with replication is a little cumbersome.

@dranikpg
Contributor Author

dranikpg commented Jan 7, 2024

There are two options for running a single hop: either keep the base tx multi or not. With single-shard EVAL we decided to reduce it to a regular transaction, which could even have been done more easily by just removing the multi flag after proper initialization.

With MULTI/EXEC we have to handle replication, mainly sending the closing EXEC journal message with shard count, etc. If we decide to stick to non-multi single hops, we have to emulate it ourselves.

Instead, I changed the logic so that if Schedule/ScheduleSingleHop is called on a multi transaction that hasn't been scheduled yet, it schedules according to the rules of Schedule or SSH. What's more, if we call SSH, we set the concluding bit, so UnlockMulti's job is done when execution concludes, reducing the number of hops from 3 to 2 for multi-shard cases.
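The hop arithmetic mentioned above (3 down to 2) can be made concrete with a toy counter; the function name is hypothetical and only models the accounting, not the real scheduling code:

```cpp
#include <cassert>

// Toy hop counter for the two strategies discussed above.
// Classic multi flow: scheduling hop + execution hop + separate unlock
// hop (UnlockMulti) = 3. With the concluding bit set, the unlock work is
// folded into the execution hop, so only 2 hops remain.
unsigned HopCount(bool concluding_bit_set) {
  unsigned hops = 1;  // execution hop, always needed
  hops += 1;          // scheduling hop
  if (!concluding_bit_set)
    hops += 1;        // separate unlock hop
  return hops;
}
```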

Collaborator

@chakaz chakaz left a comment

Sorry, a few clarification questions :)

@@ -209,7 +209,8 @@ class Transaction {
void StartMultiGlobal(DbIndex dbid);

// Start multi in LOCK_AHEAD mode with given keys.
void StartMultiLockedAhead(DbIndex dbid, CmdArgList keys);
// Scheduling can be optionally disabled to allow more fine-grained control.
void StartMultiLockedAhead(DbIndex dbid, CmdArgList keys, bool skip_scheduling = false);
Collaborator

I'd use an enum instead of a boolean argument here. kSkipScheduling at the call site is more readable and less error prone (for example, seeing a bare true, I'd assume it means schedule rather than skip_scheduling).
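The pattern being suggested can be sketched like this; the enum and function below are illustrative, not the actual Dragonfly signature:

```cpp
#include <cassert>

// An enum makes the call site self-documenting compared to a bare bool:
//   StartMultiLockedAhead(dbid, Scheduling::kSkip)
// reads unambiguously, whereas a literal `true` could mean either
// "schedule now" or "skip scheduling" to a reader.
enum class Scheduling { kSchedule, kSkip };

// Hypothetical stand-in for the real member function; returns whether
// scheduling actually happened.
bool StartMultiLockedAhead(int /*dbid*/, Scheduling mode) {
  bool scheduled = (mode == Scheduling::kSchedule);
  // ... lock-ahead setup would go here ...
  return scheduled;
}
```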

@@ -1721,7 +1721,7 @@ optional<bool> StartMultiEval(DbIndex dbid, CmdArgList keys, ScriptMgr::ScriptPa
trans->StartMultiGlobal(dbid);
return true;
case Transaction::LOCK_AHEAD:
trans->StartMultiLockedAhead(dbid, keys);
trans->StartMultiLockedAhead(dbid, keys, true);
return true;
Collaborator

You're returning true here to indicate that the transaction was scheduled, but you pass true to StartMultiLockedAhead() to skip scheduling, is this intentional?

Contributor Author

The function as a whole determines whether the script needs to schedule at all (it doesn't if no keys are present) and schedules it. What I care about is whether it needs to schedule; whether it has already scheduled can be looked up on the transaction itself.

// If script runs on a single shard, we run it remotely to save hops
if (!tx->IsScheduled() && tx->GetMultiMode() == Transaction::LOCK_AHEAD &&
tx->GetUniqueShardCnt() == 1) {
DCHECK(*scheduled); // because tx multi mode is lock ahead
Collaborator

I don't follow the meaning of *scheduled == true while !tx->IsScheduled() - what does it mean?

Contributor Author

It means that we need to schedule but didn't yet; I will update the names.

if (*multi_mode != Transaction::NOT_DETERMINED) {
StartMultiExec(cntx->db_index(), cntx->transaction, &exec_info, *multi_mode);
StartMultiExec(cntx->db_index(), cntx->transaction, &exec_info, *multi_mode, delay_scheduling);
scheduled = true;
Collaborator

This might depend on the value of delay_scheduling, no?
I.e., shouldn't you do something like scheduled = StartMultiExec(...);?


auto check_cb = [this](ShardId sid) { return !sharded_[sid].cmds.empty(); };
tx->PrepareSquashedMultiHop(base_cid_, check_cb);
tx->ScheduleSingleHop(run_cb);
Collaborator

I'm confused - how can we do ScheduleSingleHop here if we determined that a single hop is not possible?

Contributor Author

Because we scheduled above, this is no longer a single hop for the whole transaction, but rather just a multi call.

Collaborator

Worth adding a comment on why SSH here will perform multiple hops.


if (multi_->role != SQUASHED_STUB) // stub transactions don't migrate between threads
// stub transactions don't migrate between threads, so keep it's index cached
Collaborator

Suggested change
// stub transactions don't migrate between threads, so keep it's index cached
// stub transactions don't migrate between threads, so keep its index cached

@dranikpg dranikpg force-pushed the tx-opt branch 2 times, most recently from 86794b0 to d160489 Compare January 8, 2024 10:38
@dranikpg dranikpg marked this pull request as ready for review January 8, 2024 11:06
@dranikpg
Contributor Author

dranikpg commented Jan 8, 2024

Still needs some polish; I also need to check the replica side in more detail.

args.async = false;
CallFromScript(cntx, args);
});
bool embedded = tx->GetMultiMode() != Transaction::NOT_DETERMINED;
Collaborator

rename embedded to was_undetermined and reverse the condition here and below

Contributor Author

was_undetermined doesn't explain why it should be so; whether it's embedded or not is what we care about, and we check that by whether the multi mode is determined.

Collaborator

@romange romange left a comment

  1. It's quite a large change with lots of conditions. Do you think it is possible to split it into smaller ones?
  2. I do not see new tests besides that small unit test. We know for sure that we have a memory leak during replication. If this PR fixes it, can you add a (py)test reproducing the slave leak and showing that it's been fixed here?

@dranikpg
Contributor Author

dranikpg commented Feb 3, 2024

It's quite a large change with lots of conditions. Do you think it is possible to split it into smaller ones?

I can split the core logic and then its uses (eval, exec and squasher), but there is no way to test each part without the full changes. Besides, they're all in separate places, so it's not the depth of the changes but the range; each part is independent.

I do not see new tests besides that small unit test. We know for sure that we have a memory leak during replication. If this PR fixes it, can you add a (py)test reproducing the slave leak and showing that it's been fixed here?

"Memory leak"? Our current "fix" disables multi mode, so we just don't send the replication report. I will add a test.


// If all commands fit into a single batch, it can be run as a single hop, without separate hops for
// locking and unlocking
TEST_F(MultiTest, MultiSingleHop) {
auto fb0 = pp_->at(0)->LaunchFiber([&] {
for (unsigned i = 0; i < 100; i++) {
Run({"multi"});
Collaborator

I would add another call to Run({"rpush", "a", "bar"}); inside the multi transaction here, and check the size of a after execution to verify that both commands are executed.

if (slot_id.has_value()) {
unique_slot_checker_.Add(*slot_id);
}
MultiUpdateWithParent(parent);
Collaborator

unused param slot_id passed to constructor

@dranikpg
Contributor Author

dranikpg commented Mar 8, 2024

I wanted to simplify journaling so that stub transactions report directly to the main transaction; this way we wouldn't need ReportWritesSquashedMulti and FIX_ConcludeJournalExec at all. But with a single hop, we can't know the total number of journal entries when the first shard concludes 😞

@romange
Collaborator

romange commented Apr 3, 2024

consider rebasing

@dranikpg
Contributor Author

dranikpg commented Apr 3, 2024

Will do. This PR will become much simpler, because we no longer need to use ScheduleSingleHop now that it's not special anymore.
