Fix unassigned shard allocation for batch mode #13748

Merged

Conversation

@SwethaGuptha SwethaGuptha commented May 20, 2024

Description

This change fixes issues with unassigned shard allocation in batch mode. Changes in this PR:

  1. Evaluate and execute the allocation decision together for each unassigned shard in a batch, so that the cluster state is updated before a decision is made for the next shard.
  2. To correctly handle shards with multiple unassigned replicas, evaluate and execute decisions for all unassigned replica copies of a shard whenever that shard is part of the batch being allocated. This is required because a batch contains only one entry per shard ID.
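
The two changes above can be illustrated with a toy model. This is only a sketch under stated assumptions: a single node with a cap on concurrent recoveries stands in for the cluster state, and every name in it (`ShardCopy`, `ToyState`, `decideAllThenExecute`, `decideAndExecutePerShard`) is hypothetical and not taken from the OpenSearch code base. It contrasts evaluating all decisions before executing any of them with deciding and executing one shard copy at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only. Every name here is hypothetical; none of this is OpenSearch code.
final class BatchAllocationSketch {

    /** One unassigned shard copy; several replica copies can share a shard id. */
    static final class ShardCopy {
        final String shardId;
        final int copy;

        ShardCopy(String shardId, int copy) {
            this.shardId = shardId;
            this.copy = copy;
        }
    }

    /** Toy stand-in for cluster state: one node with a cap on concurrent recoveries. */
    static final class ToyState {
        final int maxConcurrentRecoveries;
        int activeRecoveries = 0;
        final List<ShardCopy> throttled = new ArrayList<>();

        ToyState(int maxConcurrentRecoveries) {
            this.maxConcurrentRecoveries = maxConcurrentRecoveries;
        }

        boolean canStartRecovery() {
            return activeRecoveries < maxConcurrentRecoveries;
        }
    }

    /** Evaluates every decision first and executes afterwards, so each decision sees stale state. */
    static void decideAllThenExecute(List<ShardCopy> unassigned, ToyState state) {
        List<ShardCopy> approved = new ArrayList<>();
        for (ShardCopy shard : unassigned) {
            if (state.canStartRecovery()) { // always true here: nothing has been executed yet
                approved.add(shard);
            }
        }
        state.activeRecoveries += approved.size(); // may exceed the cap
    }

    /**
     * Decides and executes one copy at a time, and visits every unassigned copy whose
     * shard id is in the batch (the batch holds only one entry per shard id).
     */
    static void decideAndExecutePerShard(Set<String> batchShardIds, List<ShardCopy> unassigned, ToyState state) {
        for (ShardCopy shard : unassigned) {
            if (!batchShardIds.contains(shard.shardId)) {
                continue; // not part of this batch
            }
            if (state.canStartRecovery()) {
                state.activeRecoveries++; // state is updated before the next decision
            } else {
                state.throttled.add(shard);
            }
        }
    }

    public static void main(String[] args) {
        List<ShardCopy> unassigned = Arrays.asList(
            new ShardCopy("idx-0", 1), new ShardCopy("idx-0", 2), new ShardCopy("idx-1", 1));
        Set<String> batch = new HashSet<>(Arrays.asList("idx-0", "idx-1"));

        ToyState stale = new ToyState(2);
        decideAllThenExecute(unassigned, stale);
        System.out.println("decide-all-then-execute: " + stale.activeRecoveries + " recoveries");

        ToyState fresh = new ToyState(2);
        decideAndExecutePerShard(batch, unassigned, fresh);
        System.out.println("decide-and-execute: " + fresh.activeRecoveries + " recoveries, "
            + fresh.throttled.size() + " throttled");
    }
}
```

With a cap of 2 and three unassigned copies, the first method starts 3 recoveries, while the second starts 2 and throttles 1. Both replica copies of `idx-0` are visited even though the batch lists `idx-0` only once.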

Related Issues

Resolves #13702, #13962

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • API changes companion pull request created.
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for c0aa1f6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for 828b854: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for b452150: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shwetathareja
Member

@SwethaGuptha the newly added tests are failing the Gradle check: https://build.ci.opensearch.org/job/gradle-check/40679/

Contributor

@rajiv-kv rajiv-kv left a comment

Thanks for making the changes!

@SwethaGuptha
Contributor Author

❌ Gradle check result for b452150: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Fixed in #14107

Contributor

❌ Gradle check result for 735d561: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@SwethaGuptha SwethaGuptha reopened this Jun 12, 2024
Contributor

❕ Gradle check result for 48f3402: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


codecov bot commented Jun 12, 2024

Codecov Report

Attention: Patch coverage is 82.75862% with 10 lines in your changes missing coverage. Please review.

Project coverage is 72.14%. Comparing base (b15cb0c) to head (48f3402).
Report is 410 commits behind head on main.

Files                                                   Patch %   Lines
...opensearch/gateway/PrimaryShardBatchAllocator.java   76.19%    2 Missing and 3 partials ⚠️
...opensearch/gateway/ReplicaShardBatchAllocator.java   86.48%    2 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #13748      +/-   ##
============================================
+ Coverage     71.42%   72.14%   +0.71%     
- Complexity    59978    62407    +2429     
============================================
  Files          4985     5117     +132     
  Lines        282275   291684    +9409     
  Branches      40946    42167    +1221     
============================================
+ Hits         201603   210421    +8818     
- Misses        63999    64186     +187     
- Partials      16673    17077     +404     


Contributor

✅ Gradle check result for 48f3402: SUCCESS

Labels
backport 2.x (Backport to 2.x branch), bug (Something isn't working), Cluster Manager, skip-changelog
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

[BUG] Node concurrent recoveries settings not being honoured.
3 participants