[BugFix] Fix the issue where FE restart fails when creating a table containing too many tablets #53062
            nameToTable.remove(table.getName());
        }
    }

    public void dropTable(String tableName, boolean isSetIfExists, boolean isForce) throws DdlException {
        Table table;
        Locker locker = new Locker();
The riskiest bug in this code is:
The `unRegisterTableUnlocked` method treats temporary and non-temporary tables the same, which may leave the `idToTable` and `nameToTable` collections inconsistent if the two table types are meant to be handled differently.
You can modify the code like this:

    public void unRegisterTableUnlocked(Table table) {
        if (table == null) {
            return;
        }
        idToTable.remove(table.getId());
        if (!table.isTemporaryTable()) {
            nameToTable.remove(table.getName());
        }
    }

This modification ensures that both temporary and non-temporary tables are removed from `idToTable`, but only non-temporary tables are removed from `nameToTable`, assuming the original intent was to handle them differently.
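To make the reviewer's concern concrete, here is a minimal, self-contained sketch of a two-map registry. It is not the StarRocks implementation: the `Table` class, the `register` method, and the rule that temporary tables are never added to `nameToTable` are all assumptions made for illustration. Under those assumptions, the guarded removal keeps a permanent table reachable by name after a temporary table with the same name is unregistered.

```java
import java.util.HashMap;
import java.util.Map;

final class TableRegistrySketch {
    /** Hypothetical, minimal stand-in for the real Table class. */
    static final class Table {
        final long id;
        final String name;
        final boolean temporary;

        Table(long id, String name, boolean temporary) {
            this.id = id;
            this.name = name;
            this.temporary = temporary;
        }
    }

    private final Map<Long, Table> idToTable = new HashMap<>();
    private final Map<String, Table> nameToTable = new HashMap<>();

    /** Assumption: temporary tables are tracked by id only, never by name. */
    void register(Table table) {
        idToTable.put(table.id, table);
        if (!table.temporary) {
            nameToTable.put(table.name, table);
        }
    }

    /** Mirrors the shape of the method suggested in the review comment. */
    void unRegisterTableUnlocked(Table table) {
        if (table == null) {
            return;
        }
        idToTable.remove(table.id);
        if (!table.temporary) {
            nameToTable.remove(table.name);
        }
    }

    Table getById(long id) {
        return idToTable.get(id);
    }

    Table getByName(String name) {
        return nameToTable.get(name);
    }

    public static void main(String[] args) {
        TableRegistrySketch registry = new TableRegistrySketch();
        Table permanent = new Table(1L, "orders", false);
        Table temporary = new Table(2L, "orders", true);
        registry.register(permanent);
        registry.register(temporary);

        // Removing the temporary table must not evict the permanent one by name.
        registry.unRegisterTableUnlocked(temporary);
        System.out.println(registry.getByName("orders") == permanent);
        System.out.println(registry.getById(2L) == null);
    }
}
```

Without the `temporary` check in `unRegisterTableUnlocked`, the same sequence would also remove the permanent `orders` entry from `nameToTable`, leaving the two maps out of step.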
@Mergifyio rebase
✅ Branch has been successfully rebased
dd924b6 to 06b0dd8
@Mergifyio rebase
✅ Branch has been successfully rebased
06b0dd8 to fd8e64f
fd8e64f to 50082e8
@Mergifyio rebase
Signed-off-by: gengjun-git <[email protected]>
50082e8 to 4dcac91
✅ Branch has been successfully rebased
Quality Gate passed
[Java-Extensions Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
[FE Incremental Coverage Report] ✅ pass : 13 / 15 (86.67%)
[BE Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
@Mergifyio backport branch-3.4
@Mergifyio backport branch-3.3
@Mergifyio backport branch-3.2
@Mergifyio backport branch-3.1
✅ Backports have been created
✅ Backports have been created
✅ Backports have been created
✅ Backports have been created
…ontaining too many tablets (#53062)

## Why I'm doing:
Failures in serialization of log data should be thrown instead of ignored.
Ignoring the error will write an empty log to bdb, causing FE startup failure.

## What I'm doing:
Fix

```
2024-11-20 14:28:03.900+08:00 ERROR (stateChangeExecutor|79) [BDBJournalCursor.deserializeData():253] fail to read journal entity key=13159, data=<DatabaseEntry offset="0" size="2" data="50 200 "/>
java.io.EOFException: null
    at java.io.DataInputStream.readInt(DataInputStream.java:397) ~[?:?]
    at com.starrocks.common.io.Text.readString(Text.java:391) ~[starrocks-fe.jar:?]
    at com.starrocks.journal.JournalEntity.readFields(JournalEntity.java:249) ~[starrocks-fe.jar:?]
    at com.starrocks.journal.bdbje.BDBJournalCursor.deserializeData(BDBJournalCursor.java:248) ~[starrocks-fe.jar:?]
    at com.starrocks.journal.bdbje.BDBJournalCursor.next(BDBJournalCursor.java:292) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1933) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1892) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1221) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:745) ~[starrocks-fe.jar:?]
    at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
    at com.starrocks.common.util.Daemon.run(Daemon.java:107) ~[starrocks-fe.jar:?]
2024-11-20 14:28:03.919+08:00 WARN (stateChangeExecutor|79) [GlobalStateMgr.replayJournalInner():1954] catch exception when replaying journal, id: 13159, data: null,
com.starrocks.journal.JournalException: fail to read journal entity key=13159, data=<DatabaseEntry offset="0" size="2" data="50 200 "/>
    at com.starrocks.journal.bdbje.BDBJournalCursor.deserializeData(BDBJournalCursor.java:254) ~[starrocks-fe.jar:?]
    at com.starrocks.journal.bdbje.BDBJournalCursor.next(BDBJournalCursor.java:292) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1933) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1892) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1221) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:745) ~[starrocks-fe.jar:?]
    at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
    at com.starrocks.common.util.Daemon.run(Daemon.java:107) ~[starrocks-fe.jar:?]
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:397) ~[?:?]
    at com.starrocks.common.io.Text.readString(Text.java:391) ~[starrocks-fe.jar:?]
    at com.starrocks.journal.JournalEntity.readFields(JournalEntity.java:249) ~[starrocks-fe.jar:?]
    at com.starrocks.journal.bdbje.BDBJournalCursor.deserializeData(BDBJournalCursor.java:248) ~[starrocks-fe.jar:?]
```

Signed-off-by: gengjun-git <[email protected]>
(cherry picked from commit eb61f07)
…ontaining too many tablets (backport #53062) (#53351) Co-authored-by: gengjun-git <[email protected]>
…ontaining too many tablets (backport #53062) (#53352) Co-authored-by: gengjun-git <[email protected]>
…ontaining too many tablets (backport #53062) (#53354) Signed-off-by: gengjun-git <[email protected]> Co-authored-by: gengjun-git <[email protected]>
…ontaining too many tablets (backport #53062) (#53353) Signed-off-by: gengjun-git <[email protected]> Co-authored-by: gengjun-git <[email protected]>
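Why I'm doing:
A failure while serializing journal log data should be thrown as an exception instead of being ignored.
If the error is ignored, an empty log entry is written to BDB, and the FE then fails to start because it cannot replay that entry.
What I'm doing:
Fix the issue by propagating the serialization failure rather than ignoring it.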
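The failure mode described above can be illustrated with a small, self-contained sketch. This is not the StarRocks code: `Payload`, `serializeIgnoringErrors`, `serializeOrThrow`, and `readEntry` are hypothetical names, and the entry layout (a 2-byte op code followed by a length-prefixed string) is an assumption chosen to resemble the `size="2"` entry and the `readInt` `EOFException` in the log. It only shows why swallowing a serialization error can produce an entry that later cannot be read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class JournalSerializationSketch {
    /** Hypothetical stand-in for a journal payload whose serialization may fail. */
    interface Payload {
        void write(DataOutputStream out) throws IOException;
    }

    /** Problem pattern: the failure is swallowed and the partial buffer becomes the entry. */
    static byte[] serializeIgnoringErrors(short opCode, Payload payload) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeShort(opCode);
            payload.write(out);
        } catch (IOException e) {
            // Swallowed: the caller cannot tell that only the op code was written.
        }
        return buffer.toByteArray();
    }

    /** Safer pattern: the failure propagates, so no truncated entry is ever returned. */
    static byte[] serializeOrThrow(short opCode, Payload payload) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeShort(opCode);
            payload.write(out);
        }
        return buffer.toByteArray();
    }

    /** Reads an entry back: op code, then a length-prefixed UTF-8 string. */
    static String readEntry(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readShort();
            int length = in.readInt(); // throws EOFException on a truncated entry
            byte[] bytes = new byte[length];
            in.readFully(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Payload failing = out -> {
            throw new IOException("payload could not be serialized");
        };
        byte[] entry = serializeIgnoringErrors((short) 1, failing);
        System.out.println("bytes written: " + entry.length);
        try {
            readEntry(entry);
        } catch (EOFException e) {
            System.out.println("replay failed: " + e);
        }
    }
}
```

With `serializeOrThrow`, the same failing payload surfaces as an `IOException` at write time, so the caller can abort the operation before anything is persisted.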