Thanos compactor: block missing index file #1199
Comments
Thanks for this. Just to check: did you remove the index manually, or did the compactor/source upload process produce the block this way? (This is unexpected.) As for how to mitigate situations like this, there is literally nothing we can do. We could reproduce
I did not remove the index file. The compactor/source upload process made it this way. I should have clarified that in the original post. Sounds like my only course of action is to delete that block in S3? We're running HA Prometheus with 2 instances (each with a Thanos sidecar), so hopefully this just means that we'll only have 1 source of data during that block's time window?
Also, in case you're interested in the logs from the event, here are 2 excerpts. The first is from filtering for just that block name (
**Filtering for block**
**All logs**
From the logs it appears it succeeded on the first pass of downsampling, but failed on the 2nd pass. Then all future attempts are failing on the first pass.
Hi, sorry for the inactivity. Does the issue still persist? Did you try upgrading to the recent 0.5.0? It includes some compactor fixes, so it might solve your issue.
@FUSAKLA I ended up just deleting the bad block in S3. It's unclear what the source of the issue was, but this issue can probably be closed as the compactor is working fine now.
Ok, thanks for the info. Sorry to hear you had to delete the affected block. Please feel free to reopen this issue or create a new one if it shows up again.
We're seeing a similar issue at the moment, no real clue yet how the block got into this state, but the compactor is in a crash loop as a result.
This also happened to me. I was able to reproduce it by trying to download a block with s3cmd.
The result was a missing index file in seemingly random situations. It seems to be correlated with object storage load (e.g. StorageGrid). And this is not just the index file: in my case the file /chunks/000001 was missing. That's weird...
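For anyone who wants to check a downloaded block before pointing the compactor at it, here is a minimal sketch of a completeness check. It assumes the block has already been copied to a local directory and that a complete block holds `meta.json`, `index`, and at least one file under `chunks/`. The function name `missing_block_files` is made up for illustration and is not part of Thanos.

```python
from pathlib import Path

# Files assumed to be present at the top level of a complete block.
REQUIRED_FILES = ("meta.json", "index")


def missing_block_files(block_dir):
    """Return the expected entries that are absent from a local block directory."""
    block = Path(block_dir)
    # Collect required top-level files that do not exist as regular files.
    missing = [name for name in REQUIRED_FILES if not (block / name).is_file()]
    # A block should also carry at least one chunk segment file, e.g. chunks/000001.
    chunks = block / "chunks"
    has_chunks = chunks.is_dir() and any(p.is_file() for p in chunks.iterdir())
    if not has_chunks:
        missing.append("chunks/*")
    return missing
```

An empty list means nothing from that assumed layout is missing. It only checks that the files exist, not that their contents are valid.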
Thanos, Prometheus and Golang version used
improbable/thanos:v0.4.0-rc.1
What happened
Thanos compactor continues to crash due to a block missing its `index` file. Obviously I could just delete the block, but if there is a way to salvage the data without deleting the block, I'd prefer to do that. Logs are below.
What you expected to happen
The block to have an index file so the compactor could run.
How to reproduce it (as minimally and precisely as possible):
Remove index file from a block
Full logs to relevant components
Anything else we need to know
The compactor has been running fine for weeks. This is just one bad block and I didn't see any other comments or issues about how to recover from this situation so I figured I'd make one. Like I said I'm sure I could just delete the block but I'd prefer to salvage it if possible.
Here is a picture of the block contents in S3:
You can see it has an `index.cache.json` and a `meta.json` but no `index`. Here is a healthy block: