We had a spike in errors and after that 100% of errors are getting dropped, could someone help me figure out why? #2960
Comments
That is indeed interesting. I'm seeing
@hubertdeng123 I'm not sure. It seems there was a RAM bottleneck along with a storage bottleneck: the Docker directory ballooned to over 60GB. I increased the storage and RAM and reinstalled. Now Sentry is logging errors and I can see them come in, but the stats page shows 32 errors, all 32 of them dropped. Yet if I look at the list of issues for this project for the last 7 days, I have about 350 pages of issues. Errors are coming in, but Sentry isn't counting them and considers them dropped.
It's quite difficult to debug this remotely - Sentry knows that some errors didn't make it all the way through the pipeline, but that's really all it knows, otherwise they wouldn't be dropped errors. Usually these sorts of things are related to connection issues between various containers (hence the dropping), memory limitations, or configuration at the orchestrator or cloud provider level.
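Since disk and memory pressure came up earlier in the thread, here is a rough Python sketch for checking how large a directory has grown and how much free space is left on its filesystem. It uses only the standard library. The `/var/lib/docker` path and the 10% threshold are assumptions for illustration, not values taken from Sentry or from this install, and the function names are made up for this example.

```python
import os
import shutil


def dir_size_bytes(root):
    """Sum the sizes of regular files under root, skipping entries that vanish or are unreadable."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # File removed or permission denied while walking; ignore it.
                continue
    return total


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem holding path that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_low_on_disk(path, min_free_fraction=0.10):
    """True when free space is below the threshold (10% is an arbitrary, assumed default)."""
    return free_fraction(path) < min_free_fraction


if __name__ == "__main__":
    docker_root = "/var/lib/docker"  # assumed default Docker data directory; adjust for your host
    if os.path.isdir(docker_root):
        print(f"{docker_root}: {dir_size_bytes(docker_root) / 1e9:.1f} GB")
        print(f"low on disk: {is_low_on_disk(docker_root)}")
```

Reading the Docker data directory usually requires root. This only tells you whether storage is a plausible suspect; it says nothing about why a given event was dropped.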
@azaslavsky Do you know if there’s a guide on how to rebuild/reinstall from scratch but retaining data like the projects themselves, user accounts, settings, etc? I don’t care if I lose all of the issues. Running ./install.sh doesn’t seem to be enough for me, I keep having issues.
Yep, there is a backup/restore tool for exactly this use case: https://develop.sentry.dev/self-hosted/backup/#partial-json-backup
@hubertdeng123 having this exact issue and getting absolutely spammed by the logs you mention above:
It is not clear to me at all why this started happening. Our instance has run for months without incident and there have been no changes I am aware of. What could cause it to lose connection to ClickHouse?
@csvan Have you updated your install recently?
I'm not sure what happened, but after updating to version 24.4.2 everything SEEMS to be working fine; I no longer have 100% of errors dropped. I didn't change anything on our server.
Self-Hosted Version
24.3.0 unknown
CPU Architecture
x86_64
Docker Version
24.0.7
Docker Compose Version
2.21.0
Steps to Reproduce
On April 8th (Monday) we experienced a spike in dropped errors. There was nothing peculiar going on that day, and we didn't receive any complaints of downtime for our web application.
According to the stats page this started at 9am, and from April 8th at 9am until today 100% of errors have been dropped.
I have rate limiting set up, but that doesn't seem to be the cause, as can be seen in the screenshots below.
I don't see any warnings in the System Warnings page in the admin panel.
Does anybody have any suggestions?
I'd love it if Sentry showed the reason why the errors were dropped.
Expected Result
Expected errors to not be dropped.
Actual Result
Docker compose logs:
https://pastebin.com/raw/TXHJL7i3
Event ID
No response