Postgres floating-point exception but health check was ok #2958

Open
max-mycarly opened this issue Apr 15, 2024 · 1 comment
Comments

@max-mycarly

Self-Hosted Version

24.3.0

CPU Architecture

x86_64

Docker Version

25.0.3

Docker Compose Version

2.25.0

Steps to Reproduce

Load any admin page, for example
https://sentry.domain.com/organizations/[ORGA]/projects/
The request fails with HTTP 500.

Then call
https://sentry.domain.com/_health/
It still returns HTTP 200 with the message: ok

Expected Result

When an error causes all HTTP requests to fail with HTTP 500, the health endpoint should reflect this as well.

Actual Result

We experienced a strange error with Sentry.
The PostgreSQL database started responding with an error to every SELECT set_config query.
Web, Cron, and Worker all showed the same errors, caused by Postgres.
All API endpoints, the admin interface, etc. threw server errors with HTTP 500, but /_health/ kept returning HTTP 200 and "ok".
The problem lasted for 5 hours because monitoring thought the service was still alive.

All services were running.
Restarting the instance and all Sentry services fixed the problem.
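
Until /_health/ reflects database failures, one possible workaround is an external monitor that probes a database-backed page in addition to /_health/ and treats any 5xx as down. The following is only a minimal sketch under assumptions: the helper names `fetch_status` and `is_healthy` are hypothetical, and which second URL is suitable depends on the installation (here, the admin page from the steps above).

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the request failed outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urllib reports 4xx/5xx responses as exceptions; keep the status code.
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def is_healthy(urls, fetch=fetch_status):
    """Healthy only if every probed URL answers and none returns a 5xx status."""
    statuses = {url: fetch(url) for url in urls}
    ok = all(code is not None and code < 500 for code in statuses.values())
    return ok, statuses


if __name__ == "__main__":
    # Hypothetical URLs taken from this report; replace [ORGA] with a real slug.
    print(is_healthy([
        "https://sentry.domain.com/_health/",
        "https://sentry.domain.com/organizations/[ORGA]/projects/",
    ]))
```

The `fetch` parameter exists only so the logic can be exercised without network access. In the incident described here, the second URL would have returned 500 and the probe would have reported unhealthy even though /_health/ returned 200.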

docker compose logs postgres:

postgres-1  | 2024-04-13 11:14:11.218 UTC [763405] ERROR:  floating-point exception
postgres-1  | 2024-04-13 11:14:11.218 UTC [763405] DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
postgres-1  | 2024-04-13 11:14:11.218 UTC [763405] STATEMENT:  SELECT set_config('TimeZone', 'UTC', false)
postgres-1  | 2024-04-13 11:14:11.382 UTC [763406] ERROR:  floating-point exception
postgres-1  | 2024-04-13 11:14:11.382 UTC [763406] DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
postgres-1  | 2024-04-13 11:14:11.382 UTC [763406] STATEMENT:  SELECT set_config('TimeZone', 'UTC', false)
postgres-1  | 2024-04-13 11:14:11.415 UTC [763407] ERROR:  floating-point exception
postgres-1  | 2024-04-13 11:14:11.415 UTC [763407] DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
postgres-1  | 2024-04-13 11:14:11.415 UTC [763407] STATEMENT:  SELECT set_config('TimeZone', 'UTC', false)
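
Each error above carries a different backend PID, and the web traceback fails inside Django's `ensure_timezone` during connection setup, which suggests that every new connection hit the error. A probe that opens a fresh connection and runs the logged statement could surface this state directly. This is a rough sketch, not part of Sentry: `timezone_probe` is a hypothetical helper, and the `%s` placeholder assumes a psycopg2-style driver.

```python
def timezone_probe(connect):
    """Open a fresh connection and run the statement from the Postgres log.

    connect is any zero-argument callable returning a DB-API connection.
    Returns (ok, error_text).
    """
    try:
        conn = connect()
        try:
            cur = conn.cursor()
            # Same statement as in the log; %s is the psycopg2 placeholder style.
            cur.execute("SELECT set_config('TimeZone', %s, false)", ("UTC",))
            cur.fetchone()
        finally:
            conn.close()
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


# Possible usage with psycopg2 (the DSN is a placeholder, not from this report):
#   import psycopg2
#   ok, err = timezone_probe(lambda: psycopg2.connect("dbname=... user=... host=..."))
```

Taking a connection factory rather than a live connection matters here: a long-lived connection may already have its time zone set and would not exercise the failing code path.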

docker compose logs web:

web-1  | psycopg2.errors.FloatingPointException: floating-point exception
web-1  | DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
web-1  | 
web-1  | 
web-1  | The above exception was the direct cause of the following exception:
web-1  | 
web-1  | Traceback (most recent call last):
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/api/base.py", line 306, in handle_exception
web-1  |     response = super().handle_exception(exc)
web-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/rest_framework/views.py", line 469, in handle_exception
web-1  |     self.raise_uncaught_exception(exc)
web-1  |   File "/usr/local/lib/python3.11/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
web-1  |     raise exc
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/api/base.py", line 411, in dispatch
web-1  |     self.initial(request, *args, **kwargs)
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/../sentry_sdk/integrations/django/__init__.py", line 312, in sentry_patched_drf_initial
web-1  |     return old_drf_initial(self, request, *args, **kwargs)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/rest_framework/views.py", line 414, in initial
web-1  |     self.perform_authentication(request)
web-1  |   File "/usr/local/lib/python3.11/site-packages/rest_framework/views.py", line 324, in perform_authentication
web-1  |     request.user
web-1  |   File "/usr/local/lib/python3.11/site-packages/rest_framework/request.py", line 227, in user
web-1  |     self._authenticate()
web-1  |   File "/usr/local/lib/python3.11/site-packages/rest_framework/request.py", line 380, in _authenticate
web-1  |     user_auth_tuple = authenticator.authenticate(self)
web-1  |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/api/authentication.py", line 197, in authenticate
web-1  |     return self.authenticate_credentials(relay_id, relay_sig, request)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/api/authentication.py", line 203, in authenticate_credentials
web-1  |     relay, static = relay_from_id(request, relay_id)
web-1  |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/api/authentication.py", line 128, in relay_from_id
web-1  |     relay = Relay.objects.get(relay_id=relay_id)
web-1  |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
web-1  |     return getattr(self.get_queryset(), name)(*args, **kwargs)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 645, in get
web-1  |     num = len(clone)
web-1  |           ^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 382, in __len__
web-1  |     self._fetch_all()
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 1928, in _fetch_all
web-1  |     self._result_cache = list(self._iterable_class(self))
web-1  |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 91, in __iter__
web-1  |     results = compiler.execute_sql(
web-1  |               ^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1560, in execute_sql
web-1  |     cursor = self.connection.cursor()
web-1  |              ^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
web-1  |     return func(*args, **kwargs)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 316, in cursor
web-1  |     return self._cursor()
web-1  |            ^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/db/postgres/decorators.py", line 40, in inner
web-1  |     return func(self, *args, **kwargs)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/db/postgres/base.py", line 107, in _cursor
web-1  |     return super()._cursor()
web-1  |            ^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 292, in _cursor
web-1  |     self.ensure_connection()
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
web-1  |     return func(*args, **kwargs)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 274, in ensure_connection
web-1  |     with self.wrap_database_errors:
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
web-1  |     raise dj_exc_value.with_traceback(traceback) from exc_value
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 275, in ensure_connection
web-1  |     self.connect()
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/../sentry_sdk/integrations/django/__init__.py", line 677, in connect
web-1  |     return real_connect(self)
web-1  |            ^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
web-1  |     return func(*args, **kwargs)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 258, in connect
web-1  |     self.init_connection_state()
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 314, in init_connection_state
web-1  |     commit_tz = self.ensure_timezone()
web-1  |                 ^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 296, in ensure_timezone
web-1  |     cursor.execute(self.ops.set_time_zone_sql(), [timezone_name])
web-1  | django.db.utils.DataError: floating-point exception
web-1  | DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
web-1  | 
web-1  | 13:03:00 [INFO] sentry.access.api: api.access (method='POST' view='sentry.api.endpoints.relay.project_configs.RelayProjectConfigsEndpoint' response=500 user_id='None' is_app='None' token_type='None' is_frontend_request='False' organization_id='None' auth_id='None' path='/api/0/relays/projectconfigs/' caller_ip='172.18.0.44' user_agent='None' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.04009842872619629 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
web-1  | 13:03:00 [ERROR] django.request: Internal Server Error: /api/0/relays/projectconfigs/ (status_code=500 request=<WSGIRequest: POST '/api/0/relays/projectconfigs/?version=3'>)
web-1  | Traceback (most recent call last):
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 275, in ensure_connection
web-1  |     self.connect()
web-1  |   File "/usr/local/lib/python3.11/site-packages/sentry/../sentry_sdk/integrations/django/__init__.py", line 677, in connect
web-1  |     return real_connect(self)
web-1  |            ^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
web-1  |     return func(*args, **kwargs)
web-1  |            ^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/base/base.py", line 258, in connect
web-1  |     self.init_connection_state()
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 314, in init_connection_state
web-1  |     commit_tz = self.ensure_timezone()
web-1  |                 ^^^^^^^^^^^^^^^^^^^^^^
web-1  |   File "/usr/local/lib/python3.11/site-packages/django/db/backends/postgresql/base.py", line 296, in ensure_timezone
web-1  |     cursor.execute(self.ops.set_time_zone_sql(), [timezone_name])
web-1  | psycopg2.errors.FloatingPointException: floating-point exception
web-1  | DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
web-1  | 

Event ID

No response

@hubertdeng123
Member

Thanks for reporting here. I do not think you should rely on the /_health/ endpoint in Sentry as a source of truth. I just took a look and it seems to be pretty out of date. I'll backlog this item to improve the endpoint so that it covers more of the main components of self-hosted.
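
As an illustration of what covering more components could look like, a health handler might run one check per component and return 200 only when all of them pass. This is a generic sketch, not Sentry's implementation: `run_health_checks` and the check names are hypothetical.

```python
def run_health_checks(checks):
    """Run each named check and aggregate the results.

    checks maps a component name to a zero-argument callable that raises
    on failure. Returns (status_code, body).
    """
    failures = {}
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:
            failures[name] = f"{type(exc).__name__}: {exc}"
    if failures:
        return 500, {"status": "error", "failures": failures}
    return 200, {"status": "ok", "failures": {}}


def failing_postgres_check():
    # Stand-in for a real database check, mimicking the error in this report.
    raise RuntimeError("floating-point exception")


if __name__ == "__main__":
    print(run_health_checks({"web": lambda: None, "postgres": failing_postgres_check}))
```

With a database check included, the outage described in this issue would have produced a 500 from the health handler, naming `postgres` as the failing component.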
