This repository has been archived by the owner on Mar 27, 2021. It is now read-only.
Hundreds of tracing spans are left un-ended after every query timeout
I am a prism goalie
Who wants to have a stable heroic
So that I can focus on features and not get woken up at night and have angry users
These un-ended spans pose a real runtime risk to heroic. If roughly 700-1000 of them are left hanging around after each timed-out query, it's conceivable that the JVM will:
potentially run out of memory altogether
experience much longer GC pauses / sweep times (because all the hanging spans need reaping)
hugely inflate the size of heroic's logs, costing us $$$ and obscuring "genuine" problems
Proposed Solution
find the correct location to catch the Bigtable (BT) timeout exception (not trivial)
catch it, end the span, and rethrow it
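The two steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: `TraceSpan` is a hypothetical stand-in for the OpenCensus span type, `CompletableFuture` stands in for whatever future type the Bigtable fetch path actually returns, and `traced` / `tracedAsync` are made-up helper names, not existing heroic methods. The synchronous helper shows the catch, end, rethrow pattern directly; the asynchronous helper covers the case where the timeout only surfaces when the future completes, which may be part of why finding the right place to catch it is not trivial.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class SpanEndingSketch {
    /** Hypothetical stand-in for a tracing span; not the real OpenCensus type. */
    interface TraceSpan {
        void recordError(Throwable cause);
        void end();
    }

    /** Counts calls so the behaviour can be checked without a tracing backend. */
    static final class CountingSpan implements TraceSpan {
        final AtomicInteger ended = new AtomicInteger();
        final AtomicInteger errors = new AtomicInteger();

        @Override public void recordError(Throwable cause) { errors.incrementAndGet(); }
        @Override public void end() { ended.incrementAndGet(); }
    }

    /** Synchronous path: catch, record, rethrow; `finally` ends the span on every exit. */
    static <T> T traced(TraceSpan span, Supplier<T> work) {
        try {
            return work.get();
        } catch (RuntimeException e) {
            span.recordError(e);
            throw e; // the caller still sees the original exception
        } finally {
            span.end();
        }
    }

    /** Asynchronous path: end the span when the future completes, normally or not. */
    static <T> CompletableFuture<T> tracedAsync(
            TraceSpan span, Supplier<CompletableFuture<T>> work) {
        final CompletableFuture<T> future;
        try {
            future = work.get();
        } catch (RuntimeException e) {
            // The call failed before a future was even returned.
            span.recordError(e);
            span.end();
            throw e;
        }
        // whenComplete runs on success and on failure, and the returned
        // future still carries the original result or exception.
        return future.whenComplete((result, error) -> {
            if (error != null) {
                span.recordError(error);
            }
            span.end();
        });
    }
}
```

In this sketch the span is ended exactly once per call, and the timeout still propagates to the caller, so existing error handling upstream would not need to change.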
Repro Steps
run heroic locally with the GUC config, on branch feature/add-bigtable-timeout-settings-refactored
capture a lengthy query from Grafana using the Network tab in Chrome DevTools
alter the query to hit localhost and watch the logs; you'll see messages like the ones listed below
List of methods concerned from logs
ERROR io.opencensus.trace.Tracer - Span localMetricsManager.fetchSeries is GC'ed without being ended.
ERROR io.opencensus.trace.Tracer - Span bigtable.fetchBatch is GC'ed without being ended.
sming changed the title on Feb 12, 2021
from: Fix "...Span localMetricsManager.fetchSeries is GC'ed without being ended." issue (caused by a BT timeout)
to: Fix "...Span <span name> is GC'ed without being ended." issue (caused by a BT timeout)