Jetty threads starvation #42139

ky0uraku · 2024-05-02T05:41:10Z

Describe the bug

Hello.

I encountered the following problem in Metabase - during operation the number of busy jetty threads gradually increases, after which at some point the application stops responding to healthcheck probes:

At the same time, the number of HelperThreads in the thread dump increases. Here is an application dump with uptime for about a week:
metabase-20240502.dmp

Adding maxIdleTime and unreturnedConnectionTimeout parameters does not help. Updating the version from 0.45 to 0.48 also does not help.

Current configuration:

MB_JDBC_DATA_WAREHOUSE_MAX_CONNECTION_POOL_SIZE: 45
MB_JETTY_MAXTHREADS: 100

c3p0.idleConnectionTestPeriod=60
c3p0.maxIdleTime=300
c3p0.unreturnedConnectionTimeout=3600

To Reproduce

Go to '...'
Click on '....'
Scroll down to '....'
See error

Expected behavior

The number of busy Jetty threads does not increase over time.

Logs

2024-04-26 09:31:08,696 INFO metabase.core :: Starting Metabase version v0.48.8 (a900c85)
2024-04-27 00:00:55,212 DEBUG middleware.log :: GET /api/health 200 158.6 µs (0 DB calls) App DB connections: 0/10 Jetty threads: 9/100 (2 idle, 0 queued) (202 total active threads) Queries in flight: 1 (0 queued)
2024-04-29 00:01:35,991 DEBUG middleware.log :: GET /api/health 200 150.0 µs (0 DB calls) App DB connections: 0/8 Jetty threads: 39/100 (3 idle, 0 queued) (267 total active threads) Queries in flight: 0 (0 queued)
2024-05-02 00:02:25,991 DEBUG middleware.log :: GET /api/health 200 153.4 µs (0 DB calls) App DB connections: 0/15 Jetty threads: 44/100 (2 idle, 0 queued) (283 total active threads) Queries in flight: 0 (0 queued)

Information about your Metabase installation

{
  "browser-info": {
    "language": "en-US",
    "platform": "MacIntel",
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:125.0) Gecko/20100101 Firefox/125.0",
    "vendor": ""
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.22+7",
    "java.vendor": "Eclipse Adoptium",
    "java.vendor.url": "https://adoptium.net/",
    "java.version": "11.0.22",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.22+7",
    "os.name": "Linux",
    "os.version": "5.4.0-177-generic",
    "user.language": "en",
    "user.timezone": "GMT"
  },
  "metabase-info": {
    "databases": [
      "googleanalytics",
      "postgres",
      "bigquery-cloud-sdk",
      "oracle"
    ],
    "hosting-env": "unknown",
    "application-database": "postgres",
    "application-database-details": {
      "database": {
        "name": "PostgreSQL",
        "version": "14.11 (Ubuntu 14.11-1.pgdg22.04+1)"
      },
      "jdbc-driver": {
        "name": "PostgreSQL JDBC Driver",
        "version": "42.6.0"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2024-03-04",
      "tag": "v0.48.8",
      "hash": "a900c85"
    },
    "settings": {
      "report-timezone": "Europe/Riga"
    }
  }
}

Severity

annoying

Additional context

No response

The text was updated successfully, but these errors were encountered:

paoliniluis · 2024-05-02T10:37:45Z

You need to tell us exactly what you do to use those threads so we can reproduce. Btw: why do you have that c3p0 config?

ky0uraku · 2024-05-02T11:22:41Z

You need to tell us exactly what you do to use those threads so we can reproduce. Btw: why do you have that c3p0 config?

We are not doing anything special - several datasources are connected to the Metabase (Postgres, Oracle), from which we take information for the dashboards that users work with.

idleConnectionTestPeriod was needed because network devices between the DBMS and Metabase sometimes unexpectedly broke network connections; maxIdleTime and unreturnedConnectionTimeout were added in an attempt to help with increasing number of threads.

ky0uraku · 2024-05-03T05:38:46Z

I also have dumps for different days - you can see the dynamics of increasing number of HelperThreads (Samurai, for example, can visualize the difference between several dumps):
metabase-20240427.dmp
metabase-20240501.dmp
metabase-20240503.dmp

Tony-metabase · 2024-05-07T09:31:07Z

Can you upgrade to our latest 49.8 please

Also when you say several datasources are connected to the Metabase can you quantify how many and what is the setting of the sync and scan for such Databases?

ky0uraku · 2024-05-07T12:16:34Z

Can you upgrade to our latest 49.8 please

Sure, will try it in a couple of days.

Also when you say several datasources are connected to the Metabase can you quantify how many and what is the setting of the sync and scan for such Databases?

Only one Oracle datasource is most actively used - hourly sync, scan disabled.

Other data sources (PostgreSQL - 4, Oracle - 4) - hourly sync, daily scan.

Tony-metabase · 2024-05-07T12:29:28Z

So you have 8 scans that happen every day right? Can you check how long those take? You can check the logs for FINISHED keywords and for the update-field-values

ky0uraku · 2024-05-07T14:10:17Z

Looks like I forgot some datasources. Here are the logs for the last 5 days:

cat metabase.log | grep FINISHED | grep update-field-values

2024-05-02 16:00:17,873 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (5.2 s)
2024-05-03 10:00:00,061 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (59.4 µs)
2024-05-03 11:03:12,844 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.1 mins)
2024-05-03 11:03:22,379 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.0 s)
2024-05-03 16:00:15,179 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (4.2 s)
2024-05-04 10:00:00,069 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (35.0 µs)
2024-05-04 11:03:35,279 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.2 mins)
2024-05-04 11:03:46,370 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.6 s)
2024-05-04 16:00:18,197 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (4.9 s)
2024-05-05 10:00:00,075 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (33.8 µs)
2024-05-05 11:03:29,756 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.2 mins)
2024-05-05 11:03:40,463 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.4 s)
2024-05-05 16:00:17,825 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (4.9 s)
2024-05-06 03:00:01,768 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 10 'DB Oracle Production (GDPR)' (580.6 ms)
2024-05-06 10:00:00,067 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (94.6 µs)
2024-05-06 11:03:37,061 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.3 mins)
2024-05-06 11:03:48,461 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.5 s)
2024-05-06 16:00:18,003 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (5.0 s)
2024-05-07 10:00:00,074 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (66.9 µs)
2024-05-07 11:03:39,783 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.3 mins)
2024-05-07 11:03:51,298 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.7 s)

ky0uraku · 2024-05-16T06:20:10Z

Can you upgrade to our latest 49.8 please

Hello. Sorry for the delay - updated to 49.10 this morning. We'll take a look at Jetty in a couple of days and I'll come back with information.

ky0uraku added .Needs Triage Type:Bug Product defects labels May 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Jetty threads starvation #42139

Jetty threads starvation #42139

ky0uraku commented May 2, 2024

paoliniluis commented May 2, 2024

ky0uraku commented May 2, 2024 •

edited

ky0uraku commented May 3, 2024

Tony-metabase commented May 7, 2024

ky0uraku commented May 7, 2024 •

edited

Tony-metabase commented May 7, 2024

ky0uraku commented May 7, 2024

ky0uraku commented May 16, 2024

Jetty threads starvation #42139

Jetty threads starvation #42139

Comments

ky0uraku commented May 2, 2024

Describe the bug

To Reproduce

Expected behavior

Logs

Information about your Metabase installation

Severity

Additional context

paoliniluis commented May 2, 2024

ky0uraku commented May 2, 2024 • edited

ky0uraku commented May 3, 2024

Tony-metabase commented May 7, 2024

ky0uraku commented May 7, 2024 • edited

Tony-metabase commented May 7, 2024

ky0uraku commented May 7, 2024

ky0uraku commented May 16, 2024

ky0uraku commented May 2, 2024 •

edited

ky0uraku commented May 7, 2024 •

edited