Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jetty threads starvation #42139

Open
ky0uraku opened this issue May 2, 2024 · 8 comments
Open

Jetty threads starvation #42139

ky0uraku opened this issue May 2, 2024 · 8 comments
Labels

Comments

@ky0uraku
Copy link

ky0uraku commented May 2, 2024

Describe the bug

Hello.

I encountered the following problem in Metabase - during operation the number of busy jetty threads gradually increases, after which at some point the application stops responding to healthcheck probes:
Grafana

At the same time, the number of HelperThreads in the thread dump increases. Here is an application dump with uptime for about a week:
metabase-20240502.dmp

Adding maxIdleTime and unreturnedConnectionTimeout parameters does not help. Updating the version from 0.45 to 0.48 also does not help.

Current configuration:

MB_JDBC_DATA_WAREHOUSE_MAX_CONNECTION_POOL_SIZE: 45
MB_JETTY_MAXTHREADS: 100

c3p0.idleConnectionTestPeriod=60
c3p0.maxIdleTime=300
c3p0.unreturnedConnectionTimeout=3600

To Reproduce

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior

The number of busy Jetty threads does not increase over time.

Logs

2024-04-26 09:31:08,696 INFO metabase.core :: Starting Metabase version v0.48.8 (a900c85)
2024-04-27 00:00:55,212 DEBUG middleware.log :: GET /api/health 200 158.6 µs (0 DB calls) App DB connections: 0/10 Jetty threads: 9/100 (2 idle, 0 queued) (202 total active threads) Queries in flight: 1 (0 queued)
2024-04-29 00:01:35,991 DEBUG middleware.log :: GET /api/health 200 150.0 µs (0 DB calls) App DB connections: 0/8 Jetty threads: 39/100 (3 idle, 0 queued) (267 total active threads) Queries in flight: 0 (0 queued)
2024-05-02 00:02:25,991 DEBUG middleware.log :: GET /api/health 200 153.4 µs (0 DB calls) App DB connections: 0/15 Jetty threads: 44/100 (2 idle, 0 queued) (283 total active threads) Queries in flight: 0 (0 queued)

Information about your Metabase installation

{
  "browser-info": {
    "language": "en-US",
    "platform": "MacIntel",
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:125.0) Gecko/20100101 Firefox/125.0",
    "vendor": ""
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.22+7",
    "java.vendor": "Eclipse Adoptium",
    "java.vendor.url": "https://adoptium.net/",
    "java.version": "11.0.22",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.22+7",
    "os.name": "Linux",
    "os.version": "5.4.0-177-generic",
    "user.language": "en",
    "user.timezone": "GMT"
  },
  "metabase-info": {
    "databases": [
      "googleanalytics",
      "postgres",
      "bigquery-cloud-sdk",
      "oracle"
    ],
    "hosting-env": "unknown",
    "application-database": "postgres",
    "application-database-details": {
      "database": {
        "name": "PostgreSQL",
        "version": "14.11 (Ubuntu 14.11-1.pgdg22.04+1)"
      },
      "jdbc-driver": {
        "name": "PostgreSQL JDBC Driver",
        "version": "42.6.0"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2024-03-04",
      "tag": "v0.48.8",
      "hash": "a900c85"
    },
    "settings": {
      "report-timezone": "Europe/Riga"
    }
  }
}

Severity

annoying

Additional context

No response

@ky0uraku ky0uraku added .Needs Triage Type:Bug Product defects labels May 2, 2024
@paoliniluis
Copy link
Contributor

You need to tell us exactly what you do to use those threads so we can reproduce. Btw: why do you have that c3p0 config?

@ky0uraku
Copy link
Author

ky0uraku commented May 2, 2024

You need to tell us exactly what you do to use those threads so we can reproduce. Btw: why do you have that c3p0 config?

We are not doing anything special - several datasources are connected to the Metabase (Postgres, Oracle), from which we take information for the dashboards that users work with.

idleConnectionTestPeriod was needed because network devices between the DBMS and Metabase sometimes unexpectedly broke network connections; maxIdleTime and unreturnedConnectionTimeout were added in an attempt to help with increasing number of threads.

@ky0uraku
Copy link
Author

ky0uraku commented May 3, 2024

I also have dumps for different days - you can see the dynamics of increasing number of HelperThreads (Samurai, for example, can visualize the difference between several dumps):
metabase-20240427.dmp
metabase-20240501.dmp
metabase-20240503.dmp

@Tony-metabase
Copy link
Contributor

Can you upgrade to our latest 49.8 please

Also when you say several datasources are connected to the Metabase can you quantify how many and what is the setting of the sync and scan for such Databases?

@ky0uraku
Copy link
Author

ky0uraku commented May 7, 2024

Can you upgrade to our latest 49.8 please

Sure, will try it in a couple of days.

Also when you say several datasources are connected to the Metabase can you quantify how many and what is the setting of the sync and scan for such Databases?

Only one Oracle datasource is most actively used - hourly sync, scan disabled.

Other data sources (PostgreSQL - 4, Oracle - 4) - hourly sync, daily scan.

@Tony-metabase
Copy link
Contributor

So you have 8 scans that happen every day right? Can you check how long those take? You can check the logs for FINISHED keywords and for the update-field-values

image

@ky0uraku
Copy link
Author

ky0uraku commented May 7, 2024

Looks like I forgot some datasources. Here are the logs for the last 5 days:

cat metabase.log | grep FINISHED | grep update-field-values

2024-05-02 16:00:17,873 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (5.2 s)
2024-05-03 10:00:00,061 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (59.4 µs)
2024-05-03 11:03:12,844 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.1 mins)
2024-05-03 11:03:22,379 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.0 s)
2024-05-03 16:00:15,179 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (4.2 s)
2024-05-04 10:00:00,069 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (35.0 µs)
2024-05-04 11:03:35,279 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.2 mins)
2024-05-04 11:03:46,370 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.6 s)
2024-05-04 16:00:18,197 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (4.9 s)
2024-05-05 10:00:00,075 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (33.8 µs)
2024-05-05 11:03:29,756 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.2 mins)
2024-05-05 11:03:40,463 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.4 s)
2024-05-05 16:00:17,825 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (4.9 s)
2024-05-06 03:00:01,768 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 10 'DB Oracle Production (GDPR)' (580.6 ms)
2024-05-06 10:00:00,067 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (94.6 µs)
2024-05-06 11:03:37,061 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.3 mins)
2024-05-06 11:03:48,461 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.5 s)
2024-05-06 16:00:18,003 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 5 'GA DB' (5.0 s)
2024-05-07 10:00:00,074 INFO sync.util :: FINISHED: step 'update-field-values' for googleanalytics Database 4 'GA DB Mobile' (66.9 µs)
2024-05-07 11:03:39,783 INFO sync.util :: FINISHED: step 'update-field-values' for bigquery-cloud-sdk Database 11 'DB BigQuery' (1.3 mins)
2024-05-07 11:03:51,298 INFO sync.util :: FINISHED: step 'update-field-values' for oracle Database 8 'DB Oracle Stage' (3.7 s)

@ky0uraku
Copy link
Author

Can you upgrade to our latest 49.8 please

Hello. Sorry for the delay - updated to 49.10 this morning. We'll take a look at Jetty in a couple of days and I'll come back with information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants