Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rest: too many open connections to database #571

Open
mdonadoni opened this issue Dec 21, 2022 · 3 comments
Open

rest: too many open connections to database #571

mdonadoni opened this issue Dec 21, 2022 · 3 comments

Comments

@mdonadoni
Copy link
Member

Many connections to the DB are opened by r-server when handling requests. These connections are not closed, probably because of connection pooling.

Given that there is a limit on the number of concurrent connections, it can happen that under heavy load some components are not able to reach the DB:

==> ERROR: Cannot create workflow :
(psycopg2.OperationalError) connection to server at "..." (...), port ... failed: FATAL:  remaining connection slots are reserved for non-replication superuser connections

How to reproduce:

  1. Deploy REANA locally in production mode

  2. Execute many workflows and send many requests to REANA, like so:

    $ for i in `seq 30`; do reana-client list &; done; wait
    $ for i in `seq 30`; do reana-client ls -w workflow &; done; wait
  3. Look at open connections (in this case 10.244.0.178 is r-server, while 10.244.0.5 is r-w-controller)

    reana=# select client_addr, count(*) from pg_stat_activity group by client_addr;
     client_addr  | count 
    --------------+-------
                  |     6
     10.244.0.178 |    46
     10.244.0.5   |     5
    (3 rows)
    
@mdonadoni
Copy link
Member Author

mdonadoni commented Jan 5, 2023

This issue is (at least partially) caused by Invenio and REANA using different connection pools to connect to the DB.

This can be seen easily by setting the following reana-server's environment variables in values.yaml:

@@ -78,6 +78,8 @@ components:
       REANA_RATELIMIT_GUEST_USER: "20 per second"
       REANA_RATELIMIT_AUTHENTICATED_USER: "20 per second"
       REANA_RATELIMIT_SLOW: "1/5 second"
+      SQLALCHEMY_POOL_SIZE: 1
+      SQLALCHEMY_MAX_OVERFLOW: 0
     uwsgi:
       processes: 6
       threads: 4

In this case, the pool size used by REANA and Invenio is different, which means that different connection pools are being used:

$ kubectl -it exec deployment/reana-server -c rest-api -- invenio shell
In [1]: from invenio_db import db

In [2]: db.engine.pool.size()
Out[2]: 5

In [3]: from reana_db import database

In [4]: database.engine.pool.size()
Out[4]: 1

Note that Invenio uses Flask-SQLAlchemy extension to connect to the DB, so it is possible to modify Invenio's pool size by setting SQLALCHEMY_POOL_SIZE = 1 in reana_server/config.py.

At the moment I don't see which are the benefits of using different connection pools. Instead, there a few drawbacks to this approach: not only it increases the number of connections open with the DB, but it is also possible that multiple connections to the DB are opened to serve a single request. This also makes it impossible to modify REANA and Invenio tables in the same transaction.

@mdonadoni
Copy link
Member Author

mdonadoni commented Jan 5, 2023

This is the number of connections depending on the number of uWSGI processes and threads.

With REANA's and Invenio's pool size equal to 1:

Processes Threads Connections
3 4 7 (3 REANA + 3 Invenio + 1 scheduler)
3 8 7 (3 + 3 + 1)
6 4 13 (6 + 6 + 1)
6 8 13 (6 + 6 + 1)

With REANA's pool size set to 1 and Invenio's pool size equal to 5:

Processes Threads Connections
3 4 16 (3 REANA + 3*4 Invenio + 1 scheduler)
3 8 19 (3 + 3*5 + 1)
6 4 31 (6 + 6*4 + 1)
6 8 37 (6 + 6*5 + 1)

Connections are shared between threads but not between processes. For this reason, it is not very useful to have a pool size bigger than the number of threads per process, as each thread will open one connection and none of them will be shared (e.g. with pool size equal to 5 and considering 4 threads, only 4 connections will be opened).

@mdonadoni
Copy link
Member Author

Note that REANA now supports pgBouncer, so this issue has become less urgent

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant