Loading preview image causes server crash #7897

Closed
2 tasks done
BoYs-J opened this issue May 16, 2024 · 7 comments
Labels
bug (Something isn't working), need info (Need more information to investigate the issue)

Comments

BoYs-J commented May 16, 2024

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Steps to Reproduce

/api/jobs/20/preview?org=CDJLR
/api/tasks/20/preview?org=CDJLR
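
To make the reproduction less dependent on clicking through the UI, the two endpoints above could be requested concurrently with a small script. This is only a rough sketch: the base URL, the range of IDs, the worker count, and the `Token` authorization header are assumptions that need to be adapted to the deployment, and the helper names are made up for illustration.

```python
import concurrent.futures
import urllib.error
import urllib.request


def preview_urls(base_url, kind, ids, org):
    """Build preview URLs such as /api/jobs/20/preview?org=CDJLR."""
    base = base_url.rstrip("/")
    return [f"{base}/api/{kind}/{i}/preview?org={org}" for i in ids]


def fetch_status(url, auth_header=None, timeout=30):
    """Return the HTTP status code for one request, or the error text."""
    req = urllib.request.Request(url)
    if auth_header:
        # Assumption: the server accepts this Authorization header value.
        req.add_header("Authorization", auth_header)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except (urllib.error.URLError, OSError) as exc:
        return str(exc)


def request_previews(urls, workers=8, auth_header=None):
    """Request all URLs with a small thread pool and collect the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: fetch_status(u, auth_header), urls))


# Example call (hypothetical host, IDs, and token):
# urls = preview_urls("http://localhost:8080", "jobs", range(20, 30), "CDJLR")
# print(request_previews(urls, auth_header="Token <your-token>"))
```

Running it against a test instance while watching memory on the host should show whether uncached previews are what triggers the problem.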

Expected Behavior

Loading preview images consumes too much CPU resources. If the preview images are not cached and a large number of preview requests arrive at once, it can cause the server to crash instantly.

Server version: 2.13.0
Core version: 15.0.5
Canvas version: 2.20.1
UI version: 1.63.10

Possible Solution

If data is being annotated while a large number of preview image requests arrive, it is very likely that the annotations cannot be saved and are lost. Please consider fixing this.

Context

preview
cpu

Environment

- Git hash commit (`git log -1`): No
- Docker version `docker version` (e.g. Docker 17.0.05): Version 24.0.4
- Are you using Docker Swarm or Kubernetes?: Docker Swarm
- Operating System and version (e.g. Linux, Windows, MacOS): Linux Ubuntu 22.04.2 LTS
- Code example or link to GitHub repo or gist to reproduce problem: No
- Other diagnostic information / logs: No
BoYs-J added the bug label May 16, 2024
@bsekachev
Member

  • Loading preview images consumes too much CPU resources
  • it can cause the server to crash instantly.

Why do you think the server may crash because of high CPU usage?
And what do you mean by "crash"? Was the server container restarted? Was the uvicorn process killed?
I can't see any logs or evidence of a server crash in the screenshots you provided.
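
One way to answer the "process killed?" question is to look for OOM-killer messages in the kernel log. The sketch below assumes a Linux host whose kernel logs lines containing `Killed process <pid> (<name>)`; the helper name is made up, and the log text has to come from something like `dmesg` or `journalctl -k`, which may require root.

```python
import re

# Matches the "Killed process <pid> (<name>)" part of an OOM-killer log line.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log_text):
    """Return a list of (pid, process_name) for each OOM kill found."""
    return [
        (int(match.group(1)), match.group(2))
        for match in OOM_KILL.finditer(kernel_log_text)
    ]


# Example call (reads the kernel ring buffer; may need elevated privileges):
# import subprocess
# log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
# print(find_oom_kills(log))
```

An empty result suggests that nothing was killed for lack of memory, which would point to a hang rather than a crash.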

bsekachev added the need info label May 16, 2024
@BoYs-J
Author

BoYs-J commented May 16, 2024

Strictly speaking, it caused the whole system to hang, and the system only resumed once the requests had been processed. While it was hung, nothing on the system could be used, including SSH connections. I saved the screenshot above while SSH was still working.

I'm not sure of the specific cause. I suspect the preview images, which sometimes take a long time to load.
[Screenshot: 1715841903337]

From 2024-05-16 06:06:53 to 2024-05-16 06:13:55 the system could not be used. Here are some logs.

cvat_server.log

[Screenshot: 1715841389238]

nginx_access.log

[Screenshot: 1715841424454]

nginx_error.log

[Screenshot: 1715841440763]

@bsekachev
Member

More likely you ran out of system memory and processes started using swap.
Heavy swapping usually freezes the whole system.
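
A quick way to check this on a Linux host is to read `/proc/meminfo` and compare available memory with swap in use. This is a minimal sketch; the function names are illustrative, and the field names (`MemAvailable`, `SwapTotal`, `SwapFree`) are the standard ones reported by the kernel in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a {field: value} dict (values in kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_pressure(info):
    """Summarize available memory and swap in use, both in MiB."""
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "available_mib": info.get("MemAvailable", 0) // 1024,
        "swap_used_mib": swap_used // 1024,
    }


# Example call on a Linux host:
# with open("/proc/meminfo") as f:
#     print(memory_pressure(parse_meminfo(f.read())))
```

If `available_mib` is close to zero and `swap_used_mib` keeps growing while previews load, that would support the swap explanation.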

@bsekachev
Member

Try running htop on the server, watch it, and take a screenshot when the system freezes.
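
If the freeze makes it hard to take an htop screenshot at the right moment, a small script that lists the processes with the largest resident memory can be run periodically and its output appended to a file. This is a sketch for Linux only; it reads `Name` and `VmRSS` from `/proc/<pid>/status`, and the function names are made up for illustration.

```python
import os


def rss_kib(status_text):
    """Return VmRSS in kB from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def process_name(status_text):
    """Return the process name from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            return line.split(":", 1)[1].strip()
    return "?"


def top_memory_processes(limit=10, proc_root="/proc"):
    """Return up to `limit` (rss_kib, pid, name) tuples, largest RSS first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc_root, entry, "status")
            with open(path, errors="replace") as f:
                text = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        rss = rss_kib(text)
        if rss is not None:
            rows.append((rss, int(entry), process_name(text)))
    return sorted(rows, reverse=True)[:limit]


# Example call on a Linux host:
# for rss, pid, name in top_memory_processes():
#     print(f"{rss // 1024:>8} MiB  {pid:>7}  {name}")
```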

@BoYs-J
Author

BoYs-J commented May 16, 2024

The system resource usage is too high, and both CPU and memory are fully loaded.
Is querying preview images particularly resource intensive?

[Screenshot: 1715845074137]

[Screenshot: 1715845107811]

[Screenshot: 1715845199955]

@bsekachev
Member

Is querying preview images particularly resource intensive?

Not really.

Your server does not have enough hardware resources to run CVAT alongside the other services (MongoDB? something else?).
I can't see much CPU usage from the CVAT processes, which means that memory is the bottleneck. Try increasing it.

I would say a minimum of 8 GB of RAM is needed just for a CVAT server under low load (for the system itself, Docker, Postgres, Redis, KeyDB, ClickHouse, nginx, and the Python processes).

@BoYs-J
Author

BoYs-J commented May 16, 2024

The second image above is the htop output captured while CVAT was querying thumbnails; by then the queries had already consumed the remaining resources.

With all Docker containers running, memory usage is 6.x GB.
As soon as I search for content on "/jobs?page=1", loading the thumbnails fills up the memory.
Even though little memory remains (2 GB+), I think it is abnormal that querying thumbnails freezes the whole system.

[Screenshot]
