
[Self-Host] CPU and memory usage climbs very high, and the resources are not released after the crawl task stops #722

Open
zhiweijie opened this issue Oct 1, 2024 · 2 comments

@zhiweijie

Describe the Issue
I deployed the V1 version on a VPS. It ran well at first, but after a while the system resources became overloaded and the firecrawl worker stopped accepting crawl tasks. I have to restart the container to make it work normally again. Is there something wrong with my configuration?

To Reproduce
Steps to reproduce the issue:

  1. Start the container
  2. Run about 30 crawl tasks in succession
  3. Server memory usage climbs from 13% to 85%
  4. Memory does not return to 13% after the crawl jobs finish

Expected Behavior
2 CPU cores and 4 GB of RAM should be sufficient for this workload, and memory should be released once the crawl jobs complete.

Screenshots
(Screenshot attached: CleanShot 2024-10-01 at 10.18.27@2x)

Environment (please complete the following information):

  • OS: Debian OS 10
  • Firecrawl Version: V1.0.0
  • Node.js Version: -
  • Docker Version (if applicable): 26.1.4
  • Database Type and Version: --

Configuration
Only the default ports were changed; everything else is the default configuration.

```yaml
name: firecrawl

x-common-service: &common-service
  build: apps/api
  networks:
    - backend
  environment:
    - REDIS_URL=${REDIS_URL:-redis://redis:6379}
    - REDIS_RATE_LIMIT_URL=${REDIS_URL:-redis://redis:6379}
    - PLAYWRIGHT_MICROSERVICE_URL=${PLAYWRIGHT_MICROSERVICE_URL:-http://playwright-service:8721}
    - USE_DB_AUTHENTICATION=${USE_DB_AUTHENTICATION}
    - PORT=${PORT:-8722}
    - NUM_WORKERS_PER_QUEUE=${NUM_WORKERS_PER_QUEUE}
    - OPENAI_API_KEY=${OPENAI_API_KEY}
    - OPENAI_BASE_URL=${OPENAI_BASE_URL}
    - MODEL_NAME=${MODEL_NAME:-gpt-4o}
    - SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL}
    - LLAMAPARSE_API_KEY=${LLAMAPARSE_API_KEY}
    - LOGTAIL_KEY=${LOGTAIL_KEY}
    - BULL_AUTH_KEY=${BULL_AUTH_KEY}
    - TEST_API_KEY=${TEST_API_KEY}
    - POSTHOG_API_KEY=${POSTHOG_API_KEY}
    - POSTHOG_HOST=${POSTHOG_HOST}
    - SUPABASE_ANON_TOKEN=${SUPABASE_ANON_TOKEN}
    - SUPABASE_URL=${SUPABASE_URL}
    - SUPABASE_SERVICE_TOKEN=${SUPABASE_SERVICE_TOKEN}
    - SCRAPING_BEE_API_KEY=${SCRAPING_BEE_API_KEY}
    - HOST=${HOST:-0.0.0.0}
    - SELF_HOSTED_WEBHOOK_URL=${SELF_HOSTED_WEBHOOK_URL}
    - LOGGING_LEVEL=${LOGGING_LEVEL}
  extra_hosts:
    - "host.docker.internal:host-gateway"

services:
  playwright-service:
    build: apps/playwright-service
    environment:
      - PORT=8721
      - PROXY_SERVER=${PROXY_SERVER}
      - PROXY_USERNAME=${PROXY_USERNAME}
      - PROXY_PASSWORD=${PROXY_PASSWORD}
      - BLOCK_MEDIA=${BLOCK_MEDIA}
    networks:
      - backend

  api:
    <<: *common-service
    depends_on:
      - redis
      - playwright-service
    ports:
      - "8722:8722"
    command: [ "pnpm", "run", "start:production" ]

  worker:
    <<: *common-service
    depends_on:
      - redis
      - playwright-service
      - api
    command: [ "pnpm", "run", "workers" ]

  redis:
    image: redis:alpine
    networks:
      - backend
    command: redis-server --bind 0.0.0.0

networks:
  backend:
    driver: bridge
```
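A mitigation worth trying (an assumption on my side, not a confirmed fix) is to cap memory per service, so that a leaking browser container gets OOM-killed and restarted instead of starving the whole VPS. With the Docker Compose plugin this could be sketched as:

```yaml
# Hypothetical overrides -- the limit values are guesses sized for a 2-core / 4 GB VPS.
services:
  playwright-service:
    mem_limit: 1536m        # restart the browser container before it exhausts the host
    restart: unless-stopped
  worker:
    mem_limit: 1024m
    restart: unless-stopped
```

With `restart: unless-stopped`, a container killed at its memory limit comes back automatically, which at least avoids the manual restarts described above.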

@nickscamara
Member

@mogery can you look into this when you get a chance?

@timkley

timkley commented Dec 16, 2024

We use a similar setup; Playwright seems to hog all the resources sometimes (this has only happened to me when using the crawl endpoint). I can see about 20 processes in htop, but they never seem to finish. Could a missing timeout be causing the browsers to never terminate?

Our VPS also has 2 vCPUs and 4 GB of RAM, and the setup looks nearly identical to OP's.

Let me know if additional logs would be helpful.
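If orphaned browser processes are the suspect, a quick check is to count Chromium-like processes on the host and sum their resident memory. This is only a diagnostic sketch; the `chrom` pattern is an assumption about which browser binary the Playwright service actually launches:

```shell
# Count Chromium-like processes and sum their resident memory (RSS, in KB).
# The "chrom" pattern is a guess -- adjust it to the browser Playwright uses.
count=$(ps -eo comm | awk '/chrom/ {n++} END {print n + 0}')
total_kb=$(ps -eo rss,comm | awk '/chrom/ {sum += $1} END {print sum + 0}')
echo "browser processes: ${count}, total RSS: ${total_kb} KB"
```

If the count keeps growing after crawls finish, a missing browser or page timeout is a plausible cause.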
