Implement caching to avoid a ton of requests to crowdsec api #21

Open
aleksandarmomic opened this issue Mar 3, 2022 · 8 comments
@aleksandarmomic

aleksandarmomic commented Mar 3, 2022

Currently, every request is forwarded to crowdsec one by one, which is slow and resource intensive. In my setup I have additionally set up mariadb, and calling crowdsec on every request results in a call to the db. All of this could be avoided with a single json file of cached ip addresses on the bouncer's side, similar to how the cloudflare bouncer caches them.
This also results in pretty big mariadb binary logs.
A simple cache mechanism would save space and improve performance by reducing the load on the system. File-based caching (like json) would be enough, but redis would be awesome.

@fbonalair
Owner

Hi, thanks for the suggestion.

Yes, caching would definitely be a nice feature, plus the data are quite suitable to be cached.

I'm more concerned about the cache duration, since the bouncer does not own the data, nor can it be notified of changes.
I guess the ban duration is a first step? Though a scanner would then be allowed for far too long. Maybe also cache eviction after a number of calls?

For the cache location, I'm thinking of memory first: nothing is faster, and it's easier to garbage collect in case of a bug. Besides, disk I/O is a pain I don't want to get into...
Second would be Redis, well known and battle tested.
What do you think?
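The "eviction after a number of calls" idea above can be sketched quickly. This is a hypothetical illustration, not code from the bouncer: each cached verdict is dropped after it has served a fixed number of lookups, so a stale "not banned" answer cannot be reused forever.

```go
package main

import "fmt"

// hitCappedEntry is a hypothetical cache entry that is evicted after
// serving a fixed number of lookups, bounding how long a stale
// "clean" verdict can be reused.
type hitCappedEntry struct {
	banned   bool
	hitsLeft int
}

type hitCappedCache struct {
	maxHits int
	entries map[string]*hitCappedEntry
}

func newHitCappedCache(maxHits int) *hitCappedCache {
	return &hitCappedCache{maxHits: maxHits, entries: make(map[string]*hitCappedEntry)}
}

func (c *hitCappedCache) Put(ip string, banned bool) {
	c.entries[ip] = &hitCappedEntry{banned: banned, hitsLeft: c.maxHits}
}

// Get returns (banned, found). After maxHits lookups the entry is
// dropped, forcing the next request back to the LAPI.
func (c *hitCappedCache) Get(ip string) (bool, bool) {
	e, ok := c.entries[ip]
	if !ok {
		return false, false
	}
	e.hitsLeft--
	if e.hitsLeft <= 0 {
		delete(c.entries, ip)
	}
	return e.banned, true
}

func main() {
	c := newHitCappedCache(2)
	c.Put("203.0.113.7", false)
	_, found := c.Get("203.0.113.7") // hit 1: served from cache
	fmt.Println(found)
	_, found = c.Get("203.0.113.7") // hit 2: served, then evicted
	fmt.Println(found)
	_, found = c.Get("203.0.113.7") // gone: would re-query the LAPI
	fmt.Println(found)
}
```

This could be combined with a time-based TTL so an entry falls out on whichever limit is reached first.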

@fbonalair fbonalair self-assigned this Mar 4, 2022
@tinolin

tinolin commented Mar 4, 2022

Hello! Here's what I think:

  1. CacheTime: configurable via an environment variable
  2. Redis is the best fit for that.

@aleksandarmomic
Copy link
Author


@fbonalair
With a quick look at the crowdsec api, the values returned from the decisions endpoint include the decision duration and an "until" timestamp per decision. I believe those can be used to control the cache lifetime. This looks pretty promising for Redis as a cache, since it can control the lifetime per ip, which is not the case for in-memory or file-based caching, where you would have to handle the deletion of expired decisions yourself.

@fbonalair
Owner

@fbonalair With a quick look at the crowdsec api, the values returned from the decisions endpoint include the decision duration and the "until" timestamp per decision. I believe those can be used to control the cache lifetime.

That is what I was thinking of using for the cache lifetime. Though, I'm still worried about first offenders getting unrestricted access for the cache duration.
While looking for a solution, I will make the caching system opt-in and add a warning.

Hello! Here's what I think:

1. CacheTime: configurable via an environment variable

Depending on the caching solution, some of its parameters will be available through environment variables. Thanks for the suggestion.

@el-joseppe

Any updates on this issue?

I had to shut down my traefik-crowdsec-bouncer. My server would randomly become unresponsive, even over ssh, because the bouncer container got unstable and jammed up the cpu. My initial guess was that this is some sort of overload issue: too many requests, and therefore too many calls to the crowdsec LAPI via the bouncer middleware.

I read some of the documentation over at crowdsec and found that the official nginx bouncer has two operation modes:

  • Live mode (query the local API for each request, like the traefik-crowdsec-bouncer, right?)
  • Stream mode (poll the local API for new/old decisions every X seconds, in combination with a cache and a configurable CACHE_EXPIRATION parameter)

That sounds like a solid solution to me. Wouldn't that also be beneficial for the traefik bouncer, especially in a more demanding environment or with limited resources?

@mathieuHa

Hello,

I've been following this project for a while and wanted to contribute somehow.

I've implemented a local cache using the go-cache library.

It is configurable using 2 environment variables:

  • CROWDSEC_BOUNCER_ENABLE_LOCAL_CACHE - Enables a local in-memory cache. Defaults to false
  • CROWDSEC_DEFAULT_CACHE_DURATION - Configures the default duration of the cached data. Defaults to "4h00m00s"

When the cache is enabled, the first time an IP has to be checked, it is looked up in the local cache first.
This can produce 2 outcomes:

  • the IP was found (whether it was considered malicious or not) -> we can continue without asking crowdsec
  • the IP was not found -> we have to ask crowdsec and cache the result after the first request
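The cache-first flow with its two outcomes can be sketched like this. The names (`verdictCache`, `isBanned`, `queryLAPI`) are illustrative stand-ins, not the bouncer's actual types:

```go
package main

import "fmt"

// verdictCache is a stand-in for the local cache: it maps an IP to
// its last known verdict (true = banned).
type verdictCache map[string]bool

// isBanned sketches the cache-first lookup: consult the local cache,
// and only fall back to the LAPI on a miss, caching whatever answer
// comes back so the next request skips CrowdSec entirely.
func isBanned(ip string, cache verdictCache, queryLAPI func(string) bool) bool {
	// Outcome 1: IP found in cache (malicious or not), skip crowdsec.
	if banned, ok := cache[ip]; ok {
		return banned
	}
	// Outcome 2: cache miss, ask crowdsec once and cache the result.
	banned := queryLAPI(ip)
	cache[ip] = banned
	return banned
}

func main() {
	calls := 0
	fakeLAPI := func(ip string) bool {
		calls++ // count round-trips to the (fake) LAPI
		return ip == "192.0.2.1"
	}
	cache := verdictCache{}
	fmt.Println(isBanned("192.0.2.1", cache, fakeLAPI)) // miss: queries the LAPI
	fmt.Println(isBanned("192.0.2.1", cache, fakeLAPI)) // hit: served from cache
	fmt.Println(calls)                                  // only one LAPI call was made
}
```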

Cache invalidation is provided by the library: a background job removes every entry that is no longer valid from the cache.
This background job runs every 5 min (this could be made configurable), and the default cache validity is 4h, which can be overridden using CROWDSEC_DEFAULT_CACHE_DURATION.

I've got some ideas on how to implement a configurable redis version as well, to combine the cache with streaming mode, which could greatly improve performance.

What do you think about this?
@el-joseppe
@fbonalair

@mathieuHa

I've just finished working on the streaming mode; it works pretty well.

At startup it fetches all known banned IPs and caches them locally, and then every minute the local cache is updated with only the new information.
I used the robfig/cron library for the recurring job.

It can also be configured with env variables:

  • CROWDSEC_LAPI_ENABLE_STREAM_MODE - Enables streaming mode to pull decisions from the LAPI. Overrides CROWDSEC_BOUNCER_ENABLE_LOCAL_CACHE and enables it. Defaults to "true"
  • CROWDSEC_LAPI_STREAM_MODE_INTERVAL - Defines the interval between two calls to the LAPI. Defaults to "1m"
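The core of the stream-mode update described above (full state at startup, then deltas each interval) can be sketched without the cron wiring. The `decisionsStream` shape and function names here are illustrative, not the PR's actual code:

```go
package main

import "fmt"

// decisionsStream mirrors the general shape of a decisions stream
// response: new decisions to add and deleted ones to drop. The field
// names are illustrative.
type decisionsStream struct {
	New     []string // IPs with fresh ban decisions
	Deleted []string // IPs whose decisions expired or were removed
}

// applyStream folds one poll of the stream endpoint into the local
// ban cache; in the real bouncer this would run on a recurring job
// every CROWDSEC_LAPI_STREAM_MODE_INTERVAL.
func applyStream(cache map[string]bool, s decisionsStream) {
	for _, ip := range s.New {
		cache[ip] = true
	}
	for _, ip := range s.Deleted {
		delete(cache, ip)
	}
}

func main() {
	cache := map[string]bool{}
	// The first poll at startup carries the full set of current bans.
	applyStream(cache, decisionsStream{New: []string{"192.0.2.1", "198.51.100.2"}})
	// Later polls only carry the delta since the previous call.
	applyStream(cache, decisionsStream{Deleted: []string{"192.0.2.1"}})
	fmt.Println(cache["192.0.2.1"], cache["198.51.100.2"]) // unbanned, still banned
}
```

Between polls, requests are answered entirely from the local cache, which is what removes the per-request LAPI (and database) round-trip.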

I took the liberty of enabling it by default.
Any feedback is appreciated @fbonalair

@fbonalair
Owner

I took the liberty of reviewing only PR #33, since it's written on top of #32.
Anyway, many thanks for the work! I have left some comments as reviews.

About the default mode, a couple of thoughts:

  1. It is a breaking change: the service won't behave the same way as before. For people not pinning their container version, I prefer avoiding the "unwanted" change.
  2. Strictly speaking, the stream interval leaves an unknown malicious user free to do whatever they want during that time. I prefer users making that choice knowing the drawback.
  3. I would prefer to wait for feedback from users before making it the default mode. Defensive with security I am.

To prepare for a redis cache or other caches, it would be nice to externalize the cache logic into its own file / service / folder, and depending on the user's configuration, the right one would be initialized in bouncer.go. That was my rough start in the feat/cache branch.
Though it's not mandatory for a first cache implementation.
