Reduce concurrency when using glibc-based Linux without jemalloc to help prevent memory fragmentation #2607
Hi, did you see #955? The summary is that (1) RSS includes memory that has been freed but not yet returned to the OS and (2) the memory allocator is ultimately responsible for doing this. A different OS/allocator/configuration makes all the difference.
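The gap is easy to observe from Node itself; here's a minimal sketch using the built-in `process.memoryUsage()` (the output format mirrors the script in this thread):

```js
// heapUsed is the live V8 JavaScript heap; rss is everything the OS has
// assigned to the process, including pages the allocator has freed but
// not yet returned. A large, persistent gap between the two usually
// points at the allocator rather than a JavaScript-level leak.
const mb = (bytes) => `${(bytes / 1024 / 1024).toFixed(2)}MB`;
const { heapUsed, rss } = process.memoryUsage();
console.log(`heap ${mb(heapUsed)}, rss ${mb(rss)}`);
```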
Just tried running the same script with jemalloc preloaded:

```
$ env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 node test-memory.js
100 ticks, 0 errors, heap 8.81MB, rss 178.95MB
```
You should probably put this somewhere in the README. Ideally in bold letters.
I'm glad to hear this information helped you solve your problem within 30 minutes of asking the question. Yes, we can add something to the docs about this; I'll leave this issue open to track it. The default memory allocator on most glibc-based Linux distributions is unsuitable for long-running processes that involve lots of memory allocations and usage spikes. This is independent of sharp and libvips, which are compatible with all memory allocators. Given that macOS, musl-based Linux such as Alpine and, dare I say it, even Windows manage to avoid this class of problem, I hope you'll agree it is equally incumbent on operating systems to ensure their memory allocators return freed memory.
@lovell I just re-checked and the script I posted still consumes a lot of memory even with …
Rather than simply document this, I've gone a step further. Commit e6380fd reduces the default concurrency for glibc-based Linux users who are not using jemalloc, which should help with memory fragmentation. This will be in v0.28.0. The reduced concurrency might have a slight performance impact: small images may be a bit faster thanks to a lower threadpool set-up cost, whereas larger images that benefit from concurrency could be a bit slower. The advice to select an OS and/or memory allocator designed for long-running, multi-threaded processes remains.
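For anyone who wants to inspect or override what sharp selects, both the threadpool size and libvips' operation cache can be tuned at runtime. A sketch (the numbers here are illustrative, not recommendations):

```js
const sharp = require('sharp');

// Report the current threads-per-image; from v0.28.0 the default
// depends on the detected memory allocator on glibc-based Linux.
console.log(sharp.concurrency());

// Cap the threadpool, e.g. one thread per image; pass 0 to reset
// to the default.
sharp.concurrency(1);

// Shrink libvips' operation cache as well (MB of memory, open files,
// number of operations); the values here are purely illustrative.
sharp.cache({ memory: 50, files: 0, items: 100 });
```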
I think it would be great to put this information somewhere on the Installation page. People should know the benefits of selecting the right OS and/or memory allocator for sharp (or, more accurately, for long-running, multi-threaded processes).
Good idea, thank you, added via commit d69c58a
v0.28.0 now available with dynamic concurrency based on the detected memory allocator, plus updated installation docs. https://sharp.pixelplumbing.com/install#linux-memory-allocator |
Sharp by default spawns one thread per CPU core to process images, which can lead to large memory consumption. This PR caps the `concurrency` value, especially in dev. Locally I see a small memory improvement (10~20 MB) from this, but it will mostly affect long-running servers. Related: lovell/sharp#2607

Co-authored-by: Steven <[email protected]>
Reviving this old thread: is there a reason for vips not to link against jemalloc at build time?
@kapouer Please see lovell/sharp-libvips#95 for a future possible enhancement that relates to this. |
After migrating from `gm` to `sharp` we noticed strange behaviour on our image processing servers. Their memory consumption seemed excessive and was constantly growing. At first we suspected a memory leak in our JS code, but after some research it turned out that all this memory was consumed by either `sharp` or `libvips`. Strictly speaking it's not a leak, but rather very strange and greedy memory allocation. Using Node.js 14 partially mitigates the issue, but it's still there.
Are you using the latest version? Is the version currently in use as reported by `npm ls sharp` the same as the latest version as reported by `npm view sharp dist-tags.latest`?

Yes, `[email protected]`.
What are the steps to reproduce?

Start processing some large images concurrently and memory consumption will grow very rapidly. At some point memory consumption will reach its limit and stay there, but increasing the size of the processed images will result in further memory allocation.

What is the expected behaviour?

- `sharp` should limit its memory usage to some sane threshold
- `sharp` should free allocated memory when it's no longer needed

Are you able to provide a minimal, standalone code sample, without other dependencies, that demonstrates this problem?
Here is a simple code example to illustrate strange memory consumption behaviour:
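(The listing below is a reconstruction of the idea, not the exact script; `INPUT`, `CONCURRENT` and `TOTAL_TICKS` are illustrative names chosen to match the `N ticks, N errors, heap X, rss Y` output quoted in this thread.)

```js
// Hypothetical repro: repeatedly decode and resize a large image with
// several pipelines in flight, reporting heap and RSS as we go.
const sharp = require('sharp');

const INPUT = 'large-image.jpg'; // any large image will do
const TOTAL_TICKS = 100;
const CONCURRENT = 8; // simultaneous pipelines per tick

const mb = (bytes) => `${(bytes / 1024 / 1024).toFixed(2)}MB`;

async function tick() {
  // Each pipeline decodes, resizes and outputs raw (uncompressed) pixels.
  await Promise.all(
    Array.from({ length: CONCURRENT }, () =>
      sharp(INPUT).resize(2000).raw().toBuffer()
    )
  );
}

(async () => {
  let errors = 0;
  for (let i = 1; i <= TOTAL_TICKS; i++) {
    try {
      await tick();
    } catch (err) {
      errors++;
    }
    if (i % 10 === 0) {
      const { heapUsed, rss } = process.memoryUsage();
      console.log(`${i} ticks, ${errors} errors, heap ${mb(heapUsed)}, rss ${mb(rss)}`);
    }
  }
})();
```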
Here is the result of running this script on my system using Node.js v12.18.1:
You could use any `outputFormat`. I used `raw` because it leaks memory faster than compressed formats.

At first I thought the issue was related to `.toBuffer` internal buffering, but using files or streams results in exactly the same memory consumption behaviour.

Here is an example of running this script with `jpeg` format and an infinite total-ticks threshold:

Memory consumption stops somewhere around this value, so it's not an actual leak.
Doing the same with `raw` format results in Node.js being killed by the OOM killer after consuming all available memory.

Are you able to provide a sample image that helps explain the problem?
Any large image will do; here is the one I used in my tests:
What is the output of running `npx envinfo --binaries --system`?

The same behaviour can also be reproduced using a Node Docker container.
I reproduced the aforementioned issue using `node:8`, `node:10` and `node:12` containers. With `node:14` the script output seems fine, but memory consumption is still high, and memory is still not released. But at least with `node:14`, memory consumption no longer goes through the roof.