
Reduce concurrency when using glibc-based Linux without jemalloc to help prevent memory fragmentation #2607

Closed
lbeschastny opened this issue Mar 4, 2021 · 11 comments

Comments

lbeschastny commented Mar 4, 2021

After migrating from gm to sharp we noticed strange behaviour on our image processing servers: their memory consumption seemed excessive and kept growing.
At first we suspected a memory leak in our JS code, but after some research it turned out that all this memory was consumed by either sharp or libvips. Strictly speaking it is not a leak, but rather very strange and greedy memory allocation.

Using node.js 14 partially mitigates the issue, but it is still there.

Are you using the latest version? Is the version currently in use as reported by npm ls sharp the same as the latest version as reported by npm view sharp dist-tags.latest?

Yes, [email protected]

What are the steps to reproduce?

Start processing some large images concurrently and memory consumption will grow very rapidly.

At some point memory consumption will reach its limit and stay there, but increasing the size of the processed images will result in further memory allocation.

What is the expected behaviour?

  • sharp should limit its memory usage to some sane threshold (see the cache-limit sketch after this list)
  • sharp should free allocated memory when it is no longer needed
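
For completeness, sharp does already expose limits for its internal operation cache, although they do not help in this particular case because the script below disables the cache entirely. A minimal sketch of those knobs, with illustrative values rather than recommendations:

const sharp = require('sharp');

// bound sharp's internal libvips operation cache (values are illustrative):
// at most 200 MB of cached operations, 20 open files and 100 cached items
sharp.cache({ memory: 200, files: 20, items: 100 });

// calling cache() with no arguments reports the current limits and usage
console.log(sharp.cache());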

Are you able to provide a minimal, standalone code sample, without other dependencies, that demonstrates this problem?

Here is a simple code example to illustrate the strange memory consumption behaviour:

const sharp = require('sharp');

const filePath = '4ce2474a893e1e02e5f342808fe618a7.jpg';
const outputFormat = 'raw'; // raw format grows memory fastest
const parallelTasks = 10;
const totalTicks = 100;

const bytesToMb = (bytes) => (bytes / Math.pow(1024, 2)).toFixed(2);

// disable sharp's internal cache to make the memory growth easier to observe
sharp.cache(false);

let pendingCount = 0;
let ticksCount = 0;
let errorsCount = 0;

const tick = async () => {
  pendingCount += 1;

  try {
    await sharp(filePath).toFormat(outputFormat).toBuffer();
  } catch (err) {
    console.log(err);
    errorsCount += 1;
  } finally {
    pendingCount -= 1;
    ticksCount += 1;

    const {rss, heapTotal} = process.memoryUsage();

    // overwrite the previous status line with the latest counters
    process.stdout.clearLine(0);
    process.stdout.cursorTo(0);
    process.stdout.write(
      `${ticksCount} ticks, ${errorsCount} errors, heap ${bytesToMb(heapTotal)}MB, rss ${bytesToMb(rss)}MB`
    );

    // keep scheduling work until the requested number of ticks has completed
    if (ticksCount + pendingCount < totalTicks) {
      setTimeout(tick, 0);
    }
  }
};

// start the initial batch of concurrent tasks
Array(parallelTasks).fill().forEach(() => tick());

Here is a result of running this script on my system using node.js v12.18.1:

↪ node test-memory.js
100 ticks, 0 errors, heap 8.48MB, rss 2105.86MB

You could use any outputFormat. I used raw because it leaks memory faster than compressed formats.

At first I thought the issue was related to .toBuffer internal buffering, but using files or streams results in exactly the same memory consumption behaviour.

Here is an example of running this script with jpeg format and an infinite total ticks threshold:

↪ node test-memory.js
1606 ticks, 0 errors, heap 5.48MB, rss 1406.85MB

Memory consumption stops somewhere around this value, so it's not an actual leak.

Doing the same with raw format results in node.js being killed by OOM killer after consuming all available memory.

Are you able to provide a sample image that helps explain the problem?

Any large image will do; here is the one I used in my tests:

sample image

What is the output of running npx envinfo --binaries --system?

  System:
    OS: Linux 5.8 Ubuntu 20.04.2 LTS (Focal Fossa)
    CPU: (8) x64 Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
    Memory: 11.44 GB / 15.35 GB
    Container: Yes
    Shell: 5.0.17 - /bin/bash
  Binaries:
    Node: 12.18.1 - ~/.nvm/versions/node/v12.18.1/bin/node
    Yarn: 1.22.4 - ~/.nvm/versions/node/v12.18.1/bin/yarn
    npm: 6.14.5 - ~/.nvm/versions/node/v12.18.1/bin/npm

The same behaviour can also be reproduced using the official node Docker containers.

I reproduced the aforementioned issue using the node:8, node:10 and node:12 containers.

With node:14 the script output looks better, but memory consumption is still high and memory is still not released.
At least with node:14 memory consumption no longer goes through the roof.

lbeschastny changed the title from "sharp consumes a lot of memory and never releases it unless node@14 is used" to "sharp consumes a lot of memory and never releases it" Mar 4, 2021
lovell commented Mar 4, 2021

Hi, did you see #955? The summary is that (1) RSS includes memory that has been freed but not yet returned to the OS and (2) the memory allocator is ultimately responsible for doing this. A different OS/allocator/configuration makes all the difference.

@lovell lovell added question and removed triage labels Mar 4, 2021
@lbeschastny (Author)

Just tried running the same script with libjemalloc.so.2 and it helped:

↪ env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 node test-memory.js
100 ticks, 0 errors, heap 8.81MB, rss 178.95MB
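
A quick way to confirm jemalloc was actually picked up, assuming a Linux system with /proc available and the script name used above, is to check the running process's memory mappings:

# find the node process running the test script and look for libjemalloc in its mappings
pid=$(pgrep -f test-memory.js)
grep -m1 jemalloc "/proc/$pid/maps"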

lbeschastny commented Mar 4, 2021

You should probably put this somewhere in the readme, ideally in bold letters.
It seems sharp is currently incompatible with the default memory allocator on Debian and CentOS systems.

lovell commented Mar 4, 2021

I'm glad to hear this information helped you solve your problem within 30 minutes of you asking the question.

Yes, we can add something to the docs about this - I'll leave this issue open to track it.

The default memory allocator on most glibc-based Linux distributions is unsuitable for long-running processes that involve lots of memory allocations and usage spikes. This is independent of sharp and libvips, which are compatible with all memory allocators. Given that macOS, musl-based Linux such as Alpine and, dare I say it, even Windows manage to avoid this class of problem, I hope you'll agree it is equally incumbent on operating systems to ensure their memory allocators return freed memory.
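
A related mitigation, when preloading a different allocator is not an option, is to cap the number of arenas glibc malloc creates, which tends to reduce this kind of fragmentation at some cost to multi-threaded allocation throughput. A minimal sketch, assuming a bash-like shell and the test script from above:

# cap glibc malloc at two arenas to reduce fragmentation;
# the value 2 is illustrative - measure before and after for your workload
MALLOC_ARENA_MAX=2 node test-memory.js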

lovell changed the title from "sharp consumes a lot of memory and never releases it" to "Document importance of memory allocator selection for long running processes on glibc-based Linux" Mar 4, 2021

lbeschastny commented Mar 4, 2021

@lovell I just re-checked and the script I posted still consumes a lot of memory even with jemalloc.
But it works great for normal output formats, which is all I really need.

lovell commented Mar 10, 2021

Rather than simply document this, I've gone a step further. Commit e6380fd reduces the default concurrency for glibc-based Linux users that are not using jemalloc, which should help with memory fragmentation. This will be in v0.28.0.

A reduced concurrency might have a slight performance impact. For small images things may be a bit faster as there's a lower threadpool set-up cost, whereas larger images that benefit from concurrency could be a bit slower.

The advice to select an OS and/or memory allocator designed for long-running, multi-threaded processes remains.
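
For anyone who prefers to control this explicitly rather than rely on the detected default, sharp already exposes the size of the libvips threadpool via sharp.concurrency(); a minimal sketch, with the value 1 chosen purely for illustration:

const sharp = require('sharp');

// cap the size of libvips' threadpool; a value of 0 resets it to the default
sharp.concurrency(1);

// calling concurrency() with no argument returns the current setting
console.log(sharp.concurrency());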

@lovell lovell added this to the v0.28.0 milestone Mar 10, 2021
lovell changed the title from "Document importance of memory allocator selection for long running processes on glibc-based Linux" to "Reduce concurrency when using glibc-based Linux without jemalloc to help prevent memory fragmentation" Mar 10, 2021
@lbeschastny (Author)

> The advice to select an OS and/or memory allocator designed for long-running, multi-threaded processes remains.

I think it would be great to put this information somewhere on the Installation page.

People should know the benefits of selecting the right OS and/or memory allocator for sharp (or, more accurately, for long-running, multi-threaded processes).

lovell commented Mar 18, 2021

Good idea, thank you, added via commit d69c58a

lovell commented Mar 29, 2021

v0.28.0 now available with dynamic concurrency based on the detected memory allocator, plus updated installation docs.

https://sharp.pixelplumbing.com/install#linux-memory-allocator
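
For Docker users on a Debian-based image, switching allocator amounts to installing jemalloc and preloading it. A minimal sketch; the base image, package name and library path below match Debian bullseye on amd64 and will need adjusting for other distributions or architectures:

FROM node:16-bullseye-slim

# install jemalloc from the distribution repositories
RUN apt-get update \
    && apt-get install -y --no-install-recommends libjemalloc2 \
    && rm -rf /var/lib/apt/lists/*

# preload jemalloc so node (and therefore sharp/libvips) allocates through it
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2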

@lovell lovell closed this as completed Mar 29, 2021
kodiakhq bot pushed a commit to vercel/next.js that referenced this issue Aug 11, 2023
Sharp by default spawns #cpu_cores threads to process images, which could lead to large memory consumption. This PR caps the `concurrency` value, especially on dev.

Locally I see a small memory improvement (10~20 MB) from this, but it will mostly affect long-running servers.

Related: lovell/sharp#2607

Co-authored-by: Steven <[email protected]>

kapouer commented Aug 3, 2024

Reviving this old thread: is there a reason for vips not to link to jemalloc at build time?

@kleisauke (Contributor)

@kapouer Please see lovell/sharp-libvips#95 for a future possible enhancement that relates to this.
