Add a succinct, "consumer grade" benchmark (or document how to run one from the current selection) #190

Open
RJVB opened this issue Aug 8, 2023 · 3 comments

RJVB commented Aug 8, 2023

Would it be possible to add a succinct benchmark where "mere users" can compare the performance benefits of mimalloc and some other common alternative allocators against the standard/system allocator, in a more or less representative workload? Or document how to run one from the available code?

An option to build against the installed allocators would be great too!

mjp41 (Collaborator) commented Aug 8, 2023

> Would it be possible to add a succinct benchmark where "mere users" can compare the performance benefits of mimalloc and some other common alternative allocators against the standard/system allocator, in a more or less representative workload? Or document how to run one from the available code?

This suite was designed by @daanx to find problems with allocators, rather than as a consumer-facing tool for picking your allocator.

That said, several benchmarks in the suite are more representative of realistic workloads:

  • redis
  • rocksdb
  • gs
  • lua
  • cfrac

Running those will tell you how well those applications might perform. If they are close to your workload, then they might inform your choice.

There are other aspects of allocator performance not tested by these benchmarks:

  • long term fragmentation
  • massive memory footprints
  • latency profiles
  • memory usage over time (not just peak)

> An option to build against the installed allocators would be great too!

You can specify

  ../../bench.sh --external=file.txt [set of benchmarks]

where file.txt contains lines of the form

  allocator_name /path/to/allocator/library.so

This will then run with those allocators. For example, I just ran

  ../../bench.sh --external=alloc-16-32.txt allt -r=40

where alloc-16-32.txt contains

  sn-16 /home/mjp41/snmalloc/release/libsnmallocshim-16.so
  sn-32 /home/mjp41/snmalloc/release/libsnmallocshim-32.so
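For installed allocators, the list file described above could be generated rather than written by hand. The following is only a sketch: the function name `write_allocator_list` and the `name=path` argument form are hypothetical helpers for illustration, not part of bench.sh. It writes one `name /path/to/library.so` line per library that actually exists and skips the rest.

```shell
#!/bin/sh
# Hypothetical helper (not part of bench.sh): write an allocator list in the
# "name /path/to/library.so" form that --external expects.
# Usage: write_allocator_list OUTFILE name=/path/lib.so [name=/path/lib.so ...]
write_allocator_list() {
    out=$1
    shift
    : > "$out"                      # start with an empty file
    for entry in "$@"; do
        name=${entry%%=*}           # text before the first '='
        path=${entry#*=}            # text after the first '='
        if [ -f "$path" ]; then
            printf '%s %s\n' "$name" "$path" >> "$out"
        else
            printf 'skipping %s: %s not found\n' "$name" "$path" >&2
        fi
    done
}
```

A call might look like `write_allocator_list allocs.txt mi=/usr/lib/libmimalloc.so je=/usr/lib/libjemalloc.so` (both paths are assumptions and depend on the system), followed by `../../bench.sh --external=allocs.txt allt`.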

RJVB (Author) commented Aug 9, 2023

Thanks, with that additional information I've been able to cobble together something for MacPorts! (https://github.com/RJVB/macstrop/blob/678f0708981ce92f76a6b6154a75ed3bdb3eb4de/devel/mimalloc/Portfile#L61)

RJVB (Author) commented Aug 10, 2023

FWIW, I see very little timing benefit in real-world tasks, even in a Python/SQLite benchmark of operations on an in-memory database. What's more, the system allocator (on both Mac and Linux) seems to be the most economical in terms of total memory used, page faults and context switches (if I interpret the output from GNU `time -v` correctly).
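A comparison like the one described above could be scripted roughly as follows. This is a sketch under stated assumptions, not the commands that were actually run: `measure` and `summarize_time_report` are hypothetical helper names, GNU time is assumed to be at /usr/bin/time (on macOS it is often installed as `gtime`), and the allocator is injected with `LD_PRELOAD`, which is the Linux mechanism.

```shell
#!/bin/sh
# Assumption: GNU time lives at /usr/bin/time; override TIME_BIN otherwise.
TIME_BIN=${TIME_BIN:-/usr/bin/time}

# Print only the lines of a "time -v" report relevant to this comparison:
# wall-clock time, peak resident set size, page faults, context switches.
summarize_time_report() {
    grep -E 'Elapsed \(wall clock\)|Maximum resident set size|page faults|context switches' "$1" |
        sed 's/^[[:space:]]*//'
}

# measure LIB CMD [ARGS...]: run CMD with LIB preloaded (empty LIB = system allocator).
# LD_PRELOAD is Linux-specific; macOS would need DYLD_INSERT_LIBRARIES instead.
measure() {
    lib=$1
    shift
    report=$(mktemp)
    "$TIME_BIN" -v -o "$report" env LD_PRELOAD="$lib" "$@" > /dev/null
    summarize_time_report "$report"
    rm -f "$report"
}
```

For example, `measure "" python3 bench.py` followed by `measure /usr/lib/libmimalloc.so python3 bench.py` would print the same fields for the system allocator and for a preloaded one; the script name and library path are placeholders.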

Fortunately, my main interest in a replacement for malloc & family has always been the hope that it would give freed memory back to the system more aggressively, and thus lead to less swap space usage and/or fragmentation. jemalloc has an optional background thread that takes care of that; it's much less clear if and how the others do this.
