High Memory Usage with Large Text Files #39

Open
monteiz opened this issue Mar 3, 2023 · 1 comment

monteiz commented Mar 3, 2023

Hello,
I am seeing high memory usage when using the npm package mmap-object to analyze large files. The process uses about twice the file size in memory, and for larger files three or four times the file size.

I have created a public GitHub repository with a test application that demonstrates the issue: https://github.com/monteiz/mmap-object-memory-issue. The text file used by the test application is ASCII encoded, so each character should take only one byte.

Here are the relevant logs produced by the test application:

Text file size: 40 MiB
Analysing master random_strings.txt on pid 14118...
Lines count: 107957

Master file for random_strings.txt successfully analysed.

Processed byte: 40 MiB
RSS: 91 MiB
HeapTotal: 12 MiB
HeapUsed: 5.5 MiB
External: 2.2 MiB
ArrayBuffers: 1.2 MiB

As shown in the logs, the analyzed file size is 40 MiB, but the memory usage is much higher, with the RSS being 91 MiB.
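
For anyone wanting to collect comparable figures, the fields in the log correspond to what Node's built-in `process.memoryUsage()` returns (`rss`, `heapTotal`, `heapUsed`, `external`, `arrayBuffers`). The following is a minimal sketch, not the code from the test repository; `formatBytes` and `memoryReport` are hypothetical helper names introduced here for illustration.

```javascript
// Hypothetical helper: turns a byte count into a human-readable string
// using binary units (KiB, MiB, GiB), with one decimal place.
function formatBytes(bytes) {
  const units = ["B", "KiB", "MiB", "GiB"];
  let value = bytes;
  let unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  return `${value.toFixed(1)} ${units[unit]}`;
}

// Hypothetical helper: snapshots the current process memory figures.
// process.memoryUsage() is a Node.js built-in that reports byte counts.
function memoryReport() {
  const usage = process.memoryUsage();
  return {
    RSS: formatBytes(usage.rss),
    HeapTotal: formatBytes(usage.heapTotal),
    HeapUsed: formatBytes(usage.heapUsed),
    External: formatBytes(usage.external),
    ArrayBuffers: formatBytes(usage.arrayBuffers),
  };
}

console.log(memoryReport());
```

Note that RSS counts every resident page of the process, including pages of a memory-mapped file that have been touched, so it is expected to be larger than the V8 heap figures.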

Expected behavior

The mmap-object library should not use more memory than necessary when analyzing large files. It should be able to handle large files without requiring excessive amounts of memory.

Actual behavior

The mmap-object library is using significantly more memory than expected, with memory usage often two or three times the size of the analyzed file.


I believe this is a significant issue for anyone working with large files, and I hope it can be resolved soon.


monteiz commented Mar 11, 2023

Thanks to @sehe's suggestion, I realised that my case could be handled in a more specialised way.

So I decided to create a new package called shared-file-view. It differs from mmap-object in several ways; to begin with, it is read-only, while mmap-object has read/write capabilities.

Anyway, for the same file, my memory usage is now as follows:

Processed byte: 40 MiB
RSS: 47 MiB
HeapTotal: 4.8 MiB
HeapUsed: 3.0 MiB
External: 1.0 MiB
ArrayBuffers: 9.7 KiB
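
To put the two runs side by side, here is a small sketch that computes RSS relative to file size. The numbers are copied from the two log excerpts in this thread; `rssRatio` is a hypothetical helper name used only for illustration.

```javascript
// Figures in MiB, copied from the logs above.
const fileSizeMiB = 40;
const rssMmapObjectMiB = 91;
const rssSharedFileViewMiB = 47;

// Hypothetical helper: RSS as a multiple of the file size, one decimal place.
function rssRatio(rssMiB, sizeMiB) {
  return (rssMiB / sizeMiB).toFixed(1);
}

console.log(`mmap-object:      ${rssRatio(rssMmapObjectMiB, fileSizeMiB)}x file size`);
console.log(`shared-file-view: ${rssRatio(rssSharedFileViewMiB, fileSizeMiB)}x file size`);
```

For this file, that works out to roughly 2.3x with mmap-object versus roughly 1.2x with shared-file-view.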

Hope it helps someone. Thanks @allenluce for sharing this project with the community.
