Optional heap size limit would be very useful #2294

Open
triska opened this issue Jan 12, 2024 · 10 comments

Comments

@triska
Contributor

triska commented Jan 12, 2024

Prolog applications and test cases may cause Scryer Prolog to allocate a lot of memory to store terms on the WAM heap. Even more so now, while garbage collection is not yet implemented. When no more memory can be allocated, the system currently simply crashes instead of throwing Prolog exceptions that can be caught and reasoned about in programs (#16).

It is my understanding that "out of memory" conditions can currently not be reliably or easily caught due to shortcomings in Rust itself. However, it would already help tremendously if Scryer Prolog provided a way to dynamically configure an optional heap size limit, for example via a command line option or system-specific fact, as in:

$ scryer-prolog --heap-size=100_000

This example would make Scryer act as if only 100_000 bytes were available for the heap, and throw an exception when the heap grows beyond this size. It is not necessary to check this limit after every WAM instruction or every logical inference. It would suffice to check this limit every N inferences (N to be defined), which can be checked when the global inference count is incremented (see 54166b9). Adding such a check for the WAM heap seems much easier than catching the case of the true (as opposed to the virtual) machine running out of memory.
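A minimal sketch of the "check every N inferences" idea, not Scryer Prolog's actual code. The `Machine` struct, the `CHECK_INTERVAL` value, the 8-byte cell size and the `HeapLimitExceeded` error are all assumptions made for illustration; a real implementation would raise a Prolog resource error instead of returning a Rust value.

```rust
/// Hypothetical stand-in for the machine state; not Scryer Prolog's real types.
pub struct Machine {
    /// WAM heap, modelled here as a vector of 8-byte cells (an assumption).
    pub heap: Vec<u64>,
    /// Global inference counter.
    pub inference_count: u64,
    /// Optional limit in bytes, e.g. taken from a `--heap-size` option.
    pub heap_limit_bytes: Option<usize>,
}

/// The limit is only checked every this many inferences (N is arbitrary here).
pub const CHECK_INTERVAL: u64 = 1024;

/// Hypothetical error value; a real system would throw a Prolog exception.
#[derive(Debug, PartialEq)]
pub struct HeapLimitExceeded {
    pub used_bytes: usize,
    pub limit_bytes: usize,
}

impl Machine {
    pub fn new(heap_limit_bytes: Option<usize>) -> Self {
        Machine { heap: Vec::new(), inference_count: 0, heap_limit_bytes }
    }

    /// Called once per logical inference. The heap size is compared with
    /// the limit only when the counter reaches a multiple of CHECK_INTERVAL.
    pub fn increment_inference_count(&mut self) -> Result<(), HeapLimitExceeded> {
        self.inference_count += 1;
        if self.inference_count % CHECK_INTERVAL != 0 {
            return Ok(());
        }
        let used_bytes = self.heap.len() * std::mem::size_of::<u64>();
        match self.heap_limit_bytes {
            Some(limit_bytes) if used_bytes > limit_bytes => {
                Err(HeapLimitExceeded { used_bytes, limit_bytes })
            }
            _ => Ok(()),
        }
    }
}
```

With this design the heap can overshoot the limit by whatever is allocated between two checks, which is the trade-off for keeping the check off the hot path.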

In addition, there are currently terms that are not allocated on the heap, notably strings and large integers. Ideally, their size is also taken into account as long as they are not allocated on the heap. I hope that this additional complication may soon become irrelevant, when these data structures are also allocated on the heap for faster memory reclamation on backtracking and also other efficiency reasons.

If anyone is interested in this issue, please have a look. I would greatly appreciate your help with this! Thank you a lot!

@UWN

UWN commented Jan 12, 2024

There must be a way to do this in Rust when even ps and top know this.

@bakaq
Contributor

bakaq commented Jan 12, 2024

Wow, I was just working on something similar as a stepping stone to dealing with actual out of memory errors in the heap (#16). Currently the heap is just a Vec<HeapCellValue>, so we have access to try_reserve(), which lets us check for out of memory errors. However, most methods, like push(), implicitly allocate if needed. Ideally we would have something like try_push(). This is not in the standard library yet, but it's not that difficult to implement with try_reserve().
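A minimal sketch of such a try_push() built on Vec::try_reserve(). The free function `try_push` is a hypothetical helper written for illustration, not an existing standard-library or Scryer Prolog API.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: push `value` and report allocation failure as an
/// error instead of aborting the process.
pub fn try_push<T>(vec: &mut Vec<T>, value: T) -> Result<(), TryReserveError> {
    // Make sure there is room for at least one more element. On failure
    // this returns Err rather than invoking the aborting error handler.
    vec.try_reserve(1)?;
    // Capacity is now sufficient, so this push does not need to allocate.
    vec.push(value);
    Ok(())
}
```

Like push(), try_reserve() may reserve more than the requested capacity, so repeated calls keep amortized growth.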

For this and more reasons, I think that Heap should be a separate struct, not just a type alias for Vec<HeapCellValue>. This would add some complexity, but would also make it easier to add this "artificial limit" to the heap (just compare with the limit before trying try_reserve() in the relevant methods). However, I think this is a more "long term" approach, and just checking every N iterations seems to be much simpler.
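A rough sketch of what a separate Heap struct with an artificial limit could look like. `HeapCellValue` is only a placeholder here, and `HeapError`, `with_limit` and the cell-count limit are hypothetical names and choices, not the project's actual design.

```rust
use std::collections::TryReserveError;

/// Placeholder for Scryer Prolog's heap cell type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HeapCellValue(pub u64);

/// Hypothetical error type distinguishing the two failure modes.
#[derive(Debug)]
pub enum HeapError {
    /// The user-configured (artificial) limit was reached.
    LimitExceeded { limit_cells: usize },
    /// The real allocation failed.
    OutOfMemory(TryReserveError),
}

/// A heap that owns its storage instead of being a type alias for Vec.
pub struct Heap {
    cells: Vec<HeapCellValue>,
    limit_cells: Option<usize>,
}

impl Heap {
    pub fn with_limit(limit_cells: Option<usize>) -> Self {
        Heap { cells: Vec::new(), limit_cells }
    }

    pub fn len(&self) -> usize {
        self.cells.len()
    }

    /// Compare with the artificial limit first, then try the real allocation.
    pub fn try_push(&mut self, cell: HeapCellValue) -> Result<(), HeapError> {
        if let Some(limit) = self.limit_cells {
            if self.cells.len() >= limit {
                return Err(HeapError::LimitExceeded { limit_cells: limit });
            }
        }
        self.cells.try_reserve(1).map_err(HeapError::OutOfMemory)?;
        self.cells.push(cell);
        Ok(())
    }
}
```

Keeping the Vec private means every growth path has to go through a method that checks the limit, which is the main benefit over a type alias.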

@bakaq
Contributor

bakaq commented Jan 12, 2024

> There must be a way to do this in Rust when even ps and top know this.

We could just do what they do, which is to read /proc/<pid>/stat (we could actually use /proc/self/stat too). That is very simple and shows the memory reserved for the whole program, but it only works on Linux. Or we could use getrusage(), but that also seems dubiously portable. Either way, the limit would apply to the memory of the whole program, but we could put the sizes of the heap, stack and similar into the second argument of error/2 to help diagnose what caused the limit to be hit.
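A Linux-only sketch of the /proc/self/stat approach. The field positions (vsize as field 23, rss as field 24) follow the proc(5) manual page; the `ProcMemory` struct and both function names are hypothetical.

```rust
use std::fs;

/// Hypothetical result type: memory figures taken from /proc/<pid>/stat.
#[derive(Debug, PartialEq)]
pub struct ProcMemory {
    /// Virtual memory size in bytes (field 23).
    pub vsize_bytes: u64,
    /// Resident set size in pages (field 24).
    pub rss_pages: u64,
}

/// Parse the contents of a /proc/<pid>/stat file.
pub fn parse_proc_stat(stat: &str) -> Option<ProcMemory> {
    // Field 2 (comm) is parenthesised and may contain spaces, so only the
    // text after the last ')' is split on whitespace.
    let after_comm = &stat[stat.rfind(')')? + 1..];
    let fields: Vec<&str> = after_comm.split_whitespace().collect();
    // fields[0] is field 3 (state), so field 23 is at index 20.
    let vsize_bytes = fields.get(20)?.parse().ok()?;
    let rss_pages = fields.get(21)?.parse().ok()?;
    Some(ProcMemory { vsize_bytes, rss_pages })
}

/// Read the figures for the current process; returns None where
/// /proc/self/stat does not exist (for example, on non-Linux systems).
pub fn current_process_memory() -> Option<ProcMemory> {
    let stat = fs::read_to_string("/proc/self/stat").ok()?;
    parse_proc_stat(&stat)
}
```

Returning None when the file is missing lets callers fall back to another mechanism on other platforms.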

@UWN

UWN commented Jan 13, 2024

ulrich@gupu:/opt/gupu/scryer-prolog$ sh -c 'ulimit -v 100000; ./target/release/scryer-prolog -f'
?- use_module(library(lists)).
   true.
?- M is 10^7,length(L,M).
memory allocation of 67108864 bytes failed
Aborted (core dumped)

Who writes this message, "memory allocation of 67108864 bytes failed"? It seems this is Rust.

@bakaq
Contributor

bakaq commented Jan 13, 2024

> Who writes this message, "memory allocation of 67108864 bytes failed"? It seems this is Rust.

It's Rust's allocation error handler. Collections such as Vec call it when the global allocator fails, and by default it just shows this message and aborts instead of returning an error. Should we also abort after the resource error when we catch this case? I think that seems appropriate, because even the toplevel may not work if there is no more memory.

@triska
Contributor Author

triska commented Jan 13, 2024

Throwing an exception resets all heap allocations and therefore should return to the same heap state encountered when the query was initiated.

@triska
Contributor Author

triska commented Jan 13, 2024

To clarify: This item can be completely solved by reasoning about the virtual machine state of the WAM as implemented by Scryer Prolog. It is a matter of checking whether the WAM register H (top of the heap) is larger than a user-definable (integer) bound. This can be solved without checking, limiting or retracting any allocations of actual memory that Rust itself performs.
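A toy sketch of this idea, combined with the earlier point that throwing an exception resets the heap to its state at the start of the query. The `Wam` struct, `run_query` and `QueryOutcome` are hypothetical and model only the H register and the bound, nothing else of a real WAM.

```rust
/// Hypothetical outcome of running a query in this toy model.
#[derive(Debug, PartialEq)]
pub enum QueryOutcome {
    Success,
    /// Stands in for a Prolog resource error that programs could catch.
    ResourceError { h: usize, bound: usize },
}

/// Toy machine: only the heap and a user-definable bound on H.
pub struct Wam {
    heap: Vec<u64>,
    h_bound: Option<usize>,
}

impl Wam {
    pub fn new(h_bound: Option<usize>) -> Self {
        Wam { heap: Vec::new(), h_bound }
    }

    /// H, the top of the heap, counted in cells.
    pub fn h(&self) -> usize {
        self.heap.len()
    }

    /// Simulate a query that allocates `cells` heap cells one by one.
    pub fn run_query(&mut self, cells: usize) -> QueryOutcome {
        let h_at_start = self.h();
        for i in 0..cells {
            self.heap.push(i as u64);
            if let Some(bound) = self.h_bound {
                if self.h() > bound {
                    let h = self.h();
                    // "Throwing" discards everything the query allocated,
                    // returning to the heap state at query start.
                    self.heap.truncate(h_at_start);
                    return QueryOutcome::ResourceError { h, bound };
                }
            }
        }
        QueryOutcome::Success
    }
}
```

Nothing here inspects or limits the memory Rust actually allocates; the check is purely on the virtual machine's own state.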

The entire point of this issue is to obtain a Prolog exception so that "out-of-WAM-memory" conditions can be handled in Prolog applications. It is never acceptable for Scryer Prolog to crash.

@UWN

UWN commented Jan 13, 2024

The major source of overflows in Prolog is non-termination of completely correct predicates in certain use cases. So the code is not wrong, just the query is a bit too general. For example:

?- append([_|L],_,L).
memory allocation of 67108864 bytes failed
Aborted (core dumped)

Compare this to SICStus:

| ?- append([_|L],_,L).
! Resource error: insufficient memory
| ?- 

Which one makes a better impression?

@bakaq
Contributor

bakaq commented Jan 13, 2024

Would it also be good to have this limit by default, and maybe dynamic? Maybe we could check how much memory is available in the system once in a while and set the limit to 80 or 90% of that. Even just setting the default limit to something like 2GB would be helpful, I think. Most basic Prolog programs don't need that much memory, and if you need more you could just increase the limit. Either of these would give default behavior close to the SICStus example above.
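One way the "percentage of available memory" default could be computed on Linux, as a sketch only. Reading the MemAvailable line of /proc/meminfo is an assumption of this example (the comment above does not name a mechanism), and both function names are hypothetical.

```rust
/// Extract the MemAvailable value (in kB, as reported by the kernel) from
/// the contents of /proc/meminfo. Linux-specific; an assumption here.
pub fn parse_mem_available_kib(meminfo: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        // Lines look like "MemAvailable:    1000000 kB".
        let rest = line.strip_prefix("MemAvailable:")?;
        rest.split_whitespace().next()?.parse().ok()
    })
}

/// Hypothetical default: `percent` (0 to 100) of the available memory, in bytes.
pub fn default_heap_limit_bytes(mem_available_kib: u64, percent: u64) -> u64 {
    // Divide before multiplying by the percentage to avoid overflow.
    mem_available_kib.saturating_mul(1024) / 100 * percent
}
```

A caller would read /proc/meminfo with std::fs::read_to_string, pass the text to the parser, and fall back to a fixed default such as 2GB when the value is unavailable.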

@UWN

UWN commented Jan 13, 2024

Don't overthink it. I use ulimit -v for SICStus. No more options needed.
