Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: spatial: serialize concurrent calls to QHull #20619

Merged
merged 7 commits into from
May 28, 2024

Conversation

andfoy
Copy link
Contributor

@andfoy andfoy commented Apr 30, 2024

Reference issue

Closes gh-8856

What does this implement/fix?

This PR introduces a Mutex mechanism to serialize QHull calls, thus preventing potential segfaults or undefined behaviour, since it is not thread-safe. See http://www.qhull.org/html/qh-code.htm#reentrant

Additional information

Depends on #20611

@github-actions github-actions bot added scipy.spatial Cython Issues with the internal Cython code base labels Apr 30, 2024
@andfoy andfoy changed the title Lock qhull ENH: Serialize concurrent calls to QHull Apr 30, 2024
Copy link

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too bad about the code churn but putting the lock cleanup in a finally block does make sense.

scipy/spatial/_qhull.pyx Show resolved Hide resolved
scipy/spatial/_qhull.pyx Outdated Show resolved Hide resolved
@ngoldbaum
Copy link

Probably worth benchmarking singlethreaded performance since acquiring an uncontested PyThread_type_lock does have some overhead.

@rgommers
Copy link
Member

rgommers commented May 1, 2024

Great, looks like this fixes a long-standing crash in LinearNDInterpolator! The one failure is memory allocation in the 32-bit job:

numpy.core._exceptions._ArrayMemoryError: Unable to allocate 1.49 GiB for an array with shape (6250000, 32) and data type float64

Could you skip this test on 32-bit platforms and when not enough memory is available @andfoy? Search the code base for _IS_32BIT and check_free_memory to see the patterns for how.

Also, could you add @pytest.mark.slow? This test takes a while to run, so we don't need to do so in every CI job.

@rgommers rgommers added the free-threading Items related to supporting free-threaded (a.k.a. "no-GIL") builds of CPython label May 1, 2024
@andfoy
Copy link
Contributor Author

andfoy commented May 1, 2024

Thanks for the pointers! @ngoldbaum @rgommers @ev-br, I'll make sure to fix the tests and some minor lock acquisition details

@rgommers
Copy link
Member

rgommers commented May 3, 2024

The diff with whitespace hidden looks quite clean now. Probably about ready to go once the merge conflict is resolved?

@rgommers rgommers mentioned this pull request May 3, 2024
10 tasks
@lucascolley lucascolley changed the title ENH: Serialize concurrent calls to QHull ENH: spatial: serialize concurrent calls to QHull May 5, 2024
@andfoy andfoy marked this pull request as ready for review May 6, 2024 14:22
def test_threading(self):
# This test was taken from issue 8856
# https://github.com/scipy/scipy/issues/8856
check_free_memory(30000)
Copy link
Member

@rgommers rgommers May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you really mean to set the limit to 30 GB? That means it won't be run on machine with 32 GB RAM - i.e., almost never.

I seem to be needing about 10 GB total:

$ ulimit -Sv 10000000
$ python dev.py test -t scipy.interpolate.tests.test_interpnd::TestLinearNDInterpolation::test_threading -m full

It crashes with 8 GB on my machine. I guess that makes sense, since the 32-bit test showed that a single array creation needed about 1.5 GB, and there are four threads here. So the test needs 6 GB, and then there's the memory taken by SciPy itself, other tests, and dependencies.

Ideally I think we set the limit to 10 GB or so, and tweak the test a bit so that it requires 0.5 GB per thread. That way it'll run for most contributors.

Could you please check if the test still reproducibly crashes when you slightly lower the array sizes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check it again with lower bounds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the lowest values for which we can reliably reproduce the crash case imply a memory consumption of ~5GB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems fine - that'll mean it runs on most developer machines. Let's call this good.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in, seems like everything is working now. Thanks @andfoy!

@rgommers rgommers merged commit 69cd453 into scipy:main May 28, 2024
29 of 31 checks passed
@rgommers rgommers added this to the 1.14.0 milestone May 28, 2024
@andfoy andfoy deleted the lock_qhull branch May 28, 2024 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cython Issues with the internal Cython code base free-threading Items related to supporting free-threaded (a.k.a. "no-GIL") builds of CPython scipy.spatial
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LinearNDInterpolator not thread safe
5 participants