Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disabling the uop optimizer leads to crashes #127083

Open
brandtbucher opened this issue Nov 21, 2024 · 3 comments
Open

Disabling the uop optimizer leads to crashes #127083

brandtbucher opened this issue Nov 21, 2024 · 3 comments
Labels
3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-JIT type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@brandtbucher
Copy link
Member

brandtbucher commented Nov 21, 2024

Crash report

When running:

PYTHON_UOPS_OPTIMIZE=0 PYTHON_LLTRACE=1 ./python -m test test_multiprocessing_fork.test_processes -m test_free_from_gc

On a debug build, we get an assertion failure:

Using random seed: 2033438694
0:00:00 load avg: 1.57 Run 1 test sequentially in a single process
0:00:00 load avg: 1.57 [1/1] test_multiprocessing_fork.test_processes
Created a proto-trace for _path_join (<frozen importlib._bootstrap_external>:131) at byte offset 110 -- length 29
Created a proto-trace for _path_join (<frozen importlib._bootstrap_external>:131) at byte offset 116 -- length 13
Created a proto-trace for Heap._free_pending_blocks (/home/brandtbucher/cpython/Lib/multiprocessing/heap.py:258) at byte offset 126 -- length 482
Created a proto-trace for _TestHeap.test_free_from_gc (/home/brandtbucher/cpython/Lib/test/_test_multiprocessing.py:4114) at byte offset 486 -- length 27
Created a proto-trace for BufferWrapper.__init__ (/home/brandtbucher/cpython/Lib/multiprocessing/heap.py:326) at byte offset 0 -- length 516
python: Python/optimizer.c:1277: uop_optimize: Assertion `length <= UOP_MAX_TRACE_LENGTH' failed.
Fatal Python error: Aborted

Current thread 0x00007f6a0dc56740 (most recent call first):
  File "/home/brandtbucher/cpython/Lib/multiprocessing/heap.py", line 326 in __init__
  File "__init__", line ??? in __init__
  File "/home/brandtbucher/cpython/Lib/test/_test_multiprocessing.py", line 4130 in test_free_from_gc
  File "/home/brandtbucher/cpython/Lib/unittest/case.py", line 606 in _callTestMethod
  File "/home/brandtbucher/cpython/Lib/unittest/case.py", line 660 in run
Fatal Python error: Segmentation fault

Segmentation fault (core dumped)

On a non-debug build, we get a buffer overflow:

Using random seed: 1659198442
0:00:00 load avg: 13.83 Run 1 test sequentially in a single process
0:00:00 load avg: 13.83 [1/1] test_multiprocessing_fork.test_processes
*** stack smashing detected ***: terminated
Fatal Python error: Aborted

Current thread 0x00007f11f9a27740 (most recent call first):
  File "/home/brandtbucher/cpython/Lib/multiprocessing/heap.py", line 326 in __init__
  File "__init__", line ??? in __init__
  File "/home/brandtbucher/cpython/Lib/test/_test_multiprocessing.py", line 4130 in test_free_from_gc
  File "/home/brandtbucher/cpython/Lib/unittest/case.py", line 606 in _callTestMethod
  File "/home/brandtbucher/cpython/Lib/unittest/case.py", line 660 in run
  File Fatal Python error: Segmentation fault

Segmentation fault (core dumped)

It looks like the optimizer just happens to do enough useful work that adding error and exit stubs doesn't put us over UOP_MAX_TRACE_LENGTH anymore. My hunch is that the real bug is in translate_bytecode_to_trace, where we're not reserving enough space for some instruction(s).

CC @Fidget-Spinner and @markshannon.

@brandtbucher brandtbucher added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump 3.14 new features, bugs and security fixes topic-JIT labels Nov 21, 2024
@JeffersGlass
Copy link
Contributor

For what it's worth, there seem to be a number of tests which cause this failure. Below is a list of tests, and the final UOp that fails to be added, with the message No room for (UOP) (need X, got Y).

Note that a few of the tests don't seem to show this kind of error, but instead underflow the trace stack.

Test UOp Failure
test.test_multiprocessing_fork.test_processes LOAD_FAST (81)
test.test_multiprocessing_forkserver.test_processes [track stack underflow]
test.test_multiprocessing_spawn.test_processes _EXIT_TRACE (300)
test_buffer _LOAD_CONST_IMMORTAL (207)
test_complex COMPARE_OP_FLOAT(185)
test_configparser LOAD_ATTR_METHOD_WITH_VALUES (200)
test_datetime STORE_FAST (107)
test_decimal [track stack underflow]
test_itertools _POP_TOP (29)
test_long BINARY_OP (42)
test_math CALL_KW_BOUND_METHOD (170)
test_plistlib LOAD_FAST (81)
test_pow LOAD_ATTR_METHOD_NO_DICT (199)
test_slice _TO_BOOL_BOOL (220)
test_struct CALL_METHOD_DESCRIPTOR_FAST (175)
test_tarfile _LOAD_CONST (79)
test_venv [trace stack underflow]
test_zipfile POP_JUMP_IF_TRUE (98)
test_zipimport [trace stack underflow]
test_zipimport_support CALL_KW_BOUND_METHOD (170)
test_zoneinfo CALL_PY_EXACT_ARGS (180)

@JeffersGlass
Copy link
Contributor

JeffersGlass commented Nov 21, 2024

Here's a simpler (if slightly cursed) reproducer:

# foo.py

def foo():
    # Padding to increase number of UOps in trace
    _ = reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(
            reversed(list(reversed(list(reversed(list(reversed(list(reversed(list(         
                [0]
            ))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
            ))))))))))))))))))))))))))))))))))))))))

for _ in range(4096): #Minimum value
    foo()
./configure --enable-experimental-jit --with-pydebug
make
PYTHON_UOPS_OPTIMIZE=0 PYTHON_LLTRACE=1 ./python foo.py
Created a proto-trace for <module> (/home/jglass/Documents/cpython/foo.py:1) at byte offset 44 -- length 590
python: Python/optimizer.c:1287: uop_optimize: Assertion `length <= UOP_MAX_TRACE_LENGTH' failed.
Aborted (core dumped)

@brandtbucher
Copy link
Member Author

Thanks @JeffersGlass!

I talked with @markshannon, and we think it's probably worth reworking translate_bytecode_to_trace to be less error-prone. The problem is that we need to reserve enough space for the next bytecode's uop expansion before translating it, which is difficult to get right and difficult to check. Rather, we should be able to gracefully handle the case where we run out of space in the middle of translating an instruction, by reverting to a known safepoint (the last complete expansion) and ending the trace there.

Here's some pseudocode to capture the general idea:

buffer = [None] * MAX_TRACE_LENGTH
space_remaining = MAX_TRACE_LENGTH
last_complete = 0
index = 0

def add_uop(uop):
    if space_remaining <= 0:
        raise OutOfSpace
    buffer[index] = uop
    space_remaining -= 1
    index += 1

# Save room for final _CHECK_VALIDITY_AND_SET_IP + _EXIT_TRACE:
space_remaining -= 2
add_uop(_START_EXECUTOR)
add_uop(_MAKE_WARM)
try:
    for instruction in bytecode:
        if has_error(instruction):
            # Save room for error stub:
            space_remaining -= 1
        if has_exit(instruction):
            # Save room for exit stub:
            space_remaining -= 1
        add_uop(_CHECK_VALIDITY_AND_SET_IP)
        for uop in uop_expansion(instruction):
            add_uop(uop)
        # Successfully translated the entire instruction:
        last_complete = index - 1
except OutOfSpace:
    index = last_complete
add_uop(_CHECK_VALIDITY_AND_SET_IP)
add_uop(_EXIT_TRACE)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-JIT type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

2 participants