DISABLED test_n_threads (__main__.TestOpenMP_ParallelFor) #125364

Open
pytorch-bot bot opened this issue May 2, 2024 · 2 comments
Labels
  • high priority
  • module: flaky-tests (Problem is a flaky test in CI)
  • module: unknown (We do not know who is responsible for this feature, bug, or test case.)
  • oncall: pt2
  • skipped (Denotes a (flaky) test currently skipped in CI.)
  • triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

pytorch-bot bot commented May 2, 2024

Platforms: dynamo

This test was disabled because it is failing in CI. See recent examples and the most recent trunk workflow logs.

Over the past 3 hours, it has been determined flaky in 6 workflow(s) with 6 failures and 6 successes.

Debugging instructions (after clicking on the recent samples link):
DO NOT ASSUME THINGS ARE OKAY IF THE CI IS GREEN. We now shield flaky tests from developers, so CI will be green, but the logs will be harder to parse.
To find relevant log snippets:

  1. Click on the workflow logs linked above
  2. Click on the Test step of the job so that it is expanded. Otherwise, the grepping will not work.
  3. Grep for test_n_threads
  4. There should be several instances run (as flaky tests are rerun in CI) from which you can study the logs.
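
For a log saved locally, the grep step above can be approximated with a short script. This is only a sketch: the helper name `grep_log` and the `context` parameter are made up for illustration and are not part of any PyTorch tooling.

```python
from typing import Iterator, Tuple


def grep_log(
    log_text: str, pattern: str = "test_n_threads", context: int = 2
) -> Iterator[Tuple[int, str]]:
    """Yield (1-based line number, snippet) for every log line containing pattern.

    The snippet includes up to `context` lines before and after the match,
    which helps tell apart the separate reruns of a flaky test.
    """
    lines = log_text.splitlines()
    for idx, line in enumerate(lines):
        if pattern in line:
            start = max(0, idx - context)
            end = min(len(lines), idx + context + 1)
            yield idx + 1, "\n".join(lines[start:end])
```

To use it, read the saved Test step log into a string (for example with `pathlib.Path(...).read_text()`) and iterate over `grep_log(text)`; each hit corresponds to one place where the test name appears.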
Sample error message
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 173, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 515, in transform
    tracer.run()
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2230, in run
    super().run()
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 880, in run
    while self.step():
          ^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 795, in step
    self.dispatch_table[inst.opcode](self, inst)
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 552, in wrapper
    speculation.fail_and_restart_analysis()
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 146, in fail_and_restart_analysis
    raise exc.SpeculationRestartAnalysis(restart_reason=restart_reason)
torch._dynamo.exc.SpeculationRestartAnalysis

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/jenkins/workspace/test/test_openmp.py", line 62, in test_n_threads
    def test_n_threads(self):
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 979, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 820, in _convert_frame
    result = inner_convert(
             ^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 411, in _convert_frame_assert
    return _compile(
           ^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_utils_internal.py", line 70, in wrapper_function
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 701, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 273, in time_wrapper
    r = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 568, in compile_inner
    out_code = transform_code_object(code, transform)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1116, in transform_code_object
    transformations(instructions, code_options)
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 190, in _fn
    guards.check()
AssertionError: Global num_threads state changed while dynamo tracing, please report a bug


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True


To execute this test, run the following from the base repo dir:
    PYTORCH_TEST_WITH_DYNAMO=1 python test/test_openmp.py -k test_n_threads

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

Test file path: test_openmp.py
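
The traceback ends in an assertion raised from a `guards.check()` call inside a wrapper in `convert_frame.py`. The general pattern behind such a check (snapshot some global state before running a function, then assert it is unchanged afterwards) can be sketched without torch. Everything below is a toy stand-in: `_num_threads`, `GlobalStateSnapshot`, and `assert_global_state_unchanged` are hypothetical names, not Dynamo's actual implementation, and the failing function assumes (as the test name suggests) a body that changes the thread count.

```python
import functools

# Toy process-wide setting standing in for a global thread count.
# This is NOT torch's real state, only an illustration.
_num_threads = 4


def set_num_threads(n: int) -> None:
    global _num_threads
    _num_threads = n


def get_num_threads() -> int:
    return _num_threads


class GlobalStateSnapshot:
    """Records the toy global at construction time."""

    def __init__(self) -> None:
        self._saved = get_num_threads()

    def check(self) -> bool:
        # True only if the global still matches the recorded value.
        return get_num_threads() == self._saved


def assert_global_state_unchanged(fn):
    """Hypothetical wrapper: fail if fn leaves the global different from how it found it."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        snapshot = GlobalStateSnapshot()
        result = fn(*args, **kwargs)
        assert snapshot.check(), "Global num_threads state changed while tracing"
        return result

    return wrapper


@assert_global_state_unchanged
def well_behaved() -> int:
    # Only reads the global, so the check passes.
    return get_num_threads()


@assert_global_state_unchanged
def changes_thread_count() -> None:
    # Assumed shape of the failing case: the wrapped body itself sets the thread count.
    set_num_threads(1)
```

Calling `well_behaved()` returns normally, while `changes_thread_count()` raises `AssertionError`. Under this reading, a check of this kind cannot distinguish a bug in the tracer from a traced function that legitimately modifies the global it guards.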

cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @clee2000 @bdhirsh @anijain2305 @chauhang

pytorch-bot bot added the module: flaky-tests, module: unknown, oncall: pt2, and skipped labels May 2, 2024

pytorch-bot bot commented May 2, 2024

Hello there! From the DISABLED prefix in this issue title, it looks like you are attempting to disable a test in PyTorch CI. The information I have parsed is below:
  • Test name: test_n_threads (__main__.TestOpenMP_ParallelFor)
  • Platforms for which to skip the test: dynamo
  • Disabled by pytorch-bot[bot]

Within ~15 minutes, test_n_threads (__main__.TestOpenMP_ParallelFor) will be disabled in PyTorch CI for these platforms: dynamo. Please verify that your test name looks correct, e.g., test_cuda_assert_async (__main__.TestCuda).

To modify the platforms list, please include a line in the issue body, like below. The default action will disable the test for all platforms if no platforms list is specified.

Platforms: case-insensitive, list, of, platforms

We currently support the following platforms: asan, dynamo, inductor, linux, mac, macos, rocm, slow, win, windows.
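
As an illustration of the format described above, a `Platforms:` line could be parsed roughly as follows. This is a sketch, not the bot's real parser: the function name `parse_platforms`, the case-insensitive match on the `Platforms` keyword itself, and the choice to raise on unknown names are all assumptions.

```python
from typing import List, Optional

# Platform names taken from the list in the bot's message.
SUPPORTED_PLATFORMS = {
    "asan", "dynamo", "inductor", "linux", "mac",
    "macos", "rocm", "slow", "win", "windows",
}


def parse_platforms(issue_body: str) -> Optional[List[str]]:
    """Return the lowercased platform names from the first 'Platforms:' line.

    Returns None when no platforms are listed, which corresponds to the
    default of disabling the test on all platforms.
    """
    for line in issue_body.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "platforms":
            names = [p.strip().lower() for p in value.split(",") if p.strip()]
            unknown = [p for p in names if p not in SUPPORTED_PLATFORMS]
            if unknown:
                # Assumed behavior: reject names outside the supported list.
                raise ValueError(f"unsupported platforms: {unknown}")
            return names or None
    return None
```

For this issue, `parse_platforms("Platforms: dynamo")` yields `["dynamo"]`, matching the "Platforms for which to skip the test" entry reported by the bot.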

ezyang added the triaged label and removed the triage review label May 15, 2024

ezyang commented May 15, 2024

This consistently fails for me. @jansel, you added this assert, PTAL

jansel added a commit that referenced this issue May 18, 2024
Fixes #125364

ghstack-source-id: 22c5609845479767eaa1d3ec8d991cc0ea78e2fe
Pull Request resolved: #126623
jansel added a commit that referenced this issue May 19, 2024
Fixes #125364

ghstack-source-id: 95c1cbdaaec386904f785f394fe86a7aa163170a
Pull Request resolved: #126623