[CPP extention] Baton lock is called regardless the code version #125404

daniil-lyakhov
Contributor

@daniil-lyakhov daniil-lyakhov commented May 2, 2024

Greetings!

Fixes #125403

Please help me with testing, since my reproducer may miss the error in the code. To check that the file lock actually works, several (at least two) threads must enter the same part of the code at the same time.

cc @jbschlosser
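
For readers who want to see the testing condition described above, here is a minimal sketch of how several threads can be forced into the same lock-protected section at once. It uses only the Python standard library. `SimpleBaton`, `run_contended_build`, and the `artifact.txt` file are hypothetical names made up for this illustration; this is not the PyTorch implementation and not the reproducer from the linked issue. The sketch assumes that the key property is that every thread goes through the lock, whether or not it ends up building anything.

```python
import os
import tempfile
import threading
import time


class SimpleBaton:
    """Hypothetical file-based baton for illustration; not PyTorch's class."""

    def __init__(self, lock_path, poll_seconds=0.01):
        self.lock_path = lock_path
        self.poll_seconds = poll_seconds
        self.fd = None

    def try_acquire(self):
        try:
            # O_EXCL makes file creation atomic, so only one caller succeeds.
            self.fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL)
            return True
        except FileExistsError:
            return False

    def wait(self):
        # Poll until the current holder removes the lock file.
        while os.path.exists(self.lock_path):
            time.sleep(self.poll_seconds)

    def release(self):
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None
        os.remove(self.lock_path)


def run_contended_build(num_threads=4):
    builds = []
    loaded = []
    record_lock = threading.Lock()
    # The barrier releases all threads together so they contend for the lock.
    barrier = threading.Barrier(num_threads)

    with tempfile.TemporaryDirectory() as build_dir:
        lock_path = os.path.join(build_dir, "lock")
        artifact = os.path.join(build_dir, "artifact.txt")

        def worker():
            barrier.wait()
            baton = SimpleBaton(lock_path)
            # Every thread goes through the baton, even if nothing needs building.
            if baton.try_acquire():
                try:
                    if not os.path.exists(artifact):
                        time.sleep(0.05)  # stand-in for a slow compile step
                        with open(artifact, "w") as f:
                            f.write("built")
                        with record_lock:
                            builds.append(threading.current_thread().name)
                finally:
                    baton.release()
            else:
                baton.wait()
            # By now some lock holder has finished writing the artifact.
            with open(artifact) as f:
                content = f.read()
            with record_lock:
                loaded.append(content)

        threads = [threading.Thread(target=worker) for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return len(builds), loaded
```

Calling `run_contended_build(4)` returns the number of threads that performed the build and the contents each thread read back. If a thread skipped the baton and read the artifact directly, it could observe a missing or half-written file, which is the kind of failure a reproducer would need to trigger.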

pytorch-bot bot commented May 2, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125404

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ab9d264 with merge base 8046de3:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla bot commented May 2, 2024

CLA Signed

  • ✅login: daniil-lyakhov / (ab9d264)

The committers listed above are authorized under a signed CLA.

@daniil-lyakhov daniil-lyakhov changed the title from "[CPP extention] Baton lock is called when regardless code version" to "[CPP extention] Baton lock is called regardless the code version" on May 2, 2024
@daniil-lyakhov daniil-lyakhov force-pushed the dl/fix_dataparallel_extension_building branch from a555097 to ab9d264 Compare May 2, 2024 16:45
@drisspg drisspg added the labels module: cpp (Related to C++ API), module: multithreading (Related to issues that occur when running on multiple CPU threads), module: ddp (Issues/PRs related distributed data parallel training), and triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module) on May 3, 2024
@ezyang
Contributor

ezyang commented May 3, 2024

Is it possible to put your reproducer in the test suite?

@ezyang
Contributor

ezyang commented May 3, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 3, 2024
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

@ezyang ezyang added the labels release notes: cpp (release notes category) and topic: bug fixes (topic category) on May 3, 2024
@ezyang
Contributor

ezyang commented May 3, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@daniil-lyakhov
Contributor Author

> Is it possible to put your reproducer in the test suite?

Hello there! Should I open a separate PR with the test?

Labels
ciflow/trunk (Trigger trunk jobs on your pull request); Merged; module: cpp (Related to C++ API); module: ddp (Issues/PRs related distributed data parallel training); module: multithreading (Related to issues that occur when running on multiple CPU threads); open source; release notes: cpp (release notes category); topic: bug fixes (topic category); triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
Development

Successfully merging this pull request may close these issues.

Could not jit compile custom extension in dataparallel mode
5 participants