Fine-tuning fails after installation from source #393
Comments
@devon-research Can you please install from source? It works on my end. BTW, we did some refactoring recently, so it would be great to pull the latest first. We are planning a release soon. The HSDP/device_mesh support was added recently and is not present in the binaries yet.
Hi! It seems that a solution has been provided for this issue and there has not been a follow-up conversation for a long time. I will close this issue for now; feel free to reopen it if you have any questions!
System Info
I use the base Docker image pytorch/pytorch. I then run
Information
Code to reproduce the bug
Error logs
Other notes
Note that running `python -c "import torch; print(torch.__version__)"` yields `2.1.2+cu118`. Furthermore, the output of `pip install --extra-index-url https://download.pytorch.org/whl/test/cu118 -e llama-recipes[tests,auditnlg,vllm]` involves uninstalling the latest PyTorch version (2.2.1) from the base image and installing an older version.

My understanding from the relevant PyTorch release notes is that the `device_mesh` abstraction (which is the cause of the original error above) was introduced into `torch.distributed` only in PyTorch 2.2. However, the `requirements.txt` here in `llama-recipes` only specifies `torch>=2.0.1`.
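Since the mismatch comes down to comparing an installed torch version against the 2.2 cutoff, the check can be sketched with the standard library alone (`parse_version` and `supports_device_mesh` below are illustrative helpers, not part of llama-recipes or PyTorch):

```python
def parse_version(v: str) -> tuple:
    # Drop a local-version suffix such as "+cu118"; keep the numeric parts only
    base = v.split("+")[0]
    return tuple(int(p) for p in base.split(".")[:3])

def supports_device_mesh(torch_version: str) -> bool:
    # torch.distributed's device_mesh abstraction first shipped in PyTorch 2.2
    return parse_version(torch_version) >= (2, 2, 0)

print(supports_device_mesh("2.1.2+cu118"))  # -> False: the version pip resolved here
print(supports_device_mesh("2.2.1"))        # -> True: the version the base image had
```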
Unfortunately, simply changing the requirement to `torch>=2.2` results in an error when installing `llama-recipes`:

This error does not occur if the only change I make is to revert `2.2` to `2.0.1` in the `requirements.txt` file.

Workaround
A workaround is to simply run
after installing `llama-recipes`.
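After applying the workaround, a quick sanity check confirms whether the reinstalled torch actually exposes the missing abstraction; this sketch merely probes the import and does not assume torch is installed:

```python
# Probe for the PyTorch >= 2.2 device_mesh module without crashing on older installs
try:
    from torch.distributed.device_mesh import init_device_mesh  # noqa: F401
    status = "device_mesh available"
except ImportError:
    status = "device_mesh missing; reinstall torch>=2.2"
print(status)
```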