Access to .npy datasets #1
Comments
Hello preritt,

Thanks for your interest in the benchmark. If you would like to download the entire benchmark at once to access the raw .npy files, they are available at the following GCP bucket: https://github.com/rail-berkeley/design-bench/blob/new-api/design_bench/disk_resource.py#L7

This post may be of interest if you are not familiar with gsutil: https://stackoverflow.com/questions/58581873/how-to-download-an-entire-bucket-in-gcp

Generally speaking, the dataset files are downloaded as needed from GCP when design_bench.make is called. Could you share the full script producing the error, and the full stack trace?

Warm regards, |
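For anyone unsure how to collect the full stack trace requested above, here is a minimal sketch using only the standard library. The helper `run_and_capture` and the stand-in `failing_loader` are hypothetical names introduced for illustration; in practice the wrapped callable would be whatever call fails, such as `design_bench.make`.

```python
import traceback


def run_and_capture(fn, *args, **kwargs):
    """Call fn; return (result, None) on success or (None, trace_text) on failure."""
    try:
        return fn(*args, **kwargs), None
    except Exception:
        # format_exc() returns the same text Python would print for the
        # exception currently being handled, including every stack frame.
        return None, traceback.format_exc()


def failing_loader(path):
    # Hypothetical stand-in for the real failing call; it only mimics
    # the kind of error reported in this issue.
    raise FileNotFoundError(2, "No such file or directory", path)


result, trace = run_and_capture(failing_loader, "chembl-y-2.npy")
print(trace)  # paste this text into the issue
```

Pasting the printed text together with the script that produced it usually gives maintainers enough to reproduce the problem.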
Hi Brandon, Sorry for the delayed response.
This is the error
I'll try the GCP method and get back to you if I run into an error. Thank you so much for your response! |
Could you try calling https://github.com/rail-berkeley/design-bench/blob/new-api/design_bench/__init__.py#L809 For example, |
I tried the following:
|
Could you check which version number of the benchmark you have installed? |
It is 2.0.12
|
The latest version of the benchmark is 2.0.20. Could you try that version? |
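A minimal sketch of one way to check the installed version from Python, using the standard library's `importlib.metadata`. It assumes the distribution is registered under the name `design-bench` and that version strings are purely numeric and dot-separated, as the two versions mentioned in this thread are; the helper names are hypothetical.

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.0.12' into (2, 0, 12) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_outdated(installed, required="2.0.20"):
    """Return True if the installed version is older than the required one."""
    return parse_version(installed) < parse_version(required)


def installed_version(package="design-bench"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("design-bench does not appear to be installed")
elif is_outdated(version):
    print(f"design-bench {version} is older than 2.0.20; consider upgrading")
else:
    print(f"design-bench {version} is recent enough")
```

Comparing tuples of integers avoids ordering mistakes that plain string comparison can make, for example treating "2.0.9" as newer than "2.0.12".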
I have the correct version now:
Not sure why, but now I get an import error when using:
|
Ah, this can happen if an incompatible version of deepchem is installed. Can you try installing the version of deepchem listed here: https://github.com/brandontrabucco/design-baselines/blob/master/requirements.txt#L29 I'm not sure if that's the only package that may need an update, so perhaps check the whole requirements file. |
Thanks a lot! I did a pip install on the requirements and it resolved the issue. |
Hi,
Thank you for releasing the package!
I wanted to check the procedure to access the offline datasets. It seems these are not part of the repo. I am not sure if I am missing something.
For example, I get the following error when using
task = design_bench.make('ChEMBL-ResNet-v0')
FileNotFoundError: [Errno 2] No such file or directory:
/chembl-GI50-CHEMBL1964047/chembl-y-2.npy'
Thank you!