S3 dataset access #243

Closed
robmarkcole opened this issue May 2, 2024 · 3 comments

Comments

@robmarkcole

Hi
I understand the dataset can be streamed from S3. Following the example in the docs I get an error, so I assume access must be granted?

 > aws s3 ls s3://clay-tiles-02/02/27WXN/

An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
@brunosan
Member

brunosan commented May 7, 2024

Hi Rob!!
I think the right move here is to copy a representative sample of embeddings to source.coop.

I don't think it makes sense to publicly host a copy of the whole training set when it is just a cropped selection of data that is already available. E.g. for v1 we have 50M chips, and we are moving towards streaming from source COGs into the GPUs during training anyway: https://github.com/Clay-foundation/stacchip

In the meantime I've just activated requester pays on this bucket.
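
For anyone scripting this rather than using the CLI, a requester-pays listing from Python might look like the sketch below. It assumes boto3 is installed and that signed AWS credentials are configured, since requester-pays buckets reject anonymous requests. The helper name `list_keys` is made up for illustration; the bucket and prefix are the ones from this thread. Note that requester pays only changes who is billed: the bucket policy still has to allow the calling account to list and read objects.

```python
from typing import Any, Iterator

BUCKET = "clay-tiles-02"  # bucket name taken from this thread
PREFIX = "02/27WXN/"      # prefix taken from this thread


def list_keys(client: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield object keys under a prefix, opting in to requester-pays charges.

    `client` is expected to behave like boto3.client("s3").
    """
    paginator = client.get_paginator("list_objects_v2")
    pages = paginator.paginate(
        Bucket=bucket,
        Prefix=prefix,
        # Plays the same role as `--request-payer requester` in the AWS CLI
        RequestPayer="requester",
    )
    for page in pages:
        # "Contents" is absent when a page holds no objects
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage (requires network access and credentials, so left commented out):
#   import boto3
#   s3 = boto3.client("s3")
#   for key in list_keys(s3, BUCKET, PREFIX):
#       print(key)
```

If the call still fails with `AccessDenied`, the request is being rejected on permissions and not on the missing requester-pays header.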

@brunosan brunosan closed this as completed May 7, 2024
@robmarkcole
Author

@brunosan I get an error:

⚡ ~/Clay-Foundation-Model aws s3 ls s3://clay-tiles-02/02/27WXN/ --request-payer requester

An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

@brunosan brunosan reopened this May 8, 2024
@yellowcap
Member

@robmarkcole for Clay v1 we do not recommend using these datacubes anymore. The input can be generated much more flexibly and adapted to the use case, as described in the following tutorial:

https://clay-foundation.github.io/model/tutorials/clay-v1-wall-to-wall.html

Please let us know if we can help you with testing Clay v1. We are happy to advise on data preparation for your use case if you have questions!
