Bounding Box Detection #123

Open
batu opened this issue Dec 19, 2024 · 0 comments

Comments

batu commented Dec 19, 2024

Hello!

After seeing that Sonnet is trained for computer use (with exact pixel coordinates), I tried using it for bounding-box detection, both open-vocabulary with a text prompt and few-shot with example images. However, my results have been worse than I expected given Claude's performance on computer use. I tried to follow the best practices outlined in this repo.

My questions are:

  1. Can you share which coordinate normalization and origin location Claude was trained on for computer use, so I can use the same setup?
  2. Are there any bounding-box grounding suggestions I should try beyond what is covered in the cookbooks?
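
To make question 1 concrete, here is a minimal sketch of the conventions that could differ between setups: pixel versus [0, 1]-normalized coordinates, and a top-left versus bottom-left origin. The helper names and the `(x_min, y_min, x_max, y_max)` box layout are hypothetical, chosen only for illustration; nothing here is confirmed to match how Claude was trained.

```python
from typing import Tuple

# Hypothetical box layout for illustration: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def normalized_to_pixels(box: Box, width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a [0, 1]-normalized box to integer pixel coordinates."""
    x0, y0, x1, y1 = box
    return (round(x0 * width), round(y0 * height),
            round(x1 * width), round(y1 * height))


def pixels_to_normalized(box: Box, width: int, height: int) -> Box:
    """Scale a pixel-coordinate box to the [0, 1] range."""
    x0, y0, x1, y1 = box
    return (x0 / width, y0 / height, x1 / width, y1 / height)


def flip_vertical_origin(box: Box, height: float) -> Box:
    """Convert between top-left and bottom-left origins (the flip is its own inverse)."""
    x0, y0, x1, y1 = box
    # Flipping swaps which edge is the minimum, so y_min and y_max trade places.
    return (x0, height - y1, x1, height - y0)
```

A mismatch in either convention shifts or scales every predicted box, so knowing the expected one matters before comparing results.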

Thank you very much!
