Run against SWE-Bench #1062

Closed
batwood001 opened this issue Mar 13, 2024 · 1 comment
Labels
duplicate This issue or pull request already exists enhancement New feature or request

Comments

@batwood001

Feature description

Devin, billed as the "first AI software engineer", cites its SWE-Bench performance as primary evidence of its capability. This issue proposes running gpt-engineer against SWE-Bench as well.

Motivation/Application

Provide a benchmark to evaluate Devin's claims, and solidify gpt-engineer's reputation as a legitimate autonomous coding agent.

@batwood001 added the enhancement and triage labels Mar 13, 2024
@ErikBjare
Collaborator

I also had the urge to implement this after the Devin announcement 😄

But there is already an issue for it: #913

Closing as duplicate, continuing the discussion in the above issue.

@ErikBjare closed this as not planned Mar 13, 2024
@ErikBjare added the duplicate label and removed the triage label Mar 13, 2024