AutoGen meets SWE_bench #2933

Closed
wants to merge 11 commits into from

Collaborator

@skzhang1 skzhang1 commented Jun 13, 2024

Why are these changes needed?

This PR provides a basic implementation for the SWE_bench benchmark (https://www.swebench.com). It is still a work in progress and not ready for review.

  • Specify package prerequisites
  • Documentation (notebook, blog post)
  • Clean up the code

🌹 Acknowledgement: The code is largely based on swe-agent.

@Hk669, please add more if you have other suggestions. @Hk669 will take primary responsibility for this work, building on this draft PR.

Related issue number

N/A

Checks

@skzhang1 skzhang1 marked this pull request as draft June 13, 2024 06:05

gitguardian bot commented Jul 20, 2024

✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@codecov-commenter

codecov-commenter commented Aug 18, 2024

Codecov Report

Attention: Patch coverage is 0% with 284 lines in your changes missing coverage. Please review.

Project coverage is 19.97%. Comparing base (6279247) to head (03d1afa).
Report is 91 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| autogen/agentchat/contrib/swebench_agent.py | 0.00% | 263 Missing ⚠️ |
| autogen/agentchat/contrib/swebench_utils.py | 0.00% | 21 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2933       +/-   ##
===========================================
- Coverage   32.90%   19.97%   -12.94%     
===========================================
  Files          94       97        +3     
  Lines       10235    10953      +718     
  Branches     2193     2509      +316     
===========================================
- Hits         3368     2188     -1180     
- Misses       6580     8614     +2034     
+ Partials      287      151      -136     

| Flag | Coverage Δ |
|---|---|
| unittests | 19.93% <0.00%> (-12.98%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.


@ekzhu
Collaborator

ekzhu commented Oct 2, 2024

@skzhang1 would you like to continue working on this draft PR?

@jackgerrits jackgerrits added the 0.2 Issues which are related to the pre 0.4 codebase label Oct 4, 2024
@skzhang1
Collaborator Author

skzhang1 commented Oct 5, 2024

@ekzhu Hi, I may not be able to work on this in the near future. This PR provides a basic implementation, and it works. I hope other people in the community can build on it. @Hk669

@ekzhu
Collaborator

ekzhu commented Oct 5, 2024

Thanks. Do you think we can close this and revisit the idea once the new version is merged? It can be part of agbench.

@Hk669
Contributor

Hk669 commented Oct 6, 2024

> Thanks. Do you think we can close this and revisit the idea once the new version is merged? It can be part of agbench

Sure @ekzhu, we can revisit this after the new version. agbench

@ekzhu ekzhu closed this Oct 6, 2024
Labels
0.2 Issues which are related to the pre 0.4 codebase
5 participants