Pull requests: openai/evals

Pull requests list

[eval] Add IMO problems with exact answers
#1528 opened May 15, 2024 by justinlinw
13 tasks done
Dependabot configuration to update actions in workflows
#1526 opened May 1, 2024 by ScottBrenner
3 tasks done
show evals in wandb weave
#1522 opened Apr 19, 2024 by yogeshg Draft
13 tasks
Added Quran Eval & Simple Fact Model-Graded Definition
#1511 opened Apr 1, 2024 by sakher
13 tasks done
Add Classification Rule Articulation Eval
#1510 opened Mar 30, 2024 by danesherbs
13 tasks done
eval pattern-concat-logic
#1508 opened Mar 28, 2024 by natanaelwf
13 tasks done
Fix specifying API arguments from the CLI
#1505 opened Mar 27, 2024 by LoryPack
6 tasks done
[Evals] Add eval for Dhivehi diacritical marks
#1495 opened Mar 16, 2024 by aanaseer
11 of 12 tasks
Add **kwargs to OpenAIChatCompletionFn
#1494 opened Mar 15, 2024 by ezraporter
add a new eval:needle_in_a_matrix
#1475 opened Mar 11, 2024 by gordbegli
13 tasks done
Extending to Azure OpenAI implementation
#1470 opened Feb 23, 2024 by pkt1583
Adding Indian Women Menstrual Health Chatbot Eval
#1430 opened Dec 11, 2023 by cranberrydeveloper
13 tasks done
Choose completion function for evaluation of modelgraded evals
#1418 opened Nov 17, 2023 by LoryPack
6 tasks done
Add Eval: name well known security weaknesses
#1392 opened Oct 28, 2023 by ourmony
1 task
Deepcopy in recorder
#1376 opened Oct 12, 2023 by johny-b
Add a new eval : chinese_literary_grace
#1375 opened Oct 7, 2023 by Conghui-Niu
12 of 13 tasks
Add gpt4facts Eval
#1363 opened Sep 25, 2023 by mmtmn
13 tasks done
Add Eval: Interpreting balance sheet absolute changes
#1336 opened Aug 16, 2023 by TensorTemplar
12 of 13 tasks
MMLU eval
#1324 opened Jul 29, 2023 by Livegan
13 tasks
Nits
#1308 opened Jul 7, 2023 by mrzu Draft
5 of 6 tasks
add eval against machiavellianistic attitudes
#1270 opened Jul 1, 2023 by Huge
Now I have the change in place, it seems wrong.
#1209 opened Jun 21, 2023 by CholoTook