It seems that GPT-4 fails to follow the instructions in the closedqa prompt far more often than gpt-3.5-turbo. See, for example, #1200 (comment), where gpt-4 gives 9 invalid responses out of 47 while gpt-3.5-turbo gives none. Does this hold across the other evals in the repo?
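One way to check whether this holds across the other evals would be to tally, for each eval's samples, how many model responses fall outside the allowed answer set. A minimal sketch, assuming JSONL-style records with a `sampled` field holding the model's raw answer (the field name and record shape are assumptions for illustration, not the exact evals log schema):

```python
def count_invalid(records, valid_answers):
    """Count responses whose stripped text is not in the allowed answer set.

    records: iterable of dicts with a "sampled" field (assumed layout).
    valid_answers: set of strings the eval's grading prompt permits.
    Returns (invalid_count, total_count).
    """
    invalid = 0
    total = 0
    for rec in records:
        total += 1
        answer = rec.get("sampled", "").strip()
        if answer not in valid_answers:
            invalid += 1
    return invalid, total


# Example: a closedqa-style eval that expects a bare "Y" or "N".
sample_log = [
    {"sampled": "Y"},
    {"sampled": "N"},
    {"sampled": "Yes, the answer is correct."},  # invalid: extra prose
]
invalid, total = count_invalid(sample_log, {"Y", "N"})
print(f"{invalid}/{total} invalid")
```

Running this per eval and comparing the invalid rate between gpt-4 and gpt-3.5-turbo logs would show whether the pattern is specific to closedqa or repo-wide.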