RECITATION finishReason Causing Content Generation Stops in Google Models #21

gbaptista · 2024-06-23T11:29:11Z

Some Google models stop generating content due to finishReason = RECITATION.

According to the docs:

RECITATION: The token generation was stopped as the response was flagged for unauthorized citations.


gbaptista · 2024-06-23T16:15:08Z

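An easy way to reproduce it is to send this prompt:

Give the first page of the first chapter of Harry Potter.

The response comes back with no content in the candidate and finishReason set to RECITATION: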

{
  "candidates":[
    {
      "finishReason":"RECITATION",
      "safetyRatings":[
        {
          "category":"HARM_CATEGORY_HATE_SPEECH",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.31806138,
          "severity":"HARM_SEVERITY_NEGLIGIBLE",
          "severityScore":0.13039611
        },
        {
          "category":"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.13764834,
          "severity":"HARM_SEVERITY_NEGLIGIBLE",
          "severityScore":0.0248928
        },
        {
          "category":"HARM_CATEGORY_HARASSMENT",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.44049937,
          "severity":"HARM_SEVERITY_NEGLIGIBLE",
          "severityScore":0.17050801
        },
        {
          "category":"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability":"NEGLIGIBLE",
          "probabilityScore":0.24653332,
          "severity":"HARM_SEVERITY_LOW",
          "severityScore":0.20914645
        }
      ],
      "citationMetadata":{
        "citations":[
          {
            "startIndex":268,
            "endIndex":417,
            "uri":"https://www.lisarivero.com/2011/06/24/plain-and-fancy-words/"
          },
          {
            "startIndex":302,
            "endIndex":581,
            "uri":"https://thefriendlyeditor.com/2012/03/09/rowling-hook-page-one/"
          }
        ]
      }
    }
  ],
  "usageMetadata":{
    "promptTokenCount":12,
    "candidatesTokenCount":97,
    "totalTokenCount":109
  }
}

Of course, this particular result is probably expected, since Google is trying to avoid generating copyrighted content. The issue is that there are too many false positives, which halt generation for many otherwise legitimate prompts.
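For anyone who wants to detect this case on the client side, here is a minimal sketch. It assumes the raw JSON response has already been parsed into a dict with the field names shown in the response above. The function name `recitation_report` and the shape of its return value are made up for illustration and are not part of any Google SDK.

```python
import json


def recitation_report(response: dict) -> list:
    """Return one entry per candidate whose finishReason is RECITATION.

    Hypothetical helper: only the keys visible in the response above
    (candidates, finishReason, citationMetadata, citations, uri) are assumed.
    """
    flagged = []
    for index, candidate in enumerate(response.get("candidates", [])):
        if candidate.get("finishReason") != "RECITATION":
            continue
        citations = candidate.get("citationMetadata", {}).get("citations", [])
        flagged.append({
            "candidate": index,
            # In the response above the flagged candidate has no "content" key.
            "has_content": "content" in candidate,
            "uris": [c["uri"] for c in citations if "uri" in c],
        })
    return flagged


# Trimmed copy of the response above (safetyRatings omitted for brevity).
sample = json.loads("""
{
  "candidates": [
    {
      "finishReason": "RECITATION",
      "citationMetadata": {
        "citations": [
          {"startIndex": 268, "endIndex": 417,
           "uri": "https://www.lisarivero.com/2011/06/24/plain-and-fancy-words/"},
          {"startIndex": 302, "endIndex": 581,
           "uri": "https://thefriendlyeditor.com/2012/03/09/rowling-hook-page-one/"}
        ]
      }
    }
  ],
  "usageMetadata": {"promptTokenCount": 12, "candidatesTokenCount": 97, "totalTokenCount": 109}
}
""")

for entry in recitation_report(sample):
    print(entry)
```

Reporting the cited URIs alongside the flag makes it easier to judge whether a given stop is a false positive.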

maayanorner · 2024-06-29T18:52:31Z

I have the same issue. I'm trying to use Gemini for summarization. Naturally, a summary of copyrighted content gets flagged as "copyrighted content"; however, we have explicit permission to use it.

naourass · 2024-09-29T03:12:17Z

I'm getting this error constantly with non-copyrighted material. I'm just trying to extract data/snippets from public law texts and attachments, and all of the requested output is present in the provided input.

RyanMarten mentioned this issue Dec 16, 2024

Detect finish_reason that is not stop bespokelabsai/curator#261

Open
