[Bug] Structured Extract with French Language #870

Open
KuriaMaingi opened this issue Nov 5, 2024 · 1 comment
Labels: blocked, bug (Something isn't working), question (Further information is requested)

Comments

@KuriaMaingi

Describe the Bug
Attempting to use structured extract on a French-language site with the following schema:

from pydantic import BaseModel

class ExtractSchema(BaseModel):
    image: str
    product_title: str
    product_description: str
    price: float
    age: str
    ean_or_productcode: str
    brand: str
    format: str
    number_of_players: str
    length_or_width: str
    height: str
    depth: str
    playing_time: str
    mechanisms: str
    price_currency: str

1st Link Fails:
Link 1
Results: 'extract': 'ogLocaleAlternate:|google:notranslate'

2nd Link Successful:

Link 2
Results: 'extract': "ogTitle:Acheter Nexcube 3x3 Classic - MoYu - Casse-têtes|ogDescription:'Avec Nexcube 3x3 Classic, faites tourner les cases de ce Cube jusqu''à ce que chaque côté du cube ait une couleur uniforme. Un casse-tête ergonomique conçu pour la compétition.'|ogImage:https://cdn1.philibertnet.com/517165-large_default/nexcube-3x3-classic.jpg|ogLocaleAlternate:|ogSiteName:Philibert|og:title:Acheter Nexcube 3x3 Classic - MoYu - Casse-têtes|og:site_name:Philibert|og:description:'Avec Nexcube 3x3 Classic, faites tourner les cases de ce Cube jusqu''à ce que chaque côté du cube ait une couleur uniforme. Un casse-tête ergonomique conçu pour la compétition.'|og:type:product|og:image:https://cdn1.philibertnet.com/517165-large_default/nexcube-3x3-classic.jpg|google-site-verification:eOyJ7NyAZOoDK45PX0O9qnGLhUd3ebBikLzZOD7D-Ic"},
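Both results above look like flattened page metadata (`key:value` pairs joined with `|`) rather than values for the schema fields. A minimal sketch for checking this on a returned string, assuming that pipe-delimited layout holds; `parse_flat_metadata` and `missing_fields` are hypothetical helpers, not part of Firecrawl:

```python
def parse_flat_metadata(raw: str) -> dict[str, str]:
    """Split a 'key:value|key:value' string into a dict.

    Assumption: pairs are separated by '|' and keys of the form
    'og:<name>' contain one extra colon that belongs to the key.
    """
    parsed: dict[str, str] = {}
    for part in raw.split("|"):
        if part.startswith("og:"):
            # e.g. 'og:image:https://...' -> key 'og:image'
            name, _, value = part[3:].partition(":")
            parsed["og:" + name] = value
        else:
            key, _, value = part.partition(":")
            parsed[key] = value
    return parsed


def missing_fields(parsed: dict[str, str], fields: list[str]) -> list[str]:
    """Return the schema field names that do not appear as keys."""
    return [name for name in fields if name not in parsed]


# Field names copied from ExtractSchema in the report.
SCHEMA_FIELDS = ["image", "product_title", "product_description", "price"]

failing = parse_flat_metadata("ogLocaleAlternate:|google:notranslate")
print(failing)
print(missing_fields(failing, SCHEMA_FIELDS))
```

If every schema field is reported missing for both links, the problem is more likely in how the extraction step ran than in the page language.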

To Reproduce
Steps to reproduce the issue:
firecrawl_client.scrape_url(
    url,
    params={
        'formats': ['extract'],
        'extract': {'schema': extract_schema},
        'location': {'country': 'FR'},
    },
)
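For anyone reproducing this, the request parameters can be assembled and inspected before the call is made. The sketch below assumes `extract_schema` is a JSON Schema dict; `build_extract_params` and the shortened `product_schema` are hypothetical stand-ins, and the actual client call is left as a comment because it needs an API key and network access:

```python
import json


def build_extract_params(schema: dict, country: str = "FR") -> dict:
    """Assemble the params dict used in the scrape_url call from the report."""
    return {
        "formats": ["extract"],
        "extract": {"schema": schema},
        "location": {"country": country},
    }


# Hand-written JSON Schema standing in for the report's extract_schema
# (assumption: only three of the ExtractSchema fields are shown here).
product_schema = {
    "type": "object",
    "properties": {
        "product_title": {"type": "string"},
        "price": {"type": "number"},
        "price_currency": {"type": "string"},
    },
}

params = build_extract_params(product_schema)
print(json.dumps(params, indent=2))

# With a configured client, the call from the report would then be:
# firecrawl_client.scrape_url(url, params=params)
```

Printing the params makes it easy to confirm that the schema actually reaches the request as a plain dict rather than as a class object.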

Expected Behavior
I would expect the LLM to be able to translate between the two languages given the location param.

If the issue isn't the language but rather a mismatch between the site and the schema, it would be good to know that as well.

Environment (please complete the following information):

  • OS: [Windows]
  • Firecrawl Version: [e.g. 1.4.0]

KuriaMaingi added the bug label Nov 5, 2024
@nickscamara
Member

Hey @KuriaMaingi, we are taking a look. Are you self hosting or using the cloud service?

nickscamara added the question label Dec 20, 2024
linear bot added the blocked label Dec 20, 2024