Implemented support for other languages (tested with Portuguese) | Add Google Translation #1596
Another problem I had with the translation was that loading by hash did not work in any of my attempts. I noticed that the hash always changed between iterations.
Hey @joaorura, thanks a lot for putting together this PR 🙂 I'm not sure about this strategy, though. Even if we go with it, we would have to refactor it to be optional so that it does not add any new dependencies, but it seems like overkill. @shahules786, what do you think?
Keep in mind that, besides the Google Translate integration and the translation of all the strings, there was also a language limitation that I was able to resolve. I don't know whether that is the strategy you mentioned, but I believe it is helpful, and for me it is essential, since I would not be able to run the project without it.
Although using larger LLMs is always an option, they are not necessarily available to us. Using them requires either powerful hardware with the model weights on hand, or a key with credit for some online API, which has to be configured and generally costs money. So when running smaller models, it is useful to have this feature. I honestly do not understand why this would be overkill. Its usefulness for smaller LLMs seems quite clear.
@jjmachan
Hey @joaorura, thanks for replying here. Can you help us understand the limiting factor in using ragas with Portuguese? We want to make sure that we prioritize and solve it.
@shahules786 I raised some issues and mentioned them in these PRs: #1598
Hey @joaorura, I just merged all of those PRs and will do a release in a couple of hours. Thank you so much for the help 🙌🏽 ❤️ By the way, I would love to get on a call with you sometime and chat more if you're interested 🙂 (I can share my cal link if you are, or do share yours). Would love to meet you.
Some small models, such as Llama 3.1 or 3.2 at 3B or 2B, mainly quantized ones, have difficulty with prompts in Portuguese: they occasionally generate text in English, a problem that is accentuated when parts of the prompt are in English.
For that reason, translating every part of the prompt is worthwhile.
In addition, small models can translate poorly.
Support for Google Translate is therefore a quick and practical way to translate the prompts without having to obtain a more robust model.
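
As a rough illustration of translating every string in a prompt through a pluggable translator, here is a minimal sketch. It is not the code from this PR: `translate_strings`, the `Translator` alias, the prompt layout, and the glossary-based stand-in translator are all hypothetical. In practice, the `translate` callable could wrap a Google Translate client, which keeps that dependency optional.

```python
from typing import Any, Callable

# Hypothetical type alias: any function mapping source text to translated text.
Translator = Callable[[str], str]


def translate_strings(value: Any, translate: Translator) -> Any:
    """Recursively translate every string inside a nested prompt structure."""
    if isinstance(value, str):
        return translate(value)
    if isinstance(value, dict):
        # Keys are treated as field names and left untouched.
        return {k: translate_strings(v, translate) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(translate_strings(v, translate) for v in value)
    return value  # numbers, None, and other non-string values pass through


# Hypothetical prompt layout, for illustration only.
prompt = {
    "instruction": "Answer the question.",
    "examples": [{"question": "What is RAG?", "score": 1}],
}

# Stand-in translator backed by a fixed glossary; a real backend
# (for example a Google Translate client) would be passed in instead.
glossary = {
    "Answer the question.": "Responda à pergunta.",
    "What is RAG?": "O que é RAG?",
}
translated = translate_strings(prompt, lambda s: glossary.get(s, s))
```

Passing the translator in as a callable means the core code imports no translation library, so a missing backend only affects users who ask for translation.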
In addition, there was no support for languages other than English, so I added support for several more languages using the NLTK segmenter implementation.
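
For readers unfamiliar with language-aware segmentation, the following is a minimal sketch of the idea, not the implementation in this PR. `split_sentences` and its regex fallback are hypothetical. `sent_tokenize` is NLTK's sentence tokenizer, and its `language` argument selects a Punkt model that has to be downloaded separately.

```python
import re
from typing import List


def split_sentences(text: str, language: str = "portuguese") -> List[str]:
    """Split text into sentences, using NLTK when it is available."""
    try:
        # Imported lazily so that NLTK stays an optional dependency.
        from nltk.tokenize import sent_tokenize

        return sent_tokenize(text, language=language)
    except (ImportError, LookupError):
        # ImportError: NLTK is not installed.
        # LookupError: the Punkt data for this language is missing.
        # Fallback (an assumption of this sketch): naive split after ., ! or ?
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        return [p for p in parts if p]


sentences = split_sentences("Olá mundo. Tudo bem?")
```

The fallback is deliberately crude; it mishandles abbreviations and is only there so that the function still returns something usable when NLTK or its data is absent.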