Custom tokenizer #17
Hey @morygonzalez, Tantiny does not currently support custom tokenizers. I had some ideas about how to implement it, but it's a complex issue to tackle because it requires extending behaviour at runtime, which is not easy to do with Rust (let alone its interaction with Ruby). However, it seems that…
@baygeldin Thank you! That's cool. I'm happy with your suggestion!
Okay, I'll see what I can do, but probably after I deal with aggregations (or you can make a PR yourself if you want).
I see. I'll try to make a Pull Request, though since I'm quite new to Rust it'll take some time.
I want to use Tantiny with Japanese. There are several Tantivy tokenizers for the Japanese language. I'm currently considering lindera-tantivy, which supports not only Japanese but also Chinese and Korean. Is it possible to use these custom Tantivy tokenizers via Tantiny?
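For context, on the Rust side Tantivy already supports this: a custom tokenizer such as Lindera's is registered with the index's `TokenizerManager` under a name, and a text field is configured to use that name. The sketch below shows the general registration pattern; the `LinderaTokenizer` constructor shown is illustrative only, since its signature differs between lindera-tantivy versions, and Tantiny would still need to expose a hook like this to Ruby.

```rust
// Sketch: registering a Lindera (Japanese) tokenizer with Tantivy directly.
// Assumes the `tantivy` and `lindera-tantivy` crates; the LinderaTokenizer
// constructor is a placeholder -- check the docs for the version you use.
use tantivy::schema::{IndexRecordOption, Schema, TextFieldIndexing, TextOptions};
use tantivy::Index;
use lindera_tantivy::tokenizer::LinderaTokenizer;

fn main() -> tantivy::Result<()> {
    let mut schema_builder = Schema::builder();

    // Point the field at a tokenizer registered under the name "lang_ja".
    let indexing = TextFieldIndexing::default()
        .set_tokenizer("lang_ja")
        .set_index_option(IndexRecordOption::WithFreqsAndPositions);
    let opts = TextOptions::default()
        .set_indexing_options(indexing)
        .set_stored();
    schema_builder.add_text_field("body", opts);
    let schema = schema_builder.build();

    let index = Index::create_in_ram(schema);

    // Register the custom tokenizer under the name the field refers to.
    // (Illustrative constructor; see lindera-tantivy for the real one.)
    index.tokenizers().register("lang_ja", LinderaTokenizer::default());

    Ok(())
}
```

The missing piece the thread discusses is making this registration step reachable from Ruby, which is what requires extending behaviour at runtime across the Rust/Ruby boundary.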