Feature: hyphen-proof search #1823
Comments
Yeah, this is an ongoing issue that we're hoping to look at soon. We used to use a tokenizer that stripped out punctuation and hyphens, but then we had complaints that search queries containing hyphens or other symbols weren't found, so we switched to a different tokenizer. So we need to look at using multiple tokenizers, if that's possible. It's a symptom of the Solr configuration rather than anything in SEEK.
... If you want to build your own Solr container, I think you can just switch this line and the one just below it from WhitespaceTokenizer to StandardTokenizer: https://github.com/FAIRdom/solr-seek-docker/blob/master/conf/schema.xml#L64
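For anyone wondering why the tokenizer choice matters here, below is a rough Python sketch of the difference. It is only an approximation and not Solr's actual code: `whitespace_tokenize`, `standard_like_tokenize`, and `matches` are hypothetical helpers made up for illustration, the lowercasing step is an assumption about the rest of the analyzer chain, and the "standard-like" tokenizer is a simplified ASCII-only stand-in for Solr's StandardTokenizer.

```python
import re


def whitespace_tokenize(text):
    # Hypothetical stand-in for a whitespace tokenizer:
    # split on whitespace only, so hyphenated names stay as one token.
    return text.lower().split()


def standard_like_tokenize(text):
    # Hypothetical, simplified stand-in for a standard tokenizer:
    # keep runs of ASCII letters/digits, so hyphens act as separators.
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(query, title, tokenize):
    # A title "matches" if every query token appears among the title tokens.
    title_tokens = set(tokenize(title))
    return all(token in title_tokens for token in tokenize(query))


title = "trials-wheat-2022"

print(whitespace_tokenize(title))     # ['trials-wheat-2022']
print(standard_like_tokenize(title))  # ['trials', 'wheat', '2022']

print(matches("wheat", title, whitespace_tokenize))     # False
print(matches("wheat", title, standard_like_tokenize))  # True
```

With whitespace-only splitting, the whole title is indexed as a single token, so a query for "wheat" has nothing to match against. Splitting on the hyphen produces a separate "wheat" token, which is what the suggested schema change is aiming for.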
Thanks for the information! :)
If I have a data file named "trials-wheat-2022", for example, it won't be found if I run the query "wheat" in the FAIRDOM search box.
It would be nice to have hyphen-proof search, so that hyphenated titles like this one appear in the query results.