Search index add an option to define a hyphen as word separator #44593

Open
onki69 opened this issue Dec 9, 2024 · 6 comments

Comments

@onki69

onki69 commented Dec 9, 2024

Is your feature request related to a problem? Please describe.

On my current 5.2.2 installation I was using the old search feature. After switching to the new index-based Smart Search, some content is no longer found with the "exact word" setting. Some of my articles mention "bluetooth-audio". When I search for "bluetooth", these articles are not found. They are found only after switching to "word contain item" (sorry, I am using the German Joomla translation). With that setting, however, the search returns too many unwanted results, so I prefer the stricter setting.

Describe the solution you'd like

To find words that are joined by a hyphen, it would be useful to have an option to treat a hyphen as a space, so that e.g. "bluetooth-audio" is indexed as "bluetooth" and "audio". Articles containing this word would then be found when either "bluetooth" or "audio" is used as the search term.
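To illustrate the requested behaviour, here is a minimal sketch in Python. It is not Joomla's tokenizer; the function name `tokenize` and the flag `hyphen_as_separator` are hypothetical and only show how such an option could change what ends up in the index.

```python
import re

def tokenize(text: str, hyphen_as_separator: bool = False) -> list[str]:
    """Split text into lowercase index terms (illustrative only)."""
    if hyphen_as_separator:
        # Hyphens are not word characters here, so they split terms.
        pattern = r"[^\W_]+"
    else:
        # Hyphen-joined runs such as "bluetooth-audio" stay one term.
        pattern = r"[^\W_]+(?:-[^\W_]+)*"
    return [term.lower() for term in re.findall(pattern, text)]

def exact_word_match(query: str, text: str, hyphen_as_separator: bool = False) -> bool:
    """True if the query equals one of the indexed terms."""
    return query.lower() in tokenize(text, hyphen_as_separator)

text = "New bluetooth-audio adapter"
print(tokenize(text))
print(tokenize(text, hyphen_as_separator=True))
print(exact_word_match("bluetooth", text))
print(exact_word_match("bluetooth", text, hyphen_as_separator=True))
```

With the flag off, "bluetooth" is not an indexed term on its own, so an exact-word search misses the article. With the flag on, both "bluetooth" and "audio" become separate terms.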

Additional context

@chmst
Contributor

chmst commented Dec 13, 2024

This is a language specific thing (German). I do not think that we should have an extra option.
@Hackwar

@brianteeman
Contributor

I don't know what it does, but does this option help?

Image

@Hackwar
Member

Hackwar commented Dec 13, 2024

I've been considering adding this library https://github.com/nitotm/efficient-language-detector to detect the right language, and then replacing that option with a simple switch for whether to use the stemmer or not.

@Hackwar
Member

Hackwar commented Dec 13, 2024

Scrap that. I just looked at the library a little more closely and it is way too resource-intensive to add. Bummer.

@mrownicki

I am adding a comment because I use Smart Search very often.
In general, Smart Search is not refined.
The default settings are wrong.
And too complicated.
The priority should always be the TITLE of the article first, then the content.

What matters is that there are cases where the search has to look for the expression at the beginning. An expression is an expression. It doesn't matter whether it is one word or five.
It should be as simple as possible and cover 90% of what an ordinary user needs out of the box.
After that, I can switch to an advanced mode and set it up the way I want.

At the moment, I have to raise the priority of the title every time.
It doesn't matter whether the site is about books, events, or a simple article database.
I always have to raise it to get better results.

The most important keyword is almost always in the title, whether for search or for SEO.
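As a rough illustration of what "title first, then content" means for ranking, here is a small Python sketch. The function `score` and the weight values are hypothetical and are not Joomla's actual defaults or scoring code. They only show how a higher title weight pushes title matches up.

```python
def score(query_terms, title_terms, body_terms, title_weight=2.0, body_weight=1.0):
    """Sum weighted occurrences of each query term (illustrative only)."""
    total = 0.0
    for term in query_terms:
        # A hit in the title counts more than a hit in the body.
        total += title_weight * title_terms.count(term)
        total += body_weight * body_terms.count(term)
    return total

# Article A has the keyword in its title, article B only in its body.
article_a = score(["pulsar"], ["pulsar", "2849"], ["some", "text"])
article_b = score(["pulsar"], ["other", "title"], ["about", "pulsar"])
print(article_a > article_b)
```

Under these assumed weights, the article with the keyword in its title ranks above one that mentions it only once in the body.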

Simple example:

The numeric filter is off.
The article title is: Bezdzień [2024]
Example: I search for:
Bezdzień [2024]
No results.
Bezdzień 2024
No results.
"Bezdzień [2024]"
No results.
"Bezdzień 2024"
No results.

Maybe the problem is the Polish letter "ń"? No.
The same thing happens with the terms Pulsar 2849 and "In land we trust" - Armenia.

Changing the word match setting doesn't change anything.

In the administration, Smart Search shows no problem. The article is indexed.

Regards.

@brianteeman
Contributor

The default index priority is

Image
