-
-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove duplicate search entries #981
Remove duplicate search entries #981
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #981 +/- ##
==========================================
+ Coverage 37.23% 39.19% +1.96%
==========================================
Files 111 111
Lines 6346 6360 +14
==========================================
+ Hits 2363 2493 +130
+ Misses 3983 3867 -116 ☔ View full report in Codecov by Sentry. |
@BPerlakiH Thank you for your fix. I think one issue is how to handle the score of search results when there are duplicate entries. |
Thank you @HiroyasuNishiyama, that is a good point. I also changed the order of execution, so we do the title search first, and then the indexed search, this way if we drop duplicates we do that from the indexed ones, leaving the Please note that we do another sorting on these results here: kiwix-apple/SwiftUI/Model/SearchOperation/SearchOperation.swift Lines 48 to 57 in 9093134
|
@BPerlakiH And I realized that my fix had a misunderstanding. FYI, no exceptions occur during index searches in my test environment. |
I think it would be good to have @kelson42's opinion here, in regard to how this should compare with other readers. |
At this stage, I would recommend to stop investing more time on this PR. Like I suspected, the solution is not straight. We should think out of the box about this. IMO whatever we decide do, this is not the role of the reader to do this kind of "smart" computations. |
As a user, I like the current Apple Kiwix search interface, where I do not have to worry about the indexing method. For reference, I would like to leave my current thoughts on this PR. |
Ok, to sum up what's in this PR already:
For future discussion:
IMHO: this PR is already an improvement from what we had. I also do agree with @kelson42 that this is still not an ideal place to be, and a broader / more universal approach is needed for search handling, that would unify (cross platform) and simplify the reader side. |
@BPerlakiH @rgaudin @HiroyasuNishiyama I will merge this PR because it seems to fix the bug reported originaly. For the long standing problem/improvement, I have created a dedicated issue. |
2065328
to
990e51f
Compare
Fixes: #980
Superseeds #979
What we really want is unique search results, which is in fact unique URLs that the user can click on and navigate further. If 2 search results are linking to the same URL, that's a duplicate.
Solution
Store a set of
foundURLs
as we are processing the search results (both indexed and title search), and look for duplicates. Also make sure we adjust the URL endings to always contain the trailing "/" slash, but this is only for our comparison, the returned search result URLs should not be modified.After the fix on macOS: