Add quality checker #149
Comments
When a journal name is only one word, its abbreviation is the same as the full name. |
Hi, I would like to tackle this issue with my group : ) |
@northword I think the expected result is a Python tool residing in https://github.com/JabRef/abbrv.jabref.org/tree/main/scripts. It should print out issues and exit with a failure code if issues are found. You can choose another programming language if you want. Example output of lychee, which has another purpose but also outputs check results: (Source: https://github.com/JabRef/jabref/actions/runs/11361716475) |
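A minimal sketch of the "print issues, then exit with a failure code" behaviour described above. The names `check_file` and `main` are hypothetical, and the sketch assumes each list is a CSV file with the full name in the first column and the abbreviation in the second; the real checks would replace the single column-count check shown here.

```python
import csv


def check_file(path):
    """Return a list of (level, message) tuples for one CSV file."""
    issues = []
    # Assumption: UTF-8 CSV with at least two columns (full name, abbreviation).
    with open(path, newline="", encoding="utf-8") as handle:
        for line_no, row in enumerate(csv.reader(handle), start=1):
            if len(row) < 2:
                issues.append(
                    ("ERROR", f"{path}:{line_no}: expected at least 2 columns")
                )
    return issues


def main(paths):
    """Print every issue and return 1 if any were found, otherwise 0."""
    issues = []
    for path in paths:
        issues.extend(check_file(path))
    for level, message in issues:
        print(f"{level}: {message}")
    return 1 if issues else 0

# In a script, this would typically end with:
#     sys.exit(main(sys.argv[1:]))
```

Returning the exit status from `main` rather than calling `sys.exit` inside it keeps the function easy to call from tests and from a CI workflow.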
Hey, when implementing the check logic for 'WARN: abbreviation is the same as the full text,' should we only give a warning if the journal's name has more than one word and the abbreviation is the same as its full name? If the journal name is just one word, as @northword mentioned, should we simply pass it? |
Yes. |
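The rule agreed on above could be sketched as follows. The function name `same_as_full_warning` is hypothetical, and "one word" is assumed to mean "no whitespace after trimming".

```python
def same_as_full_warning(full, abbrev):
    """Return a warning string, or None if the entry passes this check."""
    full = full.strip()
    abbrev = abbrev.strip()
    # A one-word journal name is its own abbreviation, so it always passes.
    if len(full.split()) <= 1:
        return None
    if full == abbrev:
        return f"WARN: abbreviation is the same as the full text: {full!r}"
    return None
```

For example, `same_as_full_warning("Nature", "Nature")` passes, while a multi-word name repeated verbatim as its abbreviation produces a warning.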
My current function that checks the starting letters of abbreviations flags entries such as the one below as invalid, because the starting letters of the abbreviation do not match the full name. Full: 'Polish Academy of Sciences', Abbrev: 'Acta Phys. Polon. A' However, such abbreviations seem to be legitimate for the corresponding full names, even though the match is not obvious. Could you suggest how I should refine the criteria for invalidity? |
Maybe a hard coded list of exceptions? 😅 |
Not sure how many there are to be hardcoded : ( I might try using some similarity threshold to check them. That way abbreviations that are legitimate but are too different from the original full names would fail the check. Does that work? |
I haven't tried. Maybe test cases need to be generated. Maybe warnings can be output. Then an exception file could be generated by the user, similar to .lycheeignore for the link checker lychee. One might also output a number stating the distance. For manual lists, this is helpful. For downloaded lists, reports could be made. I think there are bugs in the lists. |
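One way the similarity threshold, the distance number, and the user-maintained exception file might fit together, as a rough sketch. It uses `difflib.SequenceMatcher` from the Python standard library as the similarity measure, which is a choice made here and was not decided in this thread. The function names, the ignore-file format (one full journal name per line, `#` for comments), and the threshold of 0.4 are all assumptions that would need tuning against the real lists.

```python
from difflib import SequenceMatcher


def load_ignored(lines):
    """Parse ignore-file lines: one full journal name per line, '#' comments."""
    ignored = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            ignored.add(line)
    return ignored


def _normalize(text):
    # Lowercase and drop dots so "Phys." compares well against "Physical".
    return text.lower().replace(".", "")


def similarity(full, abbrev):
    """Ratio in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, _normalize(full), _normalize(abbrev)).ratio()


def dissimilar_warnings(entries, ignored, threshold=0.4):
    """Warn about (full, abbrev) pairs whose similarity is below threshold."""
    warnings = []
    for full, abbrev in entries:
        if full in ignored:
            continue  # the user has confirmed this entry as legitimate
        score = similarity(full, abbrev)
        if score < threshold:
            warnings.append(
                f"WARN: low similarity ({score:.2f}): {full!r} -> {abbrev!r}"
            )
    return warnings
```

Printing the score next to each warning gives reviewers of manual lists a number to sort by, and entries confirmed as legitimate can be added to the ignore file so they stop being reported.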
I needed to fix lists, because "wrong" lists were in. See #148
We should have a checker. It should check for the following:
ERROR: Wrong escape
ERROR: Wrong beginning letters
(This is #107)
ERROR: List contains non-UTF8 characters
This is #125.
WARN: Double entries
(This refs #77)
WARN: Same full form appearing twice
(This refs #77)
WARN: Same abbreviation appearing twice
(This refs #77)
WARN: abbreviation is the same as the full text
WARN: Management is abbreviated with outdated "Manage." instead of "Manag."
This is #78
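A few of the checks listed above could be sketched as small functions over raw bytes or over `(full, abbreviation)` pairs. All function names here are hypothetical, and how entries are parsed from the list files is left out. The "Wrong escape" and "Wrong beginning letters" checks are not covered by this sketch.

```python
import re
from collections import Counter


def non_utf8_error(raw_bytes):
    """Return an error string if the bytes are not valid UTF-8, else None."""
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"ERROR: List contains non-UTF8 characters at byte {exc.start}"
    return None


def duplicate_warnings(entries):
    """Report repeated pairs, repeated full forms, and repeated abbreviations."""
    warnings = []
    for pair, count in Counter(entries).items():
        if count > 1:
            warnings.append(f"WARN: Double entries: {pair!r} ({count} times)")
    for full, count in Counter(full for full, _ in entries).items():
        if count > 1:
            warnings.append(f"WARN: Same full form appearing twice: {full!r}")
    for abbrev, count in Counter(abbrev for _, abbrev in entries).items():
        if count > 1:
            warnings.append(f"WARN: Same abbreviation appearing twice: {abbrev!r}")
    return warnings


def outdated_manage_warning(abbrev):
    """Warn when the outdated 'Manage.' is used instead of 'Manag.'."""
    if re.search(r"\bManage\.", abbrev):
        return f'WARN: outdated "Manage." instead of "Manag." in {abbrev!r}'
    return None
```

In this simple form a repeated pair is also reported as a repeated full form and a repeated abbreviation; a real tool might suppress the two more specific warnings in that case.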