-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support biblatex extended name format #11975
Add support biblatex extended name format #11975
Conversation
…extended-name-format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code currently does not meet JabRef's code guidelines.
We use Checkstyle to identify issues.
The tool reviewdog already placed comments on GitHub to indicate the places. See the tab "Files" in you PR.
Please carefully follow the setup guide for the codestyle.
Afterwards, please run checkstyle locally and fix the issues.
You can check review dog's comments at the tab "Files changed" of your pull request.
Update: I removed the HTML comment markers |
I did not see any JavaFX component implementing a new JavaFX component for the entry editor: #4558 (comment). It is OK to focus on the logic only. It is OK, but then we need to create a follow-up issue. @u7302654 Please fix checkstyle issues. You seem to have missed Step 3: Set up JabRef's code style. Moreover, please add a screenshot to show that the sorting issue of the original poster is solved (#4558 (comment)). If it is not solved, it is also OK, because it improves the data model, but then we cannot close the issue at all. |
@@ -147,59 +140,37 @@ public AuthorList parse(@NonNull String listOfNames) { | |||
listOfNames = simpleNormalForm.authors; | |||
boolean andOthersPresent = simpleNormalForm.andOthersPresent; | |||
|
|||
// Handle case names in order lastname, firstname and separated by "," | |||
// E.g., Ali Babar, M., Dingsøyr, T., Lago, P., van der Vliet, H. | |||
final boolean authorsContainAND = listOfNames.toUpperCase(Locale.ENGLISH).contains(" AND "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code was removed - therefore org.jabref.logic.formatter.bibtexfields.NormalizeNamesFormatterTest
is failing. Please fix. 😅
*/ | ||
private Optional<Author> parseExtendedNameFormat(String authorString) { | ||
Map<String, String> nameParts = new HashMap<>(); | ||
Matcher matcher = Pattern.compile("(\\w+)\\s*=\\s*([^,]+)(?:,\\s*|$)").matcher(authorString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please compile the regex only once. Compiling regular expressions is an expensive operation. Imagine a database of 10,000 entries, all of which use the extended name format, and the regex compilation takes 1 ms. We would have a 10 second delay for nothing.
Map<String, String> nameParts = new HashMap<>(); | ||
Matcher matcher = Pattern.compile("(\\w+)\\s*=\\s*([^,]+)(?:,\\s*|$)").matcher(authorString); | ||
while (matcher.find()) { | ||
nameParts.put(matcher.group(1).trim(), matcher.group(2).trim()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Named groups would make this code clearer
} else if (i <= authorList.length() - 5 && authorList.substring(i, i + 5).equals(" and ") && bracesLevel == 0) { | ||
authors.add(authorList.substring(start, i)); | ||
i += 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of magic values here, what do the 5 and 4 stand for here? If possible please delcare a constant that has a good name that explains the purpose of these values.
String givenNameAbbreviated = abbreviateGivenName(givenName); | ||
|
||
// create Author object | ||
return Optional.of(new Author(givenName, givenNameAbbreviated, namePrefix, familyName, nameSuffix)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment doesn't say anything that the code doesn't say already, it is useless, please remove it.
while (tokenStart < original.length() && Character.isWhitespace(original.charAt(tokenStart))) { | ||
tokenStart++; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment above this block however would be helpful.
Closing this issue due to inactivity 💤 |
Description:
This pull request adds support for parsing BibLaTeX extended name format, which includes fields including family, given, prefix, suffix, and useprefix. This feature ensures correct parsing and formatting of author names with complex structures, including options to handle prefixes such as "van" and to manage cases where the prefix should be ignored (useprefix=false).
These changes solve the problem described in Issue #4558, where entries using the extended BibLaTeX format were not being parsed and sorted correctly, leading to incorrect name sorting behaviour.
Changes:
Closes #4558
Mandatory checks
CHANGELOG.md
described in a way that is understandable for the average user (if applicable)