New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Strengthening an existing plugin “Non-ASCII Equivalents” #387

Closed

Echelon666 wants to merge 3 commits into metabrainz:2.0 from Echelon666:2.0

Echelon666 commented Nov 6, 2024

I added 41 new characters to the existing “Non-ASCII Equivalents” plugin:

18 Polish diacritics
23 "My others" special characters that were junk in the ID3 tag of my MP3 collection.

thread on forum:

https://community.metabrainz.org/t/strengthening-an-existing-plugin-non-ascii-equivalents/722822

Echelon666 and others added 3 commits

November 6, 2024 22:46


          Strengthening an existing plugin “Non-ASCII Equivalents”

5133b22

I added 41 new characters to the existing “Non-ASCII Equivalents” plugin:

18 Polish diacritics
23 "My others" special characters that were junk in the ID3 tag of my MP3 collection.

Co-Authored-By: Anderson Mesquita <[email protected]>


          Update non_ascii_equivalents_enhanced.py

2731b39

Co-Authored-By: Anderson Mesquita <[email protected]>


          Update non_ascii_equivalents_enhanced.py

5524f95

Co-Authored-By: Anderson Mesquita <[email protected]>

Author

Echelon666 commented Nov 7, 2024

I don't know how to fix the Codacy warning. I don't want to break the code.

phw requested changes

View reviewed changes

Member

phw left a comment •

edited

Loading

Thanks for your changes. In general the additions make sense, but I have two concerns here:

I think this plugin would be more useful as a separate plugin if it would provide a scripting function instead of just doing the replacement in 3 arbitrary tags. See my comment below.
Otherwise the extended replacements should become part of the existing plugin. I don't see why we need an essentially identical plugin in the plugin list, this just causes confusion.
The simple accented characters that just get replaced with their unaccented base character should be removed from the static replacements. Instead Picard's built-in unaccent function should be run on the string. This will take care of all the accented cases here and all those that are probably missing.

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py

		return char


		def ascii(word):

Member

phw Nov 7, 2024

ascii is a built-in Python function, and the custom function is hiding that. Hence the warning. I'd recommend renaming this function to something else (replace_ascii_equivalents?)

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py

+                  "・": "-",
+                  "☆": "-",
+                  "★": "-",
+                  "/": ",",

Member

phw Nov 7, 2024

This and the four chars below don't seem to belong there. Those are all valid ASCII characters and I don't see why those should be unwanted in tags.

For Windows filenames they are not allowed, but those characters get replaced anyway by Picard in filenames and there are existing settings for those.

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py

+                  "album",
+                  "artist",
+                  "title",
+              ]

Member

phw Nov 7, 2024

I think this limitation to those three tags is one of the weaknesses of the current plugin. This seems very arbitrary and limiting.

What I think would be much more useful is if the plugin would register a scripting function, see https://picard-docs.musicbrainz.org/en/appendices/plugins_api.html#tagger-script-functions . If there was a scripting function $ascii(...) or such (maybe someone has a better name) then users would be flexible to replace characters in whatever tag they want:

$set(composer,$ascii(%composer%))

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py

+                  "→": "-",
+                  "・": "-",
+                  "☆": "-",
+                  "★": "-",

Member

phw Nov 7, 2024

Maybe better replace with ASCII star *

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py

+                  "í": "i",  "ì": "i",  "ï": "i",  "î": "i",
+                  "ó": "o",  "ò": "o",  "ö": "o",  "ô": "o",
+                  "ú": "u",  "ù": "u",  "ü": "u",  "û": "u",
+                  "ý": "y",  "ỳ": "y",  "ÿ": "y",  "ŷ": "y",

Member

phw Nov 7, 2024

All of the above, some of the ones in "Misc Letters" and most of the new "Polish letters" you added can be handled by Picard's built-in function picard.util.textencoding.unaccent.

from picard.util.textencoding import unaccent

print(unaccent("ÄŽ"))  # this will print "AZ"
```

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py

+                  "Ç": "C",
+                  "ç": "c",
+                  "Ñ": "N",
+                  "ñ": "n",

Member

phw Nov 7, 2024

The above four will be handled by unaccent

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py

+                  "Ł": "L",
+                  "ł": "l",
+                  "Ń": "N",
+                  "ń": "n",

Member

phw Nov 7, 2024

Except for "Ł" and "ł" all the above will be handled by unaccent

plugins/non_ascii_equivalents_enhanced/non_ascii_equivalents_enhanced.py



		def ascii(word):
		return "".join(sanitize(char) for char in word)

Member

phw Nov 7, 2024

I'd suggest getting rid of the diacritic characters in the replacement tables as noted above and use Picard's unaccent function instead as I described above. Then do this here:

    word = "".join(sanitize(char) for char in word)
    return unaccent(word)

phw mentioned this pull request

Added polish diacritics in non_ascii_equivalents.py #386

Merged

Member

phw commented Nov 7, 2024

@Echelon666 @finem4n So we currently have two similar PRs, #386 and #387. I think you guys should see what changes we make.

If Echelon666 is fine reworking this plugin to provide a scripting function I think this warrants this as a separate plugin. The existing non_ascii_equivalents plugin can still be extended with additional tags as in #386 and reworked to use unaccent function.

Contributor

Sophist-UK commented Nov 7, 2024 •

edited

Loading

I am entirely unclear why we are creating an entirely new "enhanced" plugin here rather than simply enhancing the existing one.

This is like me saying I want to tweak something in Picard, and I do so by publishing Picard Enhanced which does exactly what Picard does but with my additional tweak.

This is a route to madness - an exponentially expanding ecosystem of almost identical functionality, but each slightly different from the other. If I wanted to enhance this plugin further, am I supposed to publish a "Non-ASCII Equivalents Enhanced Extended" version - and what happens when (sooner) we run out of synonyms for "Enhanced" OR (later) it literally takes more characters for the name than there are atoms in the Universe?

Obviously you need to be wary of introducing backwards incompatibilities that would significantly change the outcome for someone upgrading in ways they wouldn't want. I do NOT think that is the case here, but even if it is was backwardly incompatible, I would expect to see minor incompatibilities handled with an option not by creating an entirely different but extremely similar competing plugin.

P.S. It also makes reviewing the code more difficult because you don't get a DIFF showing what has changed from the original.

P.P.S. I did suggest to @Echelon666 in the forums BEFORE he created this PR that this should be an update to the existing plugin, but that advice was apparently ignored. See also #384 where I took the time to provide an example of his proposed plugin with a better coding style, but @Echelon666 also ignored advice and review comments. TBH I am somewhat sadly forming the impression that it is complete waste of time reviewing PRs from him, because review comments and advice are never accepted and in the end the PR simply languishes open but unmergeable.

@phw said:

If Echelon666 is fine reworking this plugin to provide a scripting function I think this warrants this as a separate plugin.

I disagree. Adding a scripting function is a non-breaking change.

Member

phw commented Nov 7, 2024

I disagree. Adding a scripting function is a non-breaking change.

My problem with the existing plugin is that it does all this transformation on existing kind of arbitrary existing tags. Adding a function is non-breaking, but removing the automatic conversion of tags would be. So when installing the plugin one would need to accept this automatic conversion.

On the other hand we could also add such a function to Picard proper.

Overall #386 is the better PR currently, though.

Contributor

Sophist-UK commented Nov 7, 2024 •

edited

Loading

No - whilst adding a hardcoded expanded list of tags to apply this to would indeed be breaking, instead providing a UI options page where the default set of tags is the existing set would be non-breaking. IMO we really, really need to avoid proliferation of highly similar plugins and instead have one plugin with options.

Member

phw commented Nov 7, 2024

IMO we really, really need to avoid proliferation of highly similar plugins and instead have one plugin with options.

Yes, I agree with that. The non_ascii_equivalents as it exists currently is for that reason something I definitely would not want to port to the new plugin system as it is.

Contributor

Sophist-UK commented Nov 7, 2024

@Echelon666 When you make a commit, please provide an explanatory commit comments so that we can immediately see what your commit is intended to do.

Author

Echelon666 commented Nov 7, 2024

OK. Thanks.

Author

Echelon666 commented Nov 7, 2024

I removed Trailing whitespace and repeated Ó.

I can't help you more because I don't know Python and Github very well.

You have to finish it yourself.

finem4n added a commit to finem4n/picard-plugins that referenced this pull request


          Added 'other' characters from metabrainz#387

d65e0f6

Member

phw commented Nov 13, 2024

@Echelon666 I think the earlier PR #386, which now has also been updated, better addresses improving the plugin. With these changes most of what you wanted to have improved, specifically with handling Polish characters, will be solved in the existing plugin.

phw added the do not merge yet label

Author

Echelon666 commented Nov 13, 2024

But what do you suggest? That I remove my (this) PR?

Let it be. ;) Just give me a sign.

I will have my plugin for private use anyway, because in addition to Polish characters it has to remove funny characters.

Contributor

Sophist-UK commented Nov 13, 2024 •

edited

Loading

Philipp is suggesting that you close this PR because:
a. Your ideas were incorporated into another PR which is going to get merged; and
b. Because you said "I can't help you more because I don't know Python and Github very well. You have to finish it yourself." and no one is prepared to finish it for you because of your attitude.

I would imagine that he would prefer that you close it yourself unmerged, rather than for him to do it unilaterally.

And don't forget that I predicted this a few days ago in a comment elsewhere.

Author

Echelon666 commented Nov 13, 2024

@Sophist-UK can you shut up?

Contributor

Sophist-UK commented Nov 13, 2024 •

edited

Loading

Let it be. ;) Just give me a sign.

You quite literally asked for someone to tell you what to do. So I did - admittedly pretty bluntly.

And then you blame ME and tell me to shut up?

Given our shared history of me being blunt with you on several occasions now, did you really think that telling me to "shut up" would work rather than just prompt a further blunt response?

FFS, grow up, act more maturely and take responsibility for the consequences of your own words and attitude.

Author

Echelon666 commented Nov 13, 2024

Go away, you loser.

phw pushed a commit that referenced this pull request


          Added 'other' characters from #387

9ef9788

Member

phw commented Nov 14, 2024

Yes, I'll close this. There have been two very identical PRs, and the other one provided the significant changes to the existing plugin. Coding changes can take time, and oftentimes one needs to go one step back to move forward. Also a plugin like this always can be changed and extended for personal use easily if wanted.

@Echelon666 It's one thing to argue, but insulting others here is not acceptable. @Sophist-UK, who has been around for a long time and knows Picard and the plugins well, has been trying to help you despite all difficulties and disagreements.

I'll close this PR now, but I expect all future discussions here and on the forums to be civilized.

phw closed this

metabrainz locked as too heated and limited conversation to collaborators

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

do not merge yet