Using NBSP with colons in the punctuation spacing fixup #11237

Open
HarmfulBreeze opened this issue Mar 20, 2024 · 5 comments
Labels
enhancement Adding or requesting a new feature. good first issue Opportunity for newcoming contributors. hacktoberfest This is suitable for Hacktoberfest. Don’t try to spam. help wanted Extra attention is needed.

Comments

@HarmfulBreeze

HarmfulBreeze commented Mar 20, 2024

Describe the problem

Opening a feature request as requested in #9470

Some guides, such as the Lexique des règles typographiques en usage à l'Imprimerie nationale, which is often used as a reference in France, specify a regular non-breaking space (NBSP) before the colon, but a narrow non-breaking space (NNBSP) before other "two-part" characters such as semicolons. Some organizations and media, however, prefer to use NNBSP with colons.
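
For reference, a minimal sketch of the two characters involved and of the convention described above. The constant names and example strings are illustrative only and are not taken from Weblate.

```python
import unicodedata

NBSP = "\u00a0"   # regular non-breaking space
NNBSP = "\u202f"  # narrow non-breaking space

# Convention described above: NBSP before the colon, NNBSP before the semicolon.
colon_example = f"Exemple{NBSP}: texte"
semicolon_example = f"Exemple{NNBSP}; texte"

for char in (NBSP, NNBSP):
    # Prints "U+00A0 NO-BREAK SPACE", then "U+202F NARROW NO-BREAK SPACE"
    print(f"U+{ord(char):04X} {unicodedata.name(char)}")
```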

Are there plans to add a toggle (of sorts) to allow the use of NBSP with colons?

Thanks!

Describe the solution you would like

A toggle for switching between NBSP and NNBSP before colons would be useful.

@nijel
Member

nijel commented Mar 20, 2024

There could be an additional flag for this that configures the desired spacing. Something like punctuation-spacing:nbsp could work.
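
A rough sketch of how such a flag could map to a spacing character. It assumes flags are available as plain strings, which is not how Weblate exposes them. The helper name `colon_space_from_flags` and the `nnbsp` value are hypothetical; only `punctuation-spacing:nbsp` comes from the suggestion above.

```python
from __future__ import annotations

from typing import Iterable

NBSP = "\u00a0"   # NO-BREAK SPACE
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

# Hypothetical flag values; only "nbsp" is mentioned in the suggestion above.
SPACING_BY_VALUE = {"nbsp": NBSP, "nnbsp": NNBSP}


def colon_space_from_flags(flags: Iterable[str]) -> str:
    """Return the space to put before a colon, defaulting to NNBSP."""
    for flag in flags:
        # "punctuation-spacing:nbsp" -> ("punctuation-spacing", ":", "nbsp")
        name, _, value = flag.partition(":")
        if name == "punctuation-spacing" and value in SPACING_BY_VALUE:
            return SPACING_BY_VALUE[value]
    # No matching flag: keep the narrow space the existing fixup inserts.
    return NNBSP
```

Defaulting to NNBSP keeps the behavior unchanged for translations that do not set the flag.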

nijel added the enhancement, hacktoberfest, help wanted, and good first issue labels Mar 20, 2024

This issue seems to be a good fit for new contributors. You are welcome to contribute to Weblate! Don't hesitate to ask any questions you have while implementing this.

You can learn about how to get started in our contributors documentation.

@ashsgelb

Can I claim this?

@Brian3015

I'm thinking of working on this. How would you go about testing if you added this feature? Also, where would you add it?

@nijel
Member

nijel commented Apr 2, 2024

Anybody can start working on an issue. The change belongs in the existing check and fixup implementation:

class PunctuationSpacing(AutoFix):
    """Ensures French and Breton use correct punctuation spacing."""

    fix_id = "punctuation-spacing"
    name = gettext_lazy("Punctuation spacing")

    @staticmethod
    def get_related_checks():
        return [PunctuationSpacingCheck()]

    def fix_single_target(self, target, source, unit):
        if (
            unit.translation.language.is_base(("fr", "br"))
            and unit.translation.language.code != "fr_CA"
            and "ignore-punctuation-spacing" not in unit.all_flags
        ):
            # Fix existing
            new_target = re.sub(FRENCH_PUNCTUATION_FIXUP_RE, "\u202f\\2", target)
            # Do not add missing as that is likely to trigger issues with other content
            # such as URLs or Markdown syntax.
            return new_target, new_target != target
        return target, False
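
One possible shape for the requested behavior, as a standalone sketch rather than a patch to the class above. `FIXUP_RE` is a stand-in whose pattern is an assumption (the real `FRENCH_PUNCTUATION_FIXUP_RE` is not shown here), and `fix_spacing` with its `colon_space` parameter is a hypothetical helper.

```python
from __future__ import annotations

import re

NBSP = "\u00a0"   # NO-BREAK SPACE
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

# Stand-in for FRENCH_PUNCTUATION_FIXUP_RE (assumed shape: one space-like
# character in group 1, followed by one punctuation mark in group 2).
FIXUP_RE = re.compile("([ \u00a0\u2009\u202f])([;:?!])")


def fix_spacing(target: str, colon_space: str = NNBSP) -> tuple[str, bool]:
    """Normalize the existing space before double punctuation marks."""

    def repl(match: re.Match) -> str:
        mark = match.group(2)
        # Only the colon gets the configurable space; other marks keep NNBSP.
        space = colon_space if mark == ":" else NNBSP
        return space + mark

    new_target = FIXUP_RE.sub(repl, target)
    return new_target, new_target != target
```

Calling `fix_spacing("Note : oui ; non", colon_space=NBSP)` would put an NBSP before the colon and an NNBSP before the semicolon. Like the existing fixup, it only rewrites spaces that are already present and does not add missing ones.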

class PunctuationSpacingCheck(TargetCheck):
    check_id = "punctuation_spacing"
    name = gettext_lazy("Punctuation spacing")
    description = gettext_lazy(
        "Missing non breakable space before double punctuation sign"
    )

    def check_single(self, source, target, unit) -> bool:
        if (
            not unit.translation.language.is_base(("fr", "br"))
            or unit.translation.language.code == "fr_CA"
        ):
            return False

        # Remove XML/HTML entities to simplify parsing
        target = strip_entities(target)

        whitespace = {" ", "\u00a0", "\u202f", "\u2009"}

        total = len(target)
        for i, char in enumerate(target):
            if char in FRENCH_PUNCTUATION:
                if i == 0:
                    # Trigger if punctuation at beginning of the string
                    return True
                if (
                    i + 1 < total
                    and unicodedata.category(target[i + 1])
                    not in FRENCH_PUNCTUATION_SPACING
                ):
                    # Ignore when not followed by space or open/close bracket
                    continue
                prev_char = target[i - 1]
                if prev_char not in whitespace and prev_char not in FRENCH_PUNCTUATION:
                    return True
        return False

    def get_fixup(self, unit):
        return [
            # First fix possibly wrong whitespace
            (
                FRENCH_PUNCTUATION_FIXUP_RE,
                "\u202f$2",
                "gu",
            ),
            # Then add missing ones
            (
                FRENCH_PUNCTUATION_MISSING_RE,
                "$1\u202f$2",
                "gu",
            ),
        ]
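
On the testing question, one option is to exercise the scanning logic on plain strings, outside Weblate. The sketch below mirrors the loop in `check_single` without the language and entity handling. The values of `FRENCH_PUNCTUATION` and `FRENCH_PUNCTUATION_SPACING` are assumptions inferred from the comments in the code above, and `has_spacing_issue` is a hypothetical name.

```python
import unicodedata

# Assumed values; the real constants are defined elsewhere in Weblate.
FRENCH_PUNCTUATION = {";", ":", "?", "!"}
FRENCH_PUNCTUATION_SPACING = {"Zs", "Ps", "Pe"}  # space, open/close bracket
WHITESPACE = {" ", "\u00a0", "\u202f", "\u2009"}


def has_spacing_issue(target: str) -> bool:
    """Return True if a punctuation mark lacks a space before it."""
    total = len(target)
    for i, char in enumerate(target):
        if char not in FRENCH_PUNCTUATION:
            continue
        if i == 0:
            # Punctuation at the very beginning of the string
            return True
        if (
            i + 1 < total
            and unicodedata.category(target[i + 1])
            not in FRENCH_PUNCTUATION_SPACING
        ):
            # Not followed by a space or bracket (for example inside a URL)
            continue
        prev_char = target[i - 1]
        if prev_char not in WHITESPACE and prev_char not in FRENCH_PUNCTUATION:
            return True
    return False


# A few cases a test for the new option might start from.
assert has_spacing_issue("Note: oui")
assert not has_spacing_issue("Note\u00a0: oui")
assert not has_spacing_issue("Note\u202f: oui")
assert not has_spacing_issue("https://example.com")
```

Both NBSP and NNBSP already pass this scan, so a colon-specific option would mostly affect what the fixups insert rather than when the check fires.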

4 participants