Provide a documented way to customize slugs generated by SlugInput / urlify #11916
Comments
Thanks for the report @drimacus182. The previous approach was to override the global function; I think we should consider a few ways to resolve this.

1. Potentially incorrect transliteration

You will have to help us out here: could you please advise the exact characters, in text (not just a screenshot), that you think should be transliterated differently?

All of this transliteration is based on Django's code. We have now pulled this into a more manageable config file. Maybe we should just try to fix this in our code (plus we could suggest a similar fix in Django's code), so that the characters are handled better.

# current characters configured
"UKRAINIAN_MAP": [
  ["Є", "Ye"],
  ["І", "I"],
  ["Ї", "Yi"],
  ["Ґ", "G"],
  ["є", "ye"],
  ["і", "i"],
  ["ї", "yi"],
  ["ґ", "g"]
]

2. Another unofficial workaround

It should be possible to add an event listener to the slug input for the event used in the snippet below. This is not as clean as just mutating the global but should be enough to get you started with specific transliteration requirements. This would still be an unofficial workaround though, and could break in any future release.

# .../static/js/admin.js
document.addEventListener(
  'w-sync:apply',
  (event) => {
    // return if the event is not from the slug field
    if (event.target.name !== 'slug') return;
    // return if the event has already been processed
    if (event.detail.processed) return;
    console.log('w-sync:apply', event);
    // stop the current event from propagating
    event.preventDefault();
    event.stopPropagation();
    // get the value to apply and mutate to suit our needs, dispatch a new event
    const value = event.detail.value || ''; // will be the original title's value
    event.target.dispatchEvent(
      new CustomEvent('w-sync:apply', {
        // event options belong in the CustomEvent init object, not in dispatchEvent
        bubbles: true,
        cancelable: true,
        detail: {
          // change any values from the title field before it gets to the slugify controller
          value: value.replaceAll('ґ', 'g'),
          processed: true,
        },
      }),
    );
  },
  { capture: true }, // ensure this listener captures the event first
);

# .../wagtail_hooks.py
from django.utils.safestring import mark_safe
from django.templatetags.static import static
from wagtail import hooks

@hooks.register('insert_global_admin_js')
def global_admin_js():
    src = static('js/admin.js')
    return mark_safe(f'<script src="{src}"></script>')

3. Add ability to intercept the urlify/slugify more easily with events

We should probably dispatch a change event when the slugify/urlify triggers; this could be intercepted easily and any further mutations of the field's value could be made. Additionally, we could add custom events for when urlify/slugify are about to happen and allow this value to be changed, similar to how we do for the image/document title field.

https://docs.wagtail.org/en/latest/advanced_topics/images/title_generation_on_upload.html#images-title-generation-on-upload

I would rather we try to get the transliteration right first though, but I am not sure what wider input is needed here from the community. Generally, transliteration is an error-prone process. We currently run into a similar problem with this for the form builder usage of |
I have also posted in the
For now, @drimacus182, can you test the workaround above and see if that gets you to a point where you can use 6.1 in your production code? Any other suggestions welcome. |
Hi @lb-, thanks for looking into this. I'll try to add my thoughts:

Approach 1

I understand the desire to make slugs work out of the box, but I see a problem with this universal approach. Django's existing urlify.js and Wagtail's urlify.config.json that you mentioned just map letters one by one, without any knowledge of what language the original string is in. E.g. for a Cyrillic character, it starts with

Example from urlify.config.json:

"RUSSIAN_MAP": [
  //...
  ["г", "g"], // should be "h" for Ukrainian
  ["и", "i"], // should be "y" for Ukrainian
  // ...

Apart from phonetics, different transliteration standards are possible for different contexts and languages, so there is no simple "one-size-fits-all" solution for every use case. Such a solution would require knowledge of what language the original string is in and would introduce lots of complexity.

Just in case, leaving a link for the official transliteration standard for Ukrainian

Approach 2

I'm generally fine with the 2nd solution, and thank you for the complete code sample. I'll try to use this in my deployment, but can't promise to do this quickly. It feels a bit hacky to me though, especially considering it is unofficial and can break in future releases.

Approach 3

It sounds most satisfying to me since my initial need was to have an ability to modify the internals of

To summarise, I think it's straightforward for a developer to write a transliteration function that fits all their needs. By contrast, it's almost impossible to create a single transliteration logic that will make everyone happy. So the solution should be any of those that allow a developer to define their custom logic. |
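The ordering problem described in Approach 1 can be reproduced with a small sketch. It assumes a first-match lookup over an ordered list of maps, which is a simplification and not necessarily how Wagtail's or Django's code works. The names `buildLookup`, `transliterate`, `RUSSIAN_MAP_SUBSET` and `UKRAINIAN_OVERRIDES` are hypothetical, and the maps hold only the characters quoted in this thread.

```javascript
// Hypothetical subsets: only the pairs quoted in this thread are included.
const RUSSIAN_MAP_SUBSET = [
  ['г', 'g'],
  ['и', 'i'],
];
const UKRAINIAN_OVERRIDES = [
  ['г', 'h'],
  ['и', 'y'],
  ['ґ', 'g'],
];

// Build one lookup from an ordered list of maps. The first map that
// defines a character wins; later maps cannot override it.
function buildLookup(maps) {
  const lookup = new Map();
  for (const pairs of maps) {
    for (const [from, to] of pairs) {
      if (!lookup.has(from)) lookup.set(from, to);
    }
  }
  return lookup;
}

// Replace each character independently; unknown characters pass through.
function transliterate(text, maps) {
  const lookup = buildLookup(maps);
  return Array.from(text, (char) => lookup.get(char) ?? char).join('');
}

// Language-agnostic order: the Russian entries for 'г' and 'и' are found first.
const generic = transliterate('гиґ', [RUSSIAN_MAP_SUBSET, UKRAINIAN_OVERRIDES]);
// Developer-chosen order: the Ukrainian entries take priority.
const ukrainian = transliterate('гиґ', [UKRAINIAN_OVERRIDES, RUSSIAN_MAP_SUBSET]);

console.log(generic); // 'gig'
console.log(ukrainian); // 'hyg'
```

The only difference between the two results is the order in which the maps are consulted, which is why a developer-controlled hook (or a language-aware default) matters more than any single fixed map.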
@drimacus182 here's a proposed client-side API for overriding this behaviour.

document.addEventListener('w-clean:urlify', event => {
  const {
    apply /* A function you can use, once, to apply a custom value */,
    currentValue /* Current value (e.g. 'рецепт чізкейку') */,
    newValue /* Default transliteration (e.g. 'hecept czizkejku' ) */
  } = event.detail;
  // If you want to stop the new value from being used and make your own...
  event.preventDefault();
  apply(myCustomFunction(currentValue) /* e.g. 'recept czizkejku' */);
});

A similar event would be available for

Additionally, we would add change events to the slug field, so you can listen for a change after it's been applied and do something else then.

This approach leverages the DOM events API, avoids globals and gives us room to enhance later with additional data being provided to the event. You'll also have access to the input with

However, this would not easily allow any async usage, as the preventDefault call needs to be made within the event handler. It would still be possible to use this method to do a fetch call and update the field after the response comes back, though it may be a bit more involved.

Finally, this involves us updating the |
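To make the async limitation concrete, here is a rough sketch of how a dispatcher could honour such an event. This is not Wagtail's controller code: `urlifyWithOverride` and `defaultUrlify` are hypothetical names, and only the `detail` keys (`apply`, `currentValue`, `newValue`) come from the proposal above. A plain `Event` with a `detail` property is used so the sketch also runs in non-browser runtimes that lack a global `CustomEvent`.

```javascript
// Hypothetical helper, not Wagtail's implementation.
function urlifyWithOverride(target, currentValue, defaultUrlify) {
  const newValue = defaultUrlify(currentValue);
  let customValue = null;
  let applied = false;

  // Listeners may call apply() once to supply their own value.
  const apply = (value) => {
    if (applied) return;
    applied = true;
    customValue = value;
  };

  const event = new Event('w-clean:urlify', { bubbles: true, cancelable: true });
  // Attach the payload under `detail`, mirroring what CustomEvent provides.
  event.detail = { apply, currentValue, newValue };

  // dispatchEvent runs every listener synchronously before it returns, so
  // preventDefault() has to be called inside the listener, not after an await.
  target.dispatchEvent(event);

  if (!event.defaultPrevented) return newValue;
  // Default was prevented: use the custom value if one was applied in time,
  // otherwise return null to signal "leave the field unchanged".
  return applied ? customValue : null;
}

// Usage: an EventTarget stands in for the slug input.
const input = new EventTarget();
input.addEventListener('w-clean:urlify', (event) => {
  event.preventDefault();
  event.detail.apply(event.detail.currentValue.replaceAll('ґ', 'g'));
});

const result = urlifyWithOverride(input, 'ґанок', (value) => value.toLowerCase());
console.log(result); // 'gанок'
```

A listener that needs a fetch call could still call `preventDefault()` synchronously and write to the input itself once the response arrives, which is the "more involved" route mentioned above.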
Hey @lb-. If possible, could you please also consider Azerbaijani letters? This would also cover many Turkic languages at once (including Turkish). On average, Azerbaijani has more letters than the others.
|
Perhaps instead of |
I like that idea @laymonage - we could still allow the JS override in the future but I think having a non-JS way to do simpler customisations would be a better approach initially. Could you flag this for the core team discussion, see if there are any other ideas/inputs. |
I've made a suggestion on Slack and am cross-posting it here for reference:
See Slack for the full discussion: https://wagtailcms.slack.com/archives/CTKN6UXKN/p1714721188163579?thread_ts=1714686970.068729&cid=CTKN6UXKN |
The use case seems common enough to me to warrant a documented approach indeed. If we go the custom code route, I would prefer a custom urlify config file over custom JS in the admin for simplicity.

I'm not convinced that a universal approach is not feasible. If the mapping of characters to the incorrect language raised by drimacus182 is the only issue, forgive my naïveté, but don't we already have information about the original locale available? Example: if the editor of a site configured for Ukrainian created a page, the locale of that page would be Ukrainian. We can use this information to consult the Ukrainian character map first. If it's a Russian site we would use the Russian character map first. Wouldn't that solve the issue? No need for customizing, it would work out of the box. Or are there other issues I'm not aware of? Would love to hear your input @drimacus182.

I don't think making the slugify feature a little smarter would introduce that much extra complexity to support. We'd restructure the urlify config to support looking up maps by language code, pull the current language code from the page and use that as 'preferred locale' when transliterating. All easy things to say from the comfort of my chair, I'm not familiar with

We could decide we don't want the added complexity and leave it up to developers to implement their own slugify mechanism, but that seems a little unfair to me. I believe it should work correctly out of the box if at all feasible. |
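A minimal sketch of the "preferred locale" lookup described above might look like the following. The config shape and the names `MAPS_BY_LANGUAGE`, `orderedMaps` and `transliterateForLocale` are assumptions for illustration, not the current structure of urlify.config.json, and the maps contain only the characters quoted in this thread. It also assumes the page locale is already a bare language code such as `uk`.

```javascript
// Hypothetical config keyed by language code; subsets only.
const MAPS_BY_LANGUAGE = {
  ru: [['г', 'g'], ['и', 'i']],
  uk: [['г', 'h'], ['и', 'y'], ['ґ', 'g']],
};

// Put the preferred language first and keep every other language as a
// fallback for characters the preferred map does not cover.
function orderedMaps(config, preferredLanguage) {
  const preferred = config[preferredLanguage] ? [config[preferredLanguage]] : [];
  const others = Object.keys(config)
    .filter((code) => code !== preferredLanguage)
    .map((code) => config[code]);
  return [...preferred, ...others];
}

// First match wins, so the preferred language overrides the fallbacks.
function transliterateForLocale(text, config, preferredLanguage) {
  const lookup = new Map();
  for (const pairs of orderedMaps(config, preferredLanguage)) {
    for (const [from, to] of pairs) {
      if (!lookup.has(from)) lookup.set(from, to);
    }
  }
  return Array.from(text, (char) => lookup.get(char) ?? char).join('');
}

console.log(transliterateForLocale('гиґ', MAPS_BY_LANGUAGE, 'uk')); // 'hyg'
console.log(transliterateForLocale('гиґ', MAPS_BY_LANGUAGE, 'ru')); // 'gig'
```

With an unknown locale the sketch falls back to the config's own key order, which reproduces today's language-agnostic behaviour.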
@Stormheg I agree that changing the mapping to use the locale's language code as the key is a better approach. It may result in a bigger character map, especially if there are multiple languages that could utilise the same mapping, so if we care about the dictionary size then perhaps we could also add support for aliases to let multiple languages use the same mapping. However, I think another issue is that Wagtail's character map may be incomplete. It's true that anyone can submit a pull request to add support for more characters/languages, but that means they'd have to wait until the next release to be able to use it. If there were a way to customise the mapping, developers could do it in the meantime, until they upgrade to the next release. |
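Both ideas, aliases and project-level customisation, could be sketched roughly as below. Everything here is hypothetical: the config shape, the names `SHIPPED_MAPS`, `resolveMap`, `mapFor` and `transliterateWith`, and the ASCII choices for the Azerbaijani letters, which are illustrative only. The `tr` alias follows the earlier suggestion that an Azerbaijani map would also cover Turkish.

```javascript
// Hypothetical shipped config: a string value is an alias for another key.
const SHIPPED_MAPS = {
  az: { 'ə': 'e', 'ö': 'o', 'ü': 'u', 'ç': 'c', 'ş': 's', 'ğ': 'g', 'ı': 'i' },
  tr: 'az', // alias: reuse the Azerbaijani map instead of duplicating it
};

// Follow aliases until a real map is found; guard against alias loops.
function resolveMap(config, code, seen = new Set()) {
  const entry = config[code];
  if (entry === undefined || seen.has(code)) return {};
  if (typeof entry === 'string') return resolveMap(config, entry, seen.add(code));
  return entry;
}

// Project overrides win over the shipped map for the same character, so a
// developer can patch a mapping without waiting for the next release.
function mapFor(config, code, overrides = {}) {
  return { ...resolveMap(config, code), ...(overrides[code] ?? {}) };
}

// Replace each character independently; unknown characters pass through.
function transliterateWith(map, text) {
  return Array.from(text, (char) => map[char] ?? char).join('');
}

console.log(transliterateWith(mapFor(SHIPPED_MAPS, 'tr'), 'çığ')); // 'cig'
console.log(transliterateWith(mapFor(SHIPPED_MAPS, 'az', { az: { 'ə': 'a' } }), 'əl')); // 'al'
```

Aliases keep the shipped dictionary small, while the overrides argument covers the gap between a missing mapping being reported and a release that includes it.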
We discussed this during the core team meeting. Summarizing:
|
I agree with @Stormheg. Delivering a solution that "just works" for Ukrainian transliteration solves the immediate problem - if any other situations arise in future where a custom function is necessary, we can cross that bridge when we come to it. In the meantime, the workaround that @lb- has provided should ensure that this is not a blocker for anyone upgrading to 6.1, so I'll bump the milestone to 6.2. |
I'm generally fine with this solution and will use it in my deployment if the Wagtail team is comfortable creating and maintaining character maps for different languages. What I'm a bit concerned about is the lack of fine control for a developer to make customisations. I can imagine scenarios where you may want slightly different transliteration logic for Pages and Snippets (i.e. blog posts and authors), as well as other urlify customisations like removing stop words. These are a bit hypothetical though, and can be addressed using the workaround @lb- has provided above. |
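For what it's worth, the kind of per-model customisation mentioned above could be as small as the sketch below. The names `slugWithoutStopWords` and `SLUG_RULES`, the `page`/`snippet` keys and the stop-word list are all hypothetical; nothing like this exists in Wagtail as far as this thread shows.

```javascript
// Hypothetical helper: drop stop words, then join the rest with hyphens.
function slugWithoutStopWords(text, stopWords) {
  const blocked = new Set(stopWords.map((word) => word.toLowerCase()));
  return text
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word !== '' && !blocked.has(word))
    .join('-');
}

// Hypothetical per-model rules: pages drop stop words, snippets keep them.
const SLUG_RULES = {
  page: (title) => slugWithoutStopWords(title, ['a', 'an', 'the', 'of']),
  snippet: (title) => slugWithoutStopWords(title, []),
};

console.log(SLUG_RULES.page('The History of Kyiv')); // 'history-kyiv'
console.log(SLUG_RULES.snippet('The History of Kyiv')); // 'the-history-of-kyiv'
```

Rules like these could be plugged into whichever hook ends up being supported, after any transliteration step has run.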
I've just tried the new 6.1 and, as far as I can see, window.URLify is deprecated and is not used by SlugInput internally.

The problem is that the current SlugInput urlify function, although universal, is not entirely correct for every language. For instance, for Ukrainian it produces incorrect results according to transliteration rules:

Prior to 6.1 I was solving this using a tailored URLify function injected into the admin. Now the logic lives inside a Stimulus controller on the client JS side, which is not straightforward to customize.

It would be great to have documentation describing best practices on how this could be achieved.