Ensure new_url is absolute in Redirects #391

JakubMastalerz
Contributor

resolves #384

This PR changes new_url handling in RedirectType to ensure that the returned value is always an absolute URL.

In resolve_new_url, instead of relying entirely on the link property of the Redirect model, a check now ensures that the returned URL is absolute when the redirect does not point to a specific Page object. A relative URL (one without a scheme) is turned into an absolute URL by prefixing it with the site's root_url.
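
As a rough, standalone illustration of the check described above (not the PR's actual code), the sketch below treats a link without a scheme as relative and joins it onto a site root. The helper name `ensure_absolute` and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def ensure_absolute(root_url: str, link: str) -> str:
    """Return `link` unchanged if it has a scheme, else join it onto `root_url`."""
    if urlparse(link).scheme:
        # A scheme (e.g. "https") is taken to mean the link is already absolute.
        return link
    # Normalise slashes so exactly one separates the root and the path.
    return root_url.rstrip("/") + "/" + link.lstrip("/")


# Hypothetical usage with made-up URLs.
relative_result = ensure_absolute("https://example.com/", "/new/")
absolute_result = ensure_absolute("https://example.com", "https://other.org/x")
```

Note that a scheme-only check also treats protocol-relative links such as `//other.org/x` as relative, so a real implementation might want to inspect the network location as well.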

grapple/types/redirects.py
Comment on lines +42 to +58
if self.redirect_page:
    return self.link  # Handled by the `Redirect` model

elif self.redirect_link:
    parsed_url = urlparse(self.redirect_link)

    if not parsed_url.scheme:  # url without scheme is not absolute
        return (
            self.site.root_url.rstrip("/")
            + "/"
            + self.redirect_link.lstrip("/")
        )
    else:
        return self.redirect_link

    return self.link
else:
    return None
Contributor

I don't think this change will make newUrl always absolute. When redirect_page is set, Redirect.link uses Page.url, which returns a relative URL on single-site instances and an absolute URL on multi-site instances. Your test passes because there are multiple sites. If we want to always return an absolute URL, we should use Page.full_url.

Given the behaviour of Page.url (which seems like it shouldn't be problematic), I'm scratching my head a little as to why we're doing this. Unfortunately, our ticket and my notes are a little vague about the motivation. Is it worth making a change just to make the format of the returned URL consistent? Any thoughts, @zerolab?
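
To make the distinction concrete, here is a simplified model of the behaviour described in this comment. It is not Wagtail's implementation: `FakePage`, its `num_sites` field, and the example URLs are stand-ins invented for illustration, and only the `url` and `full_url` property names mirror Wagtail's `Page`.

```python
from dataclasses import dataclass


@dataclass
class FakePage:
    """Hypothetical stand-in for a Wagtail Page; not the real model."""

    root_url: str   # root URL of the site the page belongs to
    path: str       # path of the page within that site
    num_sites: int  # how many sites are configured in total

    @property
    def url(self) -> str:
        # Modelled on the behaviour described in the review: relative when
        # only one site exists, absolute otherwise.
        if self.num_sites == 1:
            return self.path
        return self.root_url + self.path

    @property
    def full_url(self) -> str:
        # Always includes the site root, regardless of the number of sites.
        return self.root_url + self.path


single = FakePage("https://example.com", "/about/", num_sites=1)
multi = FakePage("https://example.com", "/about/", num_sites=2)
```

In this model `single.url` is relative while `multi.url` and both `full_url` values are absolute, which is why a test that sets up several sites would not catch the single-site case.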

Member

Just got time to look at this properly.

Redirect.redirect_link is a URLField and will always have a URL scheme and netloc (or at least it should), so this change doesn't improve the behaviour whatsoever.

Looking at #384, the thought process was that newUrl should always be a full URL to reduce ambiguity in a multi-site setup. As far as I can unpick the results from #384, the relative URL comes from the fact that the redirect applies to all sites.

Now, if we change to returning full URLs, we should document it properly in the changelog and release notes. I am not opposed to always returning the full URL, but I definitely want proper test coverage for redirects that apply to all sites.
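
A minimal sketch of what such coverage might look like, using stand-in objects rather than real Wagtail models. It assumes that a redirect applying to all sites has no associated site (`site` is `None`), and the fallback of returning the link unchanged in that case is purely an assumption for illustration. `resolve_new_url` here is a hypothetical free function, not the resolver in grapple.

```python
import unittest
from types import SimpleNamespace
from urllib.parse import urlparse


def resolve_new_url(redirect):
    """Hypothetical resolver operating on a stand-in redirect object."""
    link = redirect.redirect_link
    if not link:
        return None
    if urlparse(link).scheme or redirect.site is None:
        # Already absolute, or no site to resolve against (assumed fallback).
        return link
    return redirect.site.root_url.rstrip("/") + "/" + link.lstrip("/")


class AllSitesRedirectTests(unittest.TestCase):
    def test_relative_link_without_site_is_returned_unchanged(self):
        redirect = SimpleNamespace(redirect_link="/new/", site=None)
        self.assertEqual(resolve_new_url(redirect), "/new/")

    def test_relative_link_with_site_becomes_absolute(self):
        site = SimpleNamespace(root_url="https://example.com/")
        redirect = SimpleNamespace(redirect_link="/new/", site=site)
        self.assertEqual(resolve_new_url(redirect), "https://example.com/new/")
```

The first test pins down the all-sites case explicitly, so whichever behaviour is eventually chosen for it is a documented decision rather than an accident.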

Member

Saying that, I do wonder if we're adding a maintenance burden, as we'll need to replicate https://github.com/wagtail/wagtail/blob/a09bba67cd58f519f3ae5bff32575e7ce9244031/wagtail/contrib/redirects/models.py#L68-L81

Contributor

jams2 commented on May 10, 2024

From what I can tell, newUrl will only be relative in single-site setups, as we call into Page.url, which returns a relative URL for a single site and an absolute URL if there are multiple sites. I'm having a hard time coming up with ways that this would be problematic. I think the original issue is erroneous.

Successfully merging this pull request may close these issues.

Redirects don't always return absolute URL