option to decode percent-encoded URLs #377
Comments
Trurl tries to ensure that whenever it outputs a whole URL, that URL is valid. The percent-encoded characters in your URL are not valid when decoded. An example of trurl decoding a URL can be seen at the top of the man page, in the normalization section:
If we try appending a valid percent-encoded value to your URL, for example
where you can see it's showing the decoded A good solution for you may be to utilize the
Thanks for the explanation. In my case I want the invalid characters rather than their As mentioned in the initial post, I know about the JSON option, So maybe what I want is a
If they were shown "decoded", the output would no longer be a URL, since it would contain illegal characters and could not be parsed properly. For example, %20 is a space and %0a is a newline, and other encoded characters are separators that cannot be encoded back correctly, such as %2f (slash), %40 (@) or %3a (colon) when used in the "wrong" place. You can probably get (almost?) what you want with
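To illustrate the point about separators, here is a minimal sketch in Python rather than trurl. It uses the standard library's `urllib.parse`, and the example URL is hypothetical. It parses a URL once as given and once after decoding the whole string, to show that the two no longer describe the same path and query.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# Hypothetical URL: %2F, %26 and %3D hide a slash, an ampersand
# and an equals sign inside the path and the query value.
encoded = "https://example.com/a%2Fb?x=1%26y%3D2"


def summarize(url):
    """Return the raw path and the decoded query parameters of a URL."""
    parts = urlsplit(url)
    return parts.path, parse_qs(parts.query)


# Parsed as given: one path segment and a single parameter "x".
before = summarize(encoded)

# Decoded as a whole string first, then parsed: the hidden separators
# become real ones, so the structure changes.
after = summarize(unquote(encoded))

print(before)
print(after)
```

Once the whole string is decoded, there is no way to tell which slashes and ampersands were originally encoded, so the original URL cannot be rebuilt from it.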
Yes, here I don't want a valid URL, I want data from it, just like when I pass
If you want to get data from it, you already can: as shown above. You just can't make it pretend it is still a URL once it has been URL decoded. Since all the fields are accessible, there is no data you can't get this way. The only thing you don't get is the exact command syntax you ask for.
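The per-field approach described here can be sketched outside trurl as well. The following Python example is an illustration only, not trurl's implementation: `decode_components` is a hypothetical helper and the URL is made up. It splits the URL first and decodes each component on its own, so the result is data about the URL rather than something that claims to be a URL.

```python
from urllib.parse import unquote, unquote_to_bytes, urlsplit


def decode_components(url):
    """Split first, then percent-decode each component separately."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        # Text view: bytes that are not valid UTF-8 are replaced
        # with U+FFFD by unquote's default error handling.
        "path": unquote(parts.path),
        # Byte view: keeps the junk bytes exactly as they were encoded.
        "path_bytes": unquote_to_bytes(parts.path),
        "query": unquote(parts.query),
        "fragment": unquote(parts.fragment),
    }


# Hypothetical URL mixing invalid UTF-8 (%ff%fe) with valid sequences.
fields = decode_components("http://example.com/%ff%fe/a%2Fb?q=%E2%9C%93")

for name, value in fields.items():
    print(f"{name}: {value!r}")
```

Keeping a byte view next to the text view is a design choice for the "percent-encoded junk" case, where the decoded bytes may not be valid text at all.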
In my monitoring of ArchiveBot I often have to deal with URLs that consist of percent-encoded junk.
I would like to be able to decode the junk with trurl, since it is a convenient tool for wrangling URLs on the command-line.
There doesn't appear to be a way to get trurl to decode the full URL as percent-encoded data. It only seems to do that when extracting query parameters, or for the JSON output.
Here is an example that I had to deal with recently:
I propose that the solution to this situation would be one of these two options:
Change the --get option to also URL decode the {url} component (as currently implied by the documentation: "The following component names are available (case sensitive): url, ... Components are shown URL decoded by default."), and require --urlencode to get the non-decoded version.
Update the --get documentation to mention that the {url} component is not URL decoded, and then add a --urldecode option to get the URL decoded version of it. This could also be used without the --get option.