Improve support for MediaWiki wikis with non-standard URLs #10

Open
stjohann opened this issue Dec 7, 2019 · 4 comments

Comments

@stjohann
Owner

stjohann commented Dec 7, 2019

Since the addition of support for different servers, the bot uses the /wiki/$1 pattern to detect whether a URL belongs to a wiki. Judging by the interwiki map on Meta-Wiki, this is not enough to determine whether something is a MediaWiki wiki: wikis can also use a simple /$1 or something more convoluted like /index.php?title=$1 or /index.php/$1 and still be valid.
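To illustrate the patterns listed above, here is a minimal Python sketch (not the bot's actual code). The `example.org` URLs and the function names `build_page_url` and `looks_like_wiki_naive` are made up for illustration; only the four path patterns come from this issue.

```python
from urllib.parse import quote

# Article path patterns mentioned above; "$1" stands for the page title.
# The host name is a placeholder, not a real wiki.
PATTERNS = [
    "https://example.org/wiki/$1",
    "https://example.org/$1",
    "https://example.org/index.php?title=$1",
    "https://example.org/index.php/$1",
]


def build_page_url(pattern: str, title: str) -> str:
    """Substitute a page title into an article path pattern."""
    if "$1" not in pattern:
        raise ValueError(f"pattern has no $1 placeholder: {pattern!r}")
    # MediaWiki page URLs use underscores instead of spaces.
    encoded = quote(title.replace(" ", "_"), safe=":/")
    return pattern.replace("$1", encoded)


def looks_like_wiki_naive(pattern: str) -> bool:
    """The check described above: only /wiki/$1 is accepted as a wiki."""
    return pattern.endswith("/wiki/$1")


if __name__ == "__main__":
    for pattern in PATTERNS:
        print(looks_like_wiki_naive(pattern), build_page_url(pattern, "Main Page"))
```

All four patterns produce a usable page link, but the naive check accepts only the first one, which is the gap this issue is about.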

This poses two problems for the current code:

  1. Easier one: support more wiki URL patterns in the linking bot. This can be done by checking for more URL patterns and querying the APIs of those wikis for their interwiki chains. I should come up with a good way to store (and remember) known wiki URLs somewhere, because it would be silly to ask, say, Google for /api.php a hundred times.
  2. Harder one: update the current code to validate wiki URLs by /api.php at the end of the string rather than by /wiki/$1. That way, the bot will query the API, then get and remember the article path from there. I haven't received any requests about this problem so far, but it would be a good thing to do. All the old values with /wiki/$1 will need to be deprecated and updated in the configs.
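The "remember" part of both points could be sketched roughly as follows, in Python rather than the bot's own language. `ArticlePathCache` and the `fetch_article_path` callable are hypothetical names; the callable stands in for whatever code actually queries a wiki's API, so the sketch itself makes no network requests.

```python
from typing import Callable, Dict


class ArticlePathCache:
    """Remember the article path reported for each api.php URL."""

    def __init__(self, fetch_article_path: Callable[[str], str]) -> None:
        # Hypothetical callable: takes an api.php URL, asks that API,
        # and returns an article path such as "/wiki/$1".
        self._fetch = fetch_article_path
        self._known: Dict[str, str] = {}

    @staticmethod
    def is_api_url(url: str) -> bool:
        """Accept only HTTP(S) URLs that end with /api.php."""
        return url.startswith(("http://", "https://")) and url.endswith("/api.php")

    def get(self, api_url: str) -> str:
        """Return the article path, querying the API at most once per URL."""
        if not self.is_api_url(api_url):
            raise ValueError(f"not an api.php URL: {api_url!r}")
        if api_url not in self._known:
            self._known[api_url] = self._fetch(api_url)
        return self._known[api_url]
```

Rejecting anything that does not end with /api.php before fetching means a non-wiki site is never queried at all, and a known wiki is queried only once. Whether failed lookups should also be remembered is left open here.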

Removing the deprecated old URLs will introduce a new major version (v.N.0.0) of the bot.

@stjohann stjohann pinned this issue Dec 7, 2019
@jhsoby

jhsoby commented Mar 4, 2021

Hi, I just discovered the existence of this bot! I'm the one who made @wikilinksbot on Telegram. You may get some inspiration from how I solved this very problem in this commit (see lines 487–521).

@stjohann
Owner Author

stjohann commented Mar 8, 2021

Hey, nice work! Glad to learn of the Telegram bot, and I will no doubt look into its code in the future (definitely with regard to magic words etc.).

Your approach is interesting, but I will probably try to find something less complicated. People in mwclient/mwclient#34 suggest parsing the HTML of modern wikis for <link rel="EditURI" type="application/rsd+xml" href="//www.mediawiki.org/w/api.php?action=rsd" />, for instance, which seems a bit better if you have to choose between two relatively hacky things.
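A rough Python sketch of that EditURI approach, using only the standard library. `find_api_url` and `_EditUriFinder` are illustrative names, and the sketch assumes the page HTML has already been downloaded by some other code.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlsplit, urlunsplit


class _EditUriFinder(HTMLParser):
    """Collect the href of the first <link rel="EditURI"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("rel") or "").lower() == "edituri":
            self.href = attributes.get("href")


def find_api_url(page_url: str, html: str) -> Optional[str]:
    """Return the api.php URL advertised by a wiki page, or None."""
    finder = _EditUriFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    # The href may be protocol-relative ("//host/w/api.php?action=rsd"),
    # so resolve it against the URL the page was fetched from.
    absolute = urljoin(page_url, finder.href)
    parts = urlsplit(absolute)
    # Drop the "?action=rsd" query string to get the bare API endpoint.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

For the tag quoted above, fetched from a page on https://www.mediawiki.org, this yields https://www.mediawiki.org/w/api.php. Pages without such a tag return None, so the caller still needs a fallback.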

For the moderator configuring the bot, asking for the API path (maybe even linking to Special:Version to show how to find it) is better in my case, since the default siteinfo request (which the bot needs anyway) would already contain the article path.
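As a sketch of how the article path falls out of a siteinfo response, assuming the standard `action=query&meta=siteinfo&siprop=general` request with JSON output: the helper names below are illustrative, and the HTTP request itself is left to the caller.

```python
from typing import Any, Dict
from urllib.parse import urlencode, urljoin


def siteinfo_url(api_url: str) -> str:
    """Build the general siteinfo request for a given api.php URL."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general",
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


def article_url_pattern(api_url: str, siteinfo: Dict[str, Any]) -> str:
    """Combine "server" and "articlepath" into a full pattern with $1."""
    general = siteinfo["query"]["general"]
    # "server" can be protocol-relative ("//example.org"); resolving it
    # against the API URL fills in the scheme.
    server = urljoin(api_url, general["server"])
    return urljoin(server, general["articlepath"])
```

The moderator then only has to supply the api.php URL; the link pattern is derived from the wiki's own answer instead of being guessed.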

@jhsoby

jhsoby commented Mar 9, 2021

Ah, that's an even better solution than what I used, agreed! I think I will implement this too.

jhsoby added a commit to jhsoby/telegram-wikilinksbot that referenced this issue Mar 20, 2021
Improved the way the API URLs are found when setting the wiki URL.
Based on suggestion in stjohann/DiscordWikiBot#10.
@stjohann
Owner Author

b77c40a adds the foundation for the changes required: URLs ending with /api.php can now be set in config.json and are treated as valid URLs. This change was made spontaneously after two separate requests in Discord, so ideally I should revisit this and see how to proceed further, as well as test all the potentially problematic input more thoroughly. (This comment is mostly to document that this is now possible.)
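For illustration only, accepting both kinds of values during the transition might look like the Python sketch below. This is not the code from b77c40a; the function name and the returned labels are hypothetical, and the real config.json schema is not shown in this thread.

```python
def classify_wiki_setting(value: str) -> str:
    """Label a configured wiki URL as "api", "legacy" or "invalid"."""
    if not value.startswith(("http://", "https://")):
        return "invalid"
    if value.endswith("/api.php"):
        # New style: the article path can be looked up through the API.
        return "api"
    if value.endswith("/wiki/$1"):
        # Old style: still accepted for now, to be deprecated later.
        return "legacy"
    return "invalid"
```

Keeping a separate "legacy" label would make it possible to warn about old values before they are removed in a major version.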
