Extend application routing

joomla · Nov 14, 2024 · a8a1e84 · a8a1e84
1 parent a5ab3c0
commit a8a1e84
Showing 1 changed file with 26 additions and 10 deletions.
diff --git a/docs/general-concepts/routing/index.md b/docs/general-concepts/routing/index.md
@@ -4,24 +4,40 @@ title: Routing
 
 Routing
 =======
+Routing is the process of mapping a URL to the right code in a system (and the other way around). Without SEF URLs (see below), Joomla simply reads the query elements of a URL and calls code based on them. Routing is split into two parts, application routing and component routing, where the latter happens inside the former.
 
-## Introduction
+## Application routing
+All routing starts with the `Joomla\CMS\Router\Router` class, which provides a `parse()` and a `build()` method. An application can provide specific behavior by extending that class and attaching callbacks via `Router::attachBuildRule()` and `Router::attachParseRule()`. You can see an example in the CMS in the constructor of the `Joomla\CMS\Router\SiteRouter` class. You can also attach additional rules from other code, for example from plugins. Just make sure that you attach them before Joomla runs the parsing step. This can be done in the `onAfterInitialise` system event, as the CMS does in the languagefilter system plugin.
 
+The callbacks implement the actual behavior of the router. They receive the router object as first parameter and a `Joomla\CMS\Uri\Uri` object as second parameter. All parsing and building is done on the `Uri` object, which contains the URL currently being processed. When parsing or building a URL, you are expected to read a part of the object, transform it, store the result back in the object and delete what you have processed. This allows the router to build the right URL in the end and to discover when a URL could not be parsed properly: when parsing, the router is expected to convert the whole path part of the URL into query elements. If the path part of the `Uri` object still contains characters at the end of the parsing, the router assumes that no router code could understand that part of the URL and throws a 404 error. Likewise, when building a SEF URL from a query string, we want to remove what we have processed, so that the URL doesn't contain unnecessary query parameters.
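+
+As a rough sketch of the read, transform, store and delete cycle described above, a parse rule could look like the following. The `preview` path segment, the `preview` query variable and the function name `attachPreviewParseRule()` are purely hypothetical and only serve as illustration. How the router instance is obtained (for example in an `onAfterInitialise` handler) is left to the caller, and the exact shape of the path at this stage is an assumption.
+
+```php
+<?php
+
+use Joomla\CMS\Router\Router;
+use Joomla\CMS\Uri\Uri;
+
+/**
+ * Hypothetical helper: attaches a parse rule that consumes a leading
+ * "preview" path segment and turns it into a query variable.
+ */
+function attachPreviewParseRule(Router $router): void
+{
+    $router->attachParseRule(
+        static function (Router &$router, Uri &$uri): void {
+            // Read: split the remaining path into segments.
+            $segments = explode('/', trim($uri->getPath(), '/'));
+
+            if ($segments[0] !== 'preview') {
+                // Not our segment, leave the Uri object untouched.
+                return;
+            }
+
+            // Transform and store: the segment becomes a query element.
+            $uri->setVar('preview', 1);
+
+            // Delete: remove the processed segment from the path.
+            array_shift($segments);
+            $uri->setPath(implode('/', $segments));
+        },
+        Router::PROCESS_BEFORE
+    );
+}
+```
+
+Because the rule removes the segment it has handled, later rules only see the part of the path that is still unprocessed.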
 
-In the context of a web application, routing involves analysing an incoming URL to determine which application constituent parts need to be involved to handle the request. Sometimes this is fairly straightforward, but in a Joomla site the support of Search Engine Friendly (SEF) URLs make this job particularly challenging. It is the responsibility of the Joomla Site Router to convert between SEF URLs and the internal URL format (which identifies those constituent parts).
+## Required query parameters and default values
+Every URL in Joomla contains at least an `&option=` element, which defines the component to be called for that page. If `&option=` is not set when parsing a URL, it defaults to the component of the home menu item in the frontend or to `com_cpanel` in the backend. Likewise, every URL in the frontend contains an `&Itemid=` element, which defines the menu item to be used for that page. It falls back to the default menu item if it is not set.
 
-When people access web pages on your site you want to have URLs which naturally reflect the content that is being shown, like `mysite.org/articles/latest-news`.
+While components can do whatever they want, if you are following the best practice of MVC components in Joomla, every URL also contains a `&task=` element, which defines the controller to be used and the task of that controller, separated by a `.`. An example would be `&task=article.save`, which will call the `Article` controller and run the `save()` method in it. If no `&task=` is given this defaults to the default controller and the task `display`. In that case it then also reads the `&view=` element of the URL, pointing to the respective view of the component.
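+
+The following is a minimal sketch of how a `&task=` value splits into controller and method. It is a simplified model and not the code Joomla uses. The function name `resolveTask()` is made up, `null` stands for the default controller, and treating a value without a `.` as a task of the default controller is an assumption.
+
+```php
+<?php
+
+declare(strict_types=1);
+
+/**
+ * Simplified model: split a "&task=" value into controller and method.
+ *
+ * @return array{controller: ?string, method: string}
+ */
+function resolveTask(?string $task): array
+{
+    // No task given: default controller and the task "display".
+    if ($task === null || $task === '') {
+        return ['controller' => null, 'method' => 'display'];
+    }
+
+    // Assumption: a value without "." addresses the default controller.
+    if (strpos($task, '.') === false) {
+        return ['controller' => null, 'method' => $task];
+    }
+
+    [$controller, $method] = explode('.', $task, 2);
+
+    return ['controller' => ucfirst($controller), 'method' => $method];
+}
+
+$resolved = resolveTask('article.save');
+// $resolved is ['controller' => 'Article', 'method' => 'save']
+```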
 
-Internally Joomla uses a URI format which indicates what resources are involved, like `mysite.org?option=com_content&view=article&id=5`.
+## Search engine friendly (SEF) URLs
+Internally, Joomla is controlled by the query parameters of a URL, or in other terms by an array of key-value pairs of config values. However, these are not very human-friendly. To achieve "speaking" URLs, the site router in Joomla has a `SEF` mode, in which these query elements are converted into speaking URLs. (Upon parsing, these URLs are of course converted back into query elements, that is, into this array of config values.)
 
-It's the job of the Joomla router to convert between these 2 formats.
+This starts with discovering the right menu item. Joomla splits the path part of the URL into segments delimited by `/`, starting from the base URL of the site, and tries to match these segments with the aliases of the menu items. If it finds a match, it compares the next segment with the aliases of that menu item's child items and repeats the process until it can't find a matching menu item anymore. The last menu item found is the active menu item for this request and defines the component to be called. The router sets the `&Itemid=` parameter in the `Uri` object accordingly.
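+
+The segment matching described above can be modeled roughly as follows. This is a simplified sketch and not Joomla's implementation: the function `matchMenuItem()`, the flat `$menuItems` array with `id`, `parent` and `alias` keys, and the menu item `43` are all assumptions made for the example.
+
+```php
+<?php
+
+declare(strict_types=1);
+
+/**
+ * Simplified model: walk the path segments down the menu tree.
+ *
+ * @param array<int, array{id: int, parent: int, alias: string}> $menuItems
+ *
+ * @return array{itemId: ?int, rest: array<int, string>}
+ */
+function matchMenuItem(string $path, array $menuItems): array
+{
+    $segments = array_values(array_filter(
+        explode('/', $path),
+        static fn (string $segment): bool => $segment !== ''
+    ));
+    $parent = 0;    // 0 stands for the menu root in this model
+    $itemId = null;
+
+    while ($segments !== []) {
+        $found = null;
+
+        // Look for a child of the current item whose alias matches.
+        foreach ($menuItems as $item) {
+            if ($item['parent'] === $parent && $item['alias'] === $segments[0]) {
+                $found = $item;
+                break;
+            }
+        }
+
+        if ($found === null) {
+            break;
+        }
+
+        // Descend one level and drop the segment we have consumed.
+        $itemId = $found['id'];
+        $parent = $found['id'];
+        array_shift($segments);
+    }
+
+    return ['itemId' => $itemId, 'rest' => $segments];
+}
+
+$menuItems = [
+    ['id' => 42, 'parent' => 0, 'alias' => 'lotr-wiki'],
+    ['id' => 43, 'parent' => 42, 'alias' => 'maps'],
+];
+$match = matchMenuItem('/lotr-wiki/cities-in-middleearth/minas-tirith', $menuItems);
+// $match['itemId'] is 42, $match['rest'] is ['cities-in-middleearth', 'minas-tirith']
+```
+
+The segments left in `rest` are what the component router would receive in the next step.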
 
-The router `parse` functionality converts external URLs (SEF) to internal URIs
+After this, the application instantiates the router of the component to be called and gives it the rest of the path of the URL and expects an array of key-value-pairs back.
 
-The router `build` functionality converts internal URIs to external SEF URLs.
+The other way around, when building a URL, the router will ask the component router to preprocess the URL (where the component router can for example set the correct Itemid) and then to build the path of the component, which is then prepended with the path of the menu item.
 
-This documentation section focuses on the Joomla Site Router, explaining how it works and how you can use it in your own applications.
+## Example of a SEF URL
+We are trying to build/transform the following URL into a SEF URL: `index.php?option=com_content&view=article&id=23:minas-tirith&catid=66&Itemid=42`
+The router will first look up the menu item `42`, which results in the path `/lotr-wiki/`. The component router then looks up the path of the category `66`, which gives `/cities-in-middleearth`, and finally adds the alias of the article `23` at the end. The router might also add a suffix or do additional transformations, but your final URL might look like this: `/lotr-wiki/cities-in-middleearth/minas-tirith`
 
-You need to have a good understanding of how the Site Router works if you are developing a Joomla component for which you want to use SEF URLs, or if you want to modify the router rules for Joomla core components. 
+| Example                                                                                                                                                                                                           |
+|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| index.php?<span style="color:blue">option=com_content</span>&view=article&<span style="color:green">id=23:minas-tirith</span>&<span style="color:red">catid=66</span>&<span style="color:yellow">Itemid=42</span> |
+| /<span style="color:yellow">lotr-wiki</span>/<span style="color:red">cities-in-middleearth</span>/<span style="color:green">minas-tirith</span>                                                                   |
+| <span style="color:yellow">Menu-part</span>, <span style="color:red">Category-part (component)</span>, <span style="color:green">Article alias (component)</span>                                                 |
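+
+The build direction of this example can be sketched as follows. This is a simplified model under stated assumptions: the function `buildSefUrl()` and the lookup arrays `$menuPaths` and `$categoryAliases` are hypothetical stand-ins for what the menu and the component router would provide. Note how every query element that ends up in the path is removed from the query.
+
+```php
+<?php
+
+declare(strict_types=1);
+
+/**
+ * Simplified model: build a SEF path and drop the processed query elements.
+ *
+ * @param array<string, string> $query
+ * @param array<int, string>    $menuPaths       hypothetical map Itemid => menu path
+ * @param array<int, string>    $categoryAliases hypothetical map catid => alias
+ *
+ * @return array{path: string, query: array<string, string>}
+ */
+function buildSefUrl(array $query, array $menuPaths, array $categoryAliases): array
+{
+    $segments = [];
+
+    // Menu part: the menu item defines the leading path and implies the component.
+    $segments[] = $menuPaths[(int) $query['Itemid']];
+    unset($query['Itemid'], $query['option']);
+
+    // Component part: the category first ...
+    $segments[] = $categoryAliases[(int) $query['catid']];
+    unset($query['catid']);
+
+    // ... then the article alias taken from "23:minas-tirith".
+    [, $alias]  = explode(':', $query['id'], 2);
+    $segments[] = $alias;
+    unset($query['id'], $query['view']);
+
+    return ['path' => '/' . implode('/', $segments), 'query' => $query];
+}
+
+parse_str('option=com_content&view=article&id=23:minas-tirith&catid=66&Itemid=42', $query);
+$built = buildSefUrl($query, [42 => 'lotr-wiki'], [66 => 'cities-in-middleearth']);
+// $built['path'] is '/lotr-wiki/cities-in-middleearth/minas-tirith', $built['query'] is []
+```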
 
-You may find it helpful to view 2 videos, covering [parsing a URL](https://youtu.be/zqyjRuVaT8M) and [building a URL](https://youtu.be/d1WoRbXcvkw).
+## Parsing URLs and error handling in the frontend
+An important part of parsing a URL is deciding when a URL is actually not correct. The Joomla router does not know which code might recognize a part of a URL and thus cannot know whether a part has been parsed properly. It therefore requires all code that recognizes and parses a part of a SEF URL (that is, of the path part of the URL) to remove what it was able to identify. As a result, once all parsing code has run, the `Uri` object should only contain the array of key-value pairs in the query part of the URL and the path should be completely empty. If the path is NOT empty, the URL contains parts which were not recognized and the URL should not be accepted. Since Joomla 4.0 the router throws an exception with a 404 in that situation. If the router accepted URLs with unrecognized parts, multiple URLs would point to the same content.
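+
+The "empty path or 404" contract can be illustrated with a small standalone model. It is a sketch and not the router's actual code: `parsePath()`, the rule signature and the `$menuRule` closure are assumptions made for this example.
+
+```php
+<?php
+
+declare(strict_types=1);
+
+/**
+ * Simplified model: run all rules, then reject any unparsed remainder.
+ *
+ * @param array<int, callable> $rules each rule takes (path, vars) and returns [path, vars]
+ *
+ * @return array<string, mixed>
+ */
+function parsePath(string $path, array $rules): array
+{
+    $vars = [];
+
+    foreach ($rules as $rule) {
+        [$path, $vars] = $rule($path, $vars);
+    }
+
+    // Anything left in the path was not recognized by any rule.
+    if (trim($path, '/') !== '') {
+        throw new RuntimeException('Page not found', 404);
+    }
+
+    return $vars;
+}
+
+// Hypothetical rule: consumes a leading "/lotr-wiki" and sets the Itemid.
+$menuRule = static function (string $path, array $vars): array {
+    $prefix = '/lotr-wiki';
+
+    if (strncmp($path, $prefix, strlen($prefix)) === 0) {
+        $vars['Itemid'] = 42;
+        $path           = substr($path, strlen($prefix));
+    }
+
+    return [$path, $vars];
+};
+
+$vars = parsePath('/lotr-wiki', [$menuRule]);
+
+try {
+    parsePath('/lotr-wiki/unknown', [$menuRule]);
+    $code = null;
+} catch (RuntimeException $e) {
+    // "/unknown" was not consumed by any rule.
+    $code = $e->getCode();
+}
+```
+
+In this model the first call returns `['Itemid' => 42]`, while the second call ends in the exception with code 404, because the segment `unknown` is still in the path after all rules have run.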
+
+Since Joomla 5.3 the router also supports "softer" error handling. There are situations where the router can parse a URL correctly, but recognizes that the URL does not match the expected format. This is the case when URLs should carry a suffix (`.html`) and the requested URL lacks it, or when the URL still contains the ID of the article as part of the segment (`/23-minas-tirith` from the above example) but the alias is wrong (for example `/23-minas-morgul`). In both cases the router can still parse the URL, because it defaults to the suffix `.html` and loads the article with the ID 23, but the URL is not actually correct. The router can then set the `tainted` flag for the current parse process by calling `\Joomla\CMS\Router\Router::setTainted()`. Code that runs afterwards can check the flag by calling `\Joomla\CMS\Router\Router::isTainted()` and decide what to do. In a default installation the SEF system plugin checks the flag and, if it is set, tries to build a URL from the recognized parameters and redirects to that new URL with a 301 redirect.
+
+The redirect is specifically NOT done directly in the router, because otherwise you could not parse another URL while running Joomla. The flag also allows for more than one fix at a time. If for example the correct URL is `https://domain.com/lotr-wiki/cities-in-middleearth/minas-tirith.html` and the requested URL is `http://www.domain.com/lotr-wiki/cities-in-middleearth/23-minas-morgul`, we don't want one redirect to `https`, another one directly afterwards to the domain without `www`, yet another one to the article segment without ID and with the right alias, and a last one to the URL with the suffix at the end. Instead, the request is marked as tainted and a single redirect is done at the end.
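+
+The idea of "parse anyway, remember that something was off, fix everything with one redirect" can be modeled as follows. This is a standalone sketch and does not use the Joomla API: `parseArticleSegment()` and the `$aliases` map are hypothetical, and the returned `tainted` value merely stands in for the flag that the real router keeps via `setTainted()` and `isTainted()`.
+
+```php
+<?php
+
+declare(strict_types=1);
+
+/**
+ * Simplified model: parse one article segment and note deviations
+ * from the canonical URL instead of redirecting right away.
+ *
+ * @param array<int, string> $aliases hypothetical map article ID => alias
+ *
+ * @return array{id: int, tainted: bool}
+ */
+function parseArticleSegment(string $segment, array $aliases): array
+{
+    $tainted = false;
+
+    // A missing ".html" suffix is tolerated, but noted.
+    if (substr($segment, -5) === '.html') {
+        $segment = substr($segment, 0, -5);
+    } else {
+        $tainted = true;
+    }
+
+    // A legacy "ID-alias" segment still identifies the article, but is not canonical.
+    if (preg_match('/^(\d+)-/', $segment, $matches) === 1) {
+        $id      = (int) $matches[1];
+        $tainted = true;
+    } else {
+        $id = array_search($segment, $aliases, true);
+    }
+
+    if (!is_int($id) || !isset($aliases[$id])) {
+        throw new RuntimeException('Page not found', 404);
+    }
+
+    return ['id' => $id, 'tainted' => $tainted];
+}
+
+$aliases = [23 => 'minas-tirith'];
+$result  = parseArticleSegment('23-minas-morgul', $aliases);
+
+// A single redirect fixes the ID prefix, the wrong alias and the missing suffix at once.
+$redirect = $result['tainted']
+    ? '/lotr-wiki/cities-in-middleearth/' . $aliases[$result['id']] . '.html'
+    : null;
+```
+
+In this model `$redirect` ends up as `/lotr-wiki/cities-in-middleearth/minas-tirith.html`, while a request for `minas-tirith.html` would not be tainted and would need no redirect.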
+
+You may find it helpful to view 2 videos, covering [parsing a URL](https://youtu.be/zqyjRuVaT8M) and [building a URL](https://youtu.be/d1WoRbXcvkw).