Generating a filtered full text RSS feed
Irfan Charania edited this page Jun 15, 2015 · 11 revisions
Problem: Your favourite website has a feed, but you only want to read certain types of posts, and the feed provides only summarized text.
Solution: Use Huginn to create an RSS feed that has filtered full-text content. The workflow to filter posts and fetch full text is as follows:
- RssAgent - to fetch and parse the existing RSS feed
- TriggerAgent - to filter the feed items
- WebsiteAgent - to fetch the full text for each feed item
- DataOutputAgent - to output the new RSS feed
Examples based on Adventures of Business Cat
Name: Example RSS In
{
  "expected_update_period_in_days": "14",
  "clean": "false",
  "url": "http://www.businesscat.happyjar.com/feed/"
}
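To illustrate what this first step hands to the next agent, here is a minimal Python sketch (not Huginn's implementation) that parses an RSS document into one dictionary per item. The sample feed, the function name `parse_feed`, and the choice of fields are assumptions made for illustration. The real agent fetches the configured `url` and emits a richer event.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for the document fetched from "url".
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Comic one</title>
      <link>http://example.com/comic/one/</link>
      <description>Short summary only</description>
    </item>
    <item>
      <title>Site news</title>
      <link>http://example.com/blog/news/</link>
      <description>Another summary</description>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return one event-like dict per <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    events = []
    for item in root.iter("item"):
        events.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return events


events = parse_feed(SAMPLE_FEED)
```

Each dictionary plays the role of one event, and the `url` field is what the later agents filter on and fetch.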
Name: Example filter
Event sources: Example RSS In
Propagate immediately: Yes
{
  "expected_receive_period_in_days": "14",
  "keep_event": "true",
  "rules": [
    {
      "type": "regex",
      "value": ".*\\/comic\\/.*",
      "path": "url"
    }
  ]
}
Note: "keep_event": "true" makes the TriggerAgent re-emit the matching event unchanged, so the original parsed item fields reach the next agent
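The effect of the rule can be sketched in Python as follows. This is an approximation under stated assumptions: it uses Python's `re` module rather than Huginn's Ruby regex engine, treats `path` as a plain top-level key, and the sample events and helper names (`matches`, `filter_events`) are hypothetical.

```python
import json
import re

# The rule exactly as written in the agent options. Once the JSON is decoded,
# the doubled backslashes become single ones: .*\/comic\/.*
RULE_JSON = r'{"type": "regex", "value": ".*\\/comic\\/.*", "path": "url"}'
rule = json.loads(RULE_JSON)


def matches(event, rule):
    """True when the field named by rule["path"] matches the rule's regex."""
    value = str(event.get(rule["path"], ""))
    return re.search(rule["value"], value) is not None


def filter_events(events, rule):
    """Pass matching events through untouched, mimicking keep_event = true."""
    return [event for event in events if matches(event, rule)]


# Hypothetical events, one comic post and one blog post.
events = [
    {"title": "Comic one", "url": "http://example.com/comic/one/"},
    {"title": "Site news", "url": "http://example.com/blog/news/"},
]
kept = filter_events(events, rule)
```

Only the event whose `url` contains `/comic/` survives, and it still carries its `title` and other fields for the agents downstream.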
Name: Example page fetch
Event sources: Example filter
Propagate immediately: Yes
{
  "expected_update_period_in_days": "14",
  "url": "{{url}}",
  "type": "html",
  "mode": "merge",
  "extract": {
    "imgurl": {
      "css": "#comic img",
      "value": "@src"
    }
  }
}
Note: "mode": "merge" merges the extracted values into the incoming event, so the original parsed item fields reach the next agent
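As a rough illustration of the extract-and-merge step, the Python sketch below pulls the `src` of an `<img>` nested inside the element with `id="comic"` and adds it to the incoming event. It is not how Huginn does it: the sample page, the class `ComicImageParser`, and the helpers are hypothetical, the selector is hand-coded rather than a general CSS engine, and only the first match is kept.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Hypothetical page standing in for the document fetched from {{url}}.
SAMPLE_PAGE = """<html><body>
<div id="header"><img src="/logo.png"></div>
<div id="comic"><a href="/next"><img src="http://example.com/comics/one.png" alt="one"></a></div>
</body></html>"""


class ComicImageParser(HTMLParser):
    """Collect the src of <img> tags nested inside the element with id="comic"."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside #comic; 0 means outside it
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth > 0:
            if tag == "img" and attrs.get("src"):
                self.sources.append(attrs["src"])
            if tag not in VOID_TAGS:
                self.depth += 1
        elif attrs.get("id") == "comic" and tag not in VOID_TAGS:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


def extract_imgurl(html_text):
    """Return the first matching image URL, or None when there is none."""
    parser = ComicImageParser()
    parser.feed(html_text)
    return parser.sources[0] if parser.sources else None


def merge_event(event, extracted):
    """Return a copy of the incoming event with the extracted fields added."""
    merged = dict(event)
    merged.update(extracted)
    return merged


incoming = {"title": "Comic one", "url": "http://example.com/comic/one/"}
merged = merge_event(incoming, {"imgurl": extract_imgurl(SAMPLE_PAGE)})
```

The merged event keeps `title` and `url` from the feed item and gains `imgurl`, which is why the output template can reference all three.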
Name: Example Rss out
Event sources: Example page fetch
Propagate immediately: Yes
{
  "secrets": [
    "examplerss"
  ],
  "expected_receive_period_in_days": "14",
  "template": {
    "title": "Business Cat full comic feed",
    "description": "This is a feed of recent Business Cat comics generated by Huginn",
    "item": {
      "title": "{{title}}",
      "description": "<img src=\"{{imgurl}}\" />",
      "link": "{{url}}",
      "pubDate": "{{pubDate}}"
    }
  }
}
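To show how the `item` template turns a merged event into a feed entry, here is a small Python sketch. It is only an approximation: the regex-based `interpolate` helper handles plain `{{name}}` placeholders and is not a Liquid implementation, and the sample event values and function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The "item" part of the template above, as a Python dict.
ITEM_TEMPLATE = {
    "title": "{{title}}",
    "description": "<img src=\"{{imgurl}}\" />",
    "link": "{{url}}",
    "pubDate": "{{pubDate}}",
}


def interpolate(template, event):
    """Replace each {{name}} with event[name]; missing names become ''."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(event.get(match.group(1), "")),
        template,
    )


def render_item(event):
    """Build one RSS <item> element from the template and return it as text."""
    item = ET.Element("item")
    for tag, template in ITEM_TEMPLATE.items():
        ET.SubElement(item, tag).text = interpolate(template, event)
    return ET.tostring(item, encoding="unicode")


# Hypothetical merged event, as it might arrive from the page-fetch agent.
event = {
    "title": "Comic one",
    "url": "http://example.com/comic/one/",
    "imgurl": "http://example.com/comics/one.png",
    "pubDate": "Mon, 15 Jun 2015 00:00:00 +0000",
}
item_xml = render_item(event)
```

Because the description is set as element text, the `<img>` markup is escaped in the XML, and a feed reader that unescapes it sees the full comic image instead of a summary.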