tl;dr: rss-proxy transforms almost any website into a feed, just with one click.
RSS is the glue code for an open and free internet. It helps you build your own network of sources. Unfortunately some websites don’t support it at all or not in a way to fill your need. This is where rss-proxy comes to play. It is a simple middleware that transforms HTML to a feed. The user will choose a suggested feed from a collection of feeds that exist in a website of interest.
I figured out that of any kind networks have a huge, huge value. Take states for example, they are basically just social networks of humans interacting within a certain context, oboying to the rules of the network. Same goes for news networks. When I learned that news and social media ar broken, I wanted to build my personal network of sources that I own and control entirely. Source by source. I did not want any “smartness” in it, like recommender systems, cause they don’t work for me. One reason why they don’t work is that you usually train one system per account. But what you usually want is a recommender system per context and those times when you wander to the awkward sides of youtube - as an example - I don’t want them at all to be used.
Despite the fact that rss generation is a niche, it is a rather crowded niche (see #appendix). There are a ton of solutions for this problem, which all chose the same approach: the user has to define the selectors for the particluar feed-entry fields. This is a tedious and complex activity, especially if the website changes frequently. I don’t like this approach.
rss-proxy tries to resolve feed mapping in an automatic manner, which leads to a more general feed, though finding and fixing a broken rss-proxy feed becomes trivial. A feed generator should solve the hardest problem in a good way: Getting updates of a website and point to the right link. Then the heavy lifting like full-feeds should be outsourced to other special services or the feed reader.
Birth of rss-proxy
When you want to go extreme with RSS, you will soon learn that there ist no feed for that fragment of a website you are interested in. I wanted a tool, that can turn a messy website into a standardized format that I can use to post-process. Hence, 8 month ago, I started developing rss-proxy, that should just analyze the HTML DOM structure and convert that potentially into a feed. I did not want to mess around with selectors (which is the general approach), cause I expect websites to change frequently, so I don’t want to invest time into that.
I had the following design goals in mind:
- Dead simple UX
- Works with a wide range of DOMs
- Cheap to define
- Error handling
I wanted rss-proxy to support a wide variety of DOMs, so I was observing the HN community and the RSS subreddit for people who want requested a feed of specific site. Here is my list:
v1 still has some weaknesses, see issues with v1. There are some sites, that just fail with rss-proxy. I want to support as many of those weird sites as possible, cause they might bring-up some flaws with the algorithm.
Here is an incomplete list of feed generators I found.