That is a good idea.
59 requirements, including Django, seems pretty heavy though?
For my own RSS feed, I use this 48 line Python file with no dependencies outside the standard library:
https://github.com/no-gravity/atomfeed.py
It takes an array of entries as input, not a web page. But I guess the HTML parsing should take no more than another few lines? For HTML parsing, I have had good experiences with the lxml module, which is in the Debian repos. It is fast and works pretty well.
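To illustrate the "array of entries in, Atom out" idea, here is a minimal sketch using only the standard library. It is not the code from atomfeed.py; the function name `build_atom_feed` and the entry keys (`id`, `title`, `updated`, `url`) are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, updated, author_name, entries):
    """Return an Atom document (str) built from a list of entry dicts.

    Each entry dict is assumed (hypothetically) to carry the keys
    'id', 'title', 'updated' (RFC 3339 timestamp) and 'url'.
    """
    # Declaring xmlns on the root puts every child in the Atom namespace.
    feed = ET.Element("feed", {"xmlns": ATOM_NS})
    ET.SubElement(feed, "id").text = feed_id
    ET.SubElement(feed, "title").text = title
    ET.SubElement(feed, "updated").text = updated
    author = ET.SubElement(feed, "author")
    ET.SubElement(author, "name").text = author_name

    for item in entries:
        entry = ET.SubElement(feed, "entry")
        ET.SubElement(entry, "id").text = item["id"]
        ET.SubElement(entry, "title").text = item["title"]
        ET.SubElement(entry, "updated").text = item["updated"]
        ET.SubElement(entry, "link", {"href": item["url"]})

    # ElementTree escapes text and attribute values for us.
    return ET.tostring(feed, encoding="unicode")
```

Usage would look something like `build_atom_feed("urn:example:feed", "My blog", "2023-04-16T00:00:00Z", "Me", entries)`, with the entry list coming from whatever HTML parsing step you bolt on in front.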
Glad you’re finding the tool interesting! A short blog post behind it: https://kschaul.com/post/2023/04/16/feedmaker-quickly-genera...
And the GitHub url (hopefully easy to host your own instance): https://github.com/kevinschaul/feedmaker
Looks like you're hosting this on fly.io - PAYG model. You could probably host this for free on Cloudflare Workers; 100k requests/day on the free tier; static content (the homepage) is free & unlimited.
Edit: The catch is the 10ms CPU cap per request - you'd need a super lean implementation. Django's too heavy for that.
Well, someone already did with JS: https://github.com/ProfessorManhattan/rss-worker
Python alone is many milliseconds to start. Unless they give you some allowances for interpreter overhead.
The good news: made it to the front page.
The bad news: so did the 503 page.
In some ways a good thing, no? Shows you've got work to do on optimisation for large audiences. A free stress test (unless you're on a host that charges per hit or for excess bandwidth), if you will.
Did load eventually for me, thought it was broken as no styles but looks like it's intentional.
Seems to be hosted using fly.io
https://github.com/RSS-Bridge/rss-bridge is what I've been using for the same purpose.
Should be able to achieve this without selectors with HTML to Markdownish (something like Firefox's Reader mode).
I made a CGI program that ran CSS selectors against URLs and returned the output. I debated making it public and then realized I probably didn't want to run an open proxy. I'm curious how long this will last.
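For what it's worth, one common way to keep a fetch-any-URL service from becoming an open proxy is a host allowlist checked before fetching. A rough sketch, assuming a hypothetical helper `is_fetch_allowed` and an allowlist you maintain yourself (nothing here comes from the CGI program or from feedmaker):

```python
from urllib.parse import urlsplit


def is_fetch_allowed(url, allowed_hosts):
    """Return True only for http(s) URLs whose host is on the allowlist.

    `allowed_hosts` is a set of lowercase hostnames (an assumption for
    this sketch; a real service might load it from configuration).
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # .hostname strips any userinfo and port and is already lowercased.
    host = parts.hostname or ""
    return host in allowed_hosts
```

This obviously trades away the "works on any site" convenience, so it only fits if you know up front which sites you want feeds for.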
I love this.
Has anyone tested to see if it works with Blogtrottr which will email you whenever there's a new item in an RSS feed?
I ask because this doesn't seem to include a date field in the RSS, and of course there's no guid. So I'm wondering how compatible it winds up being.
Dates shouldn't matter. The feed has ID elements, which are what identify entries. Atom has no guid element. So I would expect this to work with any reader.
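As a sketch of why IDs are enough: a reader can remember which entry IDs it has already seen and treat anything else as new, without looking at dates. This is only an illustration of the idea, not how Blogtrottr or any particular reader is known to work; `new_entries` and `seen_ids` are names invented for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def new_entries(feed_xml, seen_ids):
    """Return titles of entries whose <id> is not in seen_ids yet.

    seen_ids is a set that is updated in place, so calling this again
    with the same feed yields nothing new.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        # Skip entries without an id, or ones we have already reported.
        if entry_id is None or entry_id in seen_ids:
            continue
        seen_ids.add(entry_id)
        fresh.append(entry.findtext(ATOM + "title", default=""))
    return fresh
```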
I wish they had concrete, accurate id and created_at. IIRC these attributes are fixed in AT.
Not the same, but this gives me an idea… what if there was a map-reduce for DOMs as a web primitive? Like, imagine if I could make a DOM (or feed) that was some selection and transformation of another DOM.
You have just re-invented XLST.
*XSLT
https://www.w3schools.com/xml/tryxslt.asp?xmlfile=cdcatalog&... give it a whirl!