2024.11.08 – Code Project – Python – Improved FreshRSS Link Lists
Today, I want to talk about recent improvements I have made on my FreshRSS to WordPress Digest Python script. And to make a note on what I would like to do next.
This is the script I used to produce these Link List Posts on [Blogging Intensifies] and Lameazoid. The Github Repository for it is here.
- The first version was simple, it pulled from the sharded feed of FreshRSS, collected favorited articles, formatted them a bit, then posted a wordpress post of links.
- Overtime, I wanted these posts to be prettier, so I added a bit more massaging to the formatting, and some HTML code so the links would show up in pretty little formatted boxes. I also decided some sort of summary would be useful, so it pulls the first 100 words or so from a feed item as a teaser.
- Initially I was using the main share feed from FreshRSS, but I have two blogs, each with vague but disting “themes”. Sharing video game news to [BI] felt a bit silly. Not that it matters, no one reads those lists and its mostly for my reference. I found you could narrow things down by personal tags, so altered the Python script to handle any number of configured blogs and now both get seperate link lists.
So, whats new this round?
Not a lot on the externally visible end. A week or so ago, I found that the Raspberry Pi I had running the script on had died on me. Or, more likely, the SD card did. Whatever the case, the script was not running as scheduled. I have always had a bit of a love/hate with the scheduled run. Some days I barely share anything from the reader, so it makes for weird small posts. Also it ran at like, 10:30P, which was kind of late, and occasionally, I found myself rushing to get through everything to make sure I flagged anything relevant, because I was flipping through it at 5 minutes till.
Right now, I am running it manually. And at irregular intervals. This created a new problem, but its one I had already been planning to fix. The way the FreshRSS shared feed works, you can append a number, X, and get “everything in the last X hours.” When it ran on a cronjob schedule every 24 hours, this number was simply, hardcoded at 24.
When manually running things, I needed X to be “however many hours since it last ran.” So now, it writes out a simple file with a time stamp, after each run. It also pulls in that file, and calculated the time difference. If the time difference is less than an hour, it defaults to an hour, because “X=0”, just gives the default feed, which may be everything, or may be the last ten items. I am not sure on the limits. If there isn’t a timestamp file, most likely if its being first run, it sets the hours different to 0, and gets everything.
Something else I added this round, everytime I wanted to do modifications, I needed to comment out some lines and uncomment others, so the script would not spam my blog with the same post over and over. Also, I have a time stamp file now that I don’t want to overwrite when testing, since it will probably stop seeing feed items unless they were marked in the last hour.
So I added a flag variable at the top, and encapsulated the business end output in some conditional statements. Now, when I want to test, I just change the “runmode” variable at the tol to “False”, and it stops posting or editing the time stamp file.
This was also needed for my third new feature. In addition to posting to the blog, it spits everything out into a simple, dated markdown file. This way, I have a private record of everything shared, since a lot of the point is, “I want to keep these links for reference.” Initially I just spit out the post data, but that was ugly since it was full of HTML tags. So It now compiles together a second, markdown formatted, variable, that gets written to the file. Another key difference, in my private files, I dump the entire article, and not just the 100 word summary.
I don’t want to repost entire articles on my blog, its rude and ugly, hence the summary, but for a private archive of text data, dumping it all is preferable. Articles disappear ALL the time, this is literally 100% of why I save and archive shit like this in the first place.
Some changes I still want to make
- Right now it dumps everything into one output file, I may split this across blogs/topics
- Another advantage of the private archive, I can add any number of additional tags to pull, that don’t have to be posted anywhere. I can just, pull them to a text archive. I already have started a recipes tag, for example. I also added a to be used flag in the config for it a feed gets posted anywhere.
- I am kind of down on the idea of AI, but I still may look into hooking the summary function to some sort of AI service to create actual summaries. In my very very vague testing, it had a hard time keeping it short, even when instructed to do so.
- I kind of want to modify the script to also produce a queue of links, then maybe a second script on a schedule that posts any links out of a file on microblogging services like Mastodon.
- I would love to find a way to share links I find elsewhere to FreshRSS or this script.
- I kind of want to find a way to sort and group posts under categories (Music, Coding, Video Games, etc). I have ideas oh how, but they are not all very… Elegant.
- I kind of dislike having the timestamp file, I would like to figure out a way to query the WordPress Blog itself for “The last post marked link list,” and go off of that as the “last run date.”
The “Holy Grail” want, is the ability to add comments to shares. I put in a suggestion on the Github page for a “Notes” feature. I am seriously just considering making my own plug in. This would be super useful for WHY I shared a link. I could use this later in the social sharing queue system as well. The idea would be, as an example, I tag a post for a new CHVRCHES album, then add a little note, “I am super looking forward to this!”, then on the digest, the comment would show up.