A Place for Wordles

If you are following this blog with RSS, you may have noticed some Wordles and other similar content showing up. I’ve excluded that again for now. I’ve been working on a sort of mini project I’ve wanted to set up for a while, and I even considered trying to write my own plug-in to do it, but it turns out there already is one.

Basically, micro-blog style posts within this larger blog. In the past, this is the sort of thing I would have spun up an entire second WordPress instance for, quickly forgotten about after ten posts, and then maybe folded into the main blog. Ever since Twitter died, I have not really had a place for these sorts of posts, and frankly, I never really cared to post them to places like Twitter anyway. Things like Wordle scores, the little image banners Duolingo spits out, a random YouTube video with maybe some notes, song lyrics, etc. I also have some random pictures of food that I’ve stuck in the Microblog section.

Basically, stuff I kind of want to keep a record of, but that doesn’t need a whole post or fanfare and that frankly, no one else probably cares about. It’s all there on that page linked above. I’m still messing with the formatting a bit.

What I’m using is a “secret backend page” with a plug-in called User Submitted Posts. It’s intended to let users easily submit posts to the blog, but since no one else can register to post, submissions are limited to admins, and the page is “secret”, I am the only one submitting. I also streamlined all the settings. Everything goes into the Microposts category, and it only has boxes for a title, tags, post content, and uploads, all optional. It’s about as close to making a Microblog clone with WordPress as I can get. I’ve also verified that users who are not logged in can’t see or get to the posting page.

It’s also simple enough to set up a shortcut to the posting page on the home screen of my phone. In the past, I had tried using the WordPress app, but it’s just not suitable for quick posting of this style. You have to select tags and categories and go through menus, and it’s just clunky for this purpose.

Code Project: Fresh RSS to WordPress Digest V 2

A while back, I talked about a simple little project I built that produces a daily RSS digest post on this blog. This, of course, broke when my RSS reader died on me. I managed to get FreshRSS up and running again in Docker, and I’ve been slowly recovering my feeds, which is incredibly slow and tedious because there are a shitload of feeds. I essentially have to cut and paste each URL into FreshRSS and select the category, and half the time they don’t work, so I need to make a note for later checking, and it’s just… slow.

But since it’s mostly working, I decided to set my RSS poster up again. I may look into setting up a Docker instance just for running Python automations, but for now, I put it on a different Pi I have floating around that plays music. The music part will be part of a different post, but for this purpose, it runs a script once a day that pulls a feed, formats it, and posts it. It isn’t high overhead.

While poking around on setting this up, I decided to get a bit more ambitious and found out that basically every view in FreshRSS has its own RSS feed. Previously, I was taking the feed from the Starred Articles, but it turns out that tags each have their own feed too. This allowed me to do something I wanted from the start, which is create TWO feeds, one for each of my blogs. So now, articles related to technology, politics, food, and music get fed into Blogging Intensifies, and articles related to toys, movies, and video games go into Lameazoid.

I’ve also filtered both of these out of the main page. I do share these little link digests for others, if they want to read them, but primarily, it’s a little record for myself, to know what I found interesting and was reading that day. This way if say, my Fresh RSS reader crashes, I still have all the old interesting links available.

The other thing I wanted to do was to use some sort of AI system to produce a summary of each article. Right now it just clips off the first 200 characters or so. At the end of the day, this is probably plenty. I’m not really trying to steal content, I just want to share links, but links are also useful with just a wee bit of context to them.
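Short of an AI summary, the clipping itself can be made a little tidier. This is only a minimal sketch, assuming the summary text has already had its HTML stripped; the name clip_summary, the 200 character default, and the trailing "..." are illustrative choices and not part of the script further down.

```python
def clip_summary(text, limit=200):
    """Return text cut to at most `limit` characters on a word boundary.

    An ellipsis is appended only when something was actually cut off.
    """
    text = " ".join(text.split())  # collapse newlines and runs of spaces
    if len(text) <= limit:
        return text
    # Take one extra character so a cut that lands exactly on a space is kept whole.
    clipped = text[:limit + 1]
    if " " in clipped:
        clipped = clipped.rsplit(" ", 1)[0]  # back up to the last full word
    else:
        clipped = clipped[:limit]  # one giant word, so a hard cut is all we can do
    return clipped + "..."


print(clip_summary("alpha beta gamma", limit=12))
```

Cutting on a word boundary avoids summaries that end in half a word, which reads a bit less like a broken excerpt.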

As I mentioned before, making this work involved a bit of tweaking to the scripts I was using. First off is an auth.py file with a structure like the one below: one dictionary for each blog, and then each dictionary gets put in a list. Adding additional blogs would be as simple as adding a new dictionary and then adding the entry to the list. I could have done this with a custom class, but this was simpler.

BLOG1 = {
    "blogtitle": "BLOG1NAME",
    "url": "FEEDURL1",
    "wp_user": "YOURUSERNAME",
    "wp_pass": "YOURPASSWORD",
    "wp_url": "BLOG1URL",
}

BLOG2 = {
    "blogtitle": "BLOG2NAME",
    "url": "FEEDURL2",
    "wp_user": "YOURUSERNAME",
    "wp_pass": "YOURPASSWORD",
    "wp_url": "BLOG2URL",
}

blogs = [BLOG1, BLOG2]
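For comparison, here is roughly what the custom class route might look like. This is a sketch only: the BlogConfig name and the xmlrpc_endpoint helper are made up for illustration, and the field names simply mirror the dictionary keys above.

```python
from dataclasses import dataclass


@dataclass
class BlogConfig:
    blogtitle: str
    url: str       # feed URL to pull entries from
    wp_user: str
    wp_pass: str
    wp_url: str    # blog host name, without the https:// part

    def xmlrpc_endpoint(self):
        """Build the XML-RPC URL the same way the posting script does."""
        return f"https://{self.wp_url}/xmlrpc.php"


blogs = [
    BlogConfig("BLOG1NAME", "FEEDURL1", "YOURUSERNAME", "YOURPASSWORD", "BLOG1URL"),
    BlogConfig("BLOG2NAME", "FEEDURL2", "YOURUSERNAME", "YOURPASSWORD", "BLOG2URL"),
]

for blog in blogs:
    print(blog.blogtitle, blog.xmlrpc_endpoint())
```

The trade-off is that the posting script would then use attribute access (blog.wp_user) instead of dictionary lookups (blog["wp_user"]), which is why the plain dictionaries are the lower-effort choice here.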

The script itself got a bit of modification as well, mostly the addition of a loop to go through each blog in the list, plus some variables changed to be dictionary lookups instead of straight variables.

Also, please excuse the inconsistency in the f-string use. I got errors at first, so I started editing and removing the f-strings, and then realized I just needed to be using Python 3 instead of Python 2.

from auth import *
import feedparser
from wordpress_xmlrpc import Client, WordPressPost
from wordpress_xmlrpc.methods.posts import NewPost
from wordpress_xmlrpc.methods import posts
import datetime
from io import StringIO
from html.parser import HTMLParser

cur_date = datetime.datetime.now().strftime(('%A %Y-%m-%d'))

### HTML Stripper from https://stackoverflow.com/questions/753052/strip-html-from-strings-in-python
class MLStripper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.reset()
        self.strict = False
        self.convert_charrefs= True
        self.text = StringIO()
    def handle_data(self, d):
        self.text.write(d)
    def get_data(self):
        return self.text.getvalue()

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    return s.get_data()

# Get News Feed
def get_feed(feed_url):
    NewsFeed = feedparser.parse(feed_url)
    return NewsFeed

# Create the post text
def make_post(NewsFeed, cur_blog):
    # WordPress API Point
    build_url = f'https://{cur_blog["wp_url"]}/xmlrpc.php'
    #print(build_url)
    wp = Client(build_url, cur_blog["wp_user"], cur_blog["wp_pass"])

    # Create the basic post info: title, tags, etc. This can be edited to customize the formatting.
    post = WordPressPost()
    post.title = f"{cur_date} - Link List"
    post.terms_names = {'category': ['Link List'], 'post_tag': ['links', 'FreshRSS']}
    post.content = f"<p>{cur_blog['blogtitle']} Link List for {cur_date}</p>"
    # Insert each feed item into the post with its posted date, headline, and link to the item, plus a brief summary.
    for each in NewsFeed.entries:
        if len(strip_tags(each.summary)) > 100:
            post_summary = strip_tags(each.summary)[0:100]
        else:
            post_summary = strip_tags(each.summary)
        # The listing was cut off after "</a></" here; the closing </p> tag is an assumption.
        post.content += (f'{each.published[5:-15].replace(" ", "-")} - <a href="{each.links[0].href}">{each.title}</a></p>'
                         f'<p>Brief Summary: "{post_summary}"</p>')
        # print(each.summary_detail.value)
        #print(each)

    # Create the actual post.
    post.post_status = 'publish'
    #print(post.content)
    # For troubleshooting and reworking, uncomment the print above and comment out the line below; this will print the results instead of posting them.
    post.id = wp.call(NewPost(post))

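    try:
        if post.id:
            post.post_status = 'publish'
            # A bare call() is undefined; the edit has to go through the client.
            wp.call(posts.EditPost(post.id, post))
    except Exception as err:
        # Report the failure instead of silently swallowing it.
        print(f"Error creating post: {err}")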

#Get the news feed
for each in blogs:
    newsfeed = get_feed(each["url"])
# If there are posts, make them.
    if len(newsfeed.entries) > 0:
        make_post(newsfeed, each)
        #print(NewsFeed.entries)
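For tinkering with the formatting without posting anything, the post body can be built by a function that never touches WordPress. This is a rough sketch and not the script above: build_digest_html is a made-up name, the sample entry is a stand-in for what feedparser returns, and the summary is assumed to already be plain text.

```python
from types import SimpleNamespace


def build_digest_html(entries, blog_title, date_label, max_len=100):
    """Build the digest body as an HTML string without contacting WordPress."""
    parts = [f"<p>{blog_title} Link List for {date_label}</p>"]
    for entry in entries:
        summary = entry.summary[:max_len]  # slicing is safe on short strings
        parts.append(
            f'<p><a href="{entry.link}">{entry.title}</a></p>'
            f'<p>Brief Summary: "{summary}"</p>'
        )
    return "\n".join(parts)


# Stand-in entry; real ones would come from feedparser.parse(...).entries
sample = [SimpleNamespace(title="Example", link="https://example.com/a", summary="x" * 150)]
html = build_digest_html(sample, "BLOG1NAME", "Monday 2024-01-01")
print(html)
```

Keeping the formatting separate from the posting call means the output can be eyeballed in a terminal before anything gets published.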

Purging WordPress .ico Malware… (Hopefully)

So, this is a “hopefully” because it’s been a bit since I did this, and things seem to be clean. There is a reasonably common bit of malware out there that seems to affect WordPress sites. I say reasonably common because, in my time looking for a solution, I have come across a fair number of others with the issue, but no solutions. And I have tried several solutions. As of now, I have been clean for a few months, and without hacky workarounds. I’m going to attempt to run through what I did that held it at bay, and what seems to have finally managed to purge it, in hopes of helping others.

The malware itself would occasionally redirect the blog domain to a spam website. I say occasionally because it’s not all the time, and with enough anti-advertising stuff in your browser, you may never see it happen. I have personally never once seen it happen on any of the sites I run on this web space. I first found out it was infected because my wife would occasionally mention that someone she had linked her blog to was getting sent to a spam website. Initially I thought maybe someone was mistyping the domain along the line. In my work combating this malware, it seems like the actual redirect kicks in slowly over time, as the infection spreads.

It also will spread across sites hosted on the same server. Which made it extra tricky to fight because I had to juggle several sites at once.

Part 1 – Keeping It at Bay and How it Spreads

I have no idea how the infection initially started, which is rough, because that would be key to KNOWING it’s gone. As near as I can tell, the initial source of the infection is in the uploads directory of a blog. It eventually starts to add “gibberish code” to files like wp-config.php and settings.php. I say “gibberish code” because it’s actual PHP, but it’s very messy in its design and encoding, to make it hard to read and hard to spot in files. The gibberish code would generally show up at the top of the files, but could be elsewhere.
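One rough way to hunt for this kind of injected code is to flag PHP files that contain the functions obfuscated PHP tends to lean on, or that contain a single enormous line. This is a sketch under assumptions, not a signature for this specific malware: the marker list and the 1000 character threshold are guesses, and legitimate plug-ins will trip it too, so a hit only means the file deserves a look.

```python
import re
from pathlib import Path

# Assumed markers: functions that obfuscated PHP commonly uses.
SUSPICIOUS = re.compile(rb"(eval|base64_decode|gzinflate|str_rot13)\s*\(")
MAX_LINE = 1000  # assumed threshold for "one suspiciously long line"


def suspicious_php_files(root):
    """Yield (path, reason) for PHP files under root that deserve a manual look."""
    for path in sorted(Path(root).rglob("*.php")):
        data = path.read_bytes()
        if SUSPICIOUS.search(data):
            yield path, "obfuscation-style function call"
        elif any(len(line) > MAX_LINE for line in data.splitlines()):
            yield path, "very long line"
```

Running it over the web root and reading through the flagged files by hand is slow, but it is a lot faster than opening every PHP file one at a time.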

Eventually, random folders would start showing up in the root WordPress directory, sometimes with gibberish names, sometimes with specific spammy sounding names, sometimes with names that appear to be part of the blog (like ‘site’ or ‘blog’).

The first step in holding this at bay was to dump all write permissions for several critical WordPress files that kept being infected. This only sort of helped; the problem was that the owner, www-data, could still write to the files.

The next step was to change the owner of all of the web files to an alternative user account, then set the files so www-data could only read them. This created a new problem: since www-data had no permission to write anywhere, I could not easily update anything or upload images for blog posts. If I was making a new post, I would have to SSH into the server, temporarily change the permissions, then change things back.
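The permissions half of that lockdown can be sketched in Python. This is an illustration under assumptions: it only clears the group and other write bits, it assumes the files have already been handed to the alternative owner (changing ownership needs root), and make_read_only_for_others is a made-up name.

```python
import os
import stat
from pathlib import Path


def make_read_only_for_others(root):
    """Clear group and other write bits on everything under root.

    Ownership is not touched here; that step needs root and is assumed
    to have been done already. Returns the number of entries changed.
    """
    changed = 0
    for path in [Path(root), *Path(root).rglob("*")]:
        if path.is_symlink():
            continue  # chmod would follow the link and change its target
        mode = stat.S_IMODE(path.stat().st_mode)
        new_mode = mode & ~(stat.S_IWGRP | stat.S_IWOTH)
        if new_mode != mode:
            os.chmod(path, new_mode)
            changed += 1
    return changed
```

With a different owner on the files and the write bits gone for everyone else, the web server user can still read and serve the site but cannot modify it.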

Pain in the ass.

My temporary fix there was to keep the current year’s uploads folder writable, and run scripts that would probe for malware files and delete them. There were two scripts, one for the hidden .ico files that would crop up and one for any .php files in the uploads folders. Both ran from a cron job.

#!/bin/sh
/usr/bin/find /var/www/html -name ".*.ico" -exec rm {} +
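Here is a Python take on the same cleanup, covering both the hidden .ico files and the .php files in uploads folders. It is a sketch rather than the scripts I actually ran: the function names are made up, "uploads" is assumed to be the folder name anywhere in the path, and it defaults to a dry run so nothing is deleted by accident.

```python
from pathlib import Path


def find_malware_candidates(web_root):
    """Return hidden .ico files anywhere under web_root, plus any .php
    file that sits inside a directory named 'uploads'."""
    root = Path(web_root)
    hidden_icons = [p for p in root.rglob(".*.ico") if p.is_file()]
    upload_php = [
        p for p in root.rglob("*.php")
        if p.is_file() and "uploads" in p.relative_to(root).parts
    ]
    return sorted(hidden_icons + upload_php)


def purge(web_root, dry_run=True):
    """Delete the candidates, or only list them when dry_run is True."""
    targets = find_malware_candidates(web_root)
    for path in targets:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
    return targets
```

Calling purge("/var/www/html") lists what would go, and purge("/var/www/html", dry_run=False) actually deletes it. Normal files like favicon.ico and PHP files outside the uploads folders are left alone.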

This is, admittedly, an EXTREMELY hacky way to correct this problem. Hack less in the sense of “computer hacker”, more in the sense of “janky or shoddy”. But it worked while I figured out the root issues.

Part 2 – Fixing the Issue

Eventually I sat down and just sort of rebooted everything, all at once. I started with everything owned by the alternative user and locked down. I then scrubbed out all infected PHP and .ico files from the upload folders, and thoroughly scrubbed the wp-config files. Basically, any files I would need to move to a fresh WordPress install, which was the uploaded images and the configurations, were completely sanitized.

Next, I downloaded a fresh copy of WordPress, expanded it, and made a copy for each site folder named websitename_new. After that, I copied the uploads folders and wp-config files to the new copies. Then I renamed each current folder to websitename_old, and renamed the new ones to simply websitename. (I actually did this and the subsequent steps one at a time for each site.) This made the new, freshly installed copies live.
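The copy and rename shuffle can be sketched like this. It is an illustration, not a record of the exact commands: the function names are invented, the folder layout (wp-config.php in the site root, uploads under wp-content) is the standard WordPress one, and it needs Python 3.8 or newer for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def stage_fresh_site(sites_dir, site, fresh_wordpress):
    """Build <site>_new from a clean WordPress tree plus the sanitized
    wp-config.php and uploads folder taken from the live <site> folder."""
    sites_dir = Path(sites_dir)
    live, new = sites_dir / site, sites_dir / f"{site}_new"
    shutil.copytree(fresh_wordpress, new)
    shutil.copy2(live / "wp-config.php", new / "wp-config.php")
    shutil.copytree(live / "wp-content" / "uploads",
                    new / "wp-content" / "uploads", dirs_exist_ok=True)
    return new


def go_live(sites_dir, site):
    """Rename <site> to <site>_old, then <site>_new to <site>."""
    sites_dir = Path(sites_dir)
    (sites_dir / site).rename(sites_dir / f"{site}_old")
    (sites_dir / f"{site}_new").rename(sites_dir / site)
```

Because only the config file and the uploads folder are carried over, nothing else from the old tree, infected or not, ends up in the live copy.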

Except they had no plug-ins and no themes. I did not transfer any old theme or plug-in files, for fear of infection. Instead I went into the old folders to get a list, then redownloaded each theme and plug-in to the fresh copies. This meant doing some reconfiguring, but it was worth it for clean copies. I also left out anything that wasn’t absolutely essential to the basic look and operation.
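Getting that list out of an old folder is just a matter of reading folder names. A minimal sketch, assuming the standard wp-content/plugins and wp-content/themes layout; installed_extensions is a made-up name, and single-file plug-ins that sit directly in the plugins folder are not picked up.

```python
from pathlib import Path


def installed_extensions(site_root):
    """Return the plug-in and theme folder names found under wp-content."""
    content = Path(site_root) / "wp-content"
    result = {}
    for kind in ("plugins", "themes"):
        folder = content / kind
        if folder.is_dir():
            names = [p.name for p in folder.iterdir() if p.is_dir()]
        else:
            names = []  # a missing folder just means nothing installed
        result[kind] = sorted(names)
    return result
```

The output is only a shopping list of names to download fresh; nothing is copied out of the old folders.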

Side note: when a fresh install copy is made live, it may not load until you go into /wp-admin and change the theme to literally anything else (generally, the current year’s WordPress default works). The config files will still be looking for the non-existent old theme.

With everything fresh and ready to go, I deleted the old, potentially infected copies to ensure the infection was completely purged. After that, I created a backup folder and copied all of the current fresh versions of the site folders into it. This way, in the event of a reinfection, I could simply slap a fresh backup in place. It might be missing a few recent images, but it would be way less work.

Still worried, I then reverted the permissions for the sites back to www-data, but I did them one at a time, roughly a week apart, carefully checking for reinfection with each change.
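One way to make the “checking for reinfection” part less of an eyeball job is to hash every file after the cleanup and compare again later. This is a sketch of that idea and not something described above as what I ran; snapshot and diff_snapshots are made-up names.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def diff_snapshots(before, after):
    """Return the files that were added, removed, or modified between snapshots."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    modified = sorted(k for k in before.keys() & after.keys() if before[k] != after[k])
    return {"added": added, "removed": removed, "modified": modified}
```

New files in the uploads folders are expected after writing a post, so the interesting results are modified core files like wp-config.php, or new files and folders appearing in the site root.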

So far so good, I have not had any problems. Here is hoping it stays.