Code Project: Automated List From Reddit Comments

This is one of those quick and kind of dirty projects I’ve been meaning to do for a while. Basically, I wanted a script that would scrape all of the top level comments from a Reddit post and push them out to a list. Most commonly, to use on /r/AskReddit style threads like, well, for this example, “What is a song from the 90s that young people should listen to.”

Basically, threads that ask for useful opinions on list. Sometimes it’s lists of websites or something. Often it’s music. The script here is made for music but could be adjusted for any thread. Here is the script, I’ll touch on it a bit in more detail after.

## Create an APP for Secrets here:
## https://www.reddit.com/prefs/apps

import praw

## Thread to scrape goes here, replace the one below
url = "https://www.reddit.com/r/Music/comments/10c4ki0/name_one_90s_song_kids_born_after_2000_should_add/"

## Fill in API Information here
reddit = praw.Reddit(
    client_id="",
    client_secret= "",
    user_agent= "script by u/", # Your Username, not really required though
    redirect_uri= "http://localhost:8080",
)


submission = reddit.submission(url=url)
submission.comments.replace_more(limit=0)
submission.comment_limit = 1

for x in submission.comments:
    with open("output.txt", mode="a", encoding="UTF-8") as file:
        if "-" in x.body:
            file.write(str(x.body)+"\n")
            # print(x.body)

The script uses praw, Python Reddit API Wrapper. A Library made for use in Python and the Reddit API. It requires free keys which can be gotten here: https://www.reddit.com/prefs/apps. Just create an app, the Client ID is a jumble of letters under the name, the secret is labeled. User Agent can be whatever really, but it’s meant to be informative.

The thread URL also needs filled in.

The script then pulls the thread data and pulls the top level comments.

I’m interested in text file lists mostly, though for the sake of music based lists, if I used Spotify, I might combine it with the Spotify Playlist maker from my 100 Days of Python course. Like I said before though, this script is made for pulling music suggestions, with this but of code:

        if "-" in x.body:
            file.write(str(x.body)+"\n")
            # print(x.body)

It’s simple, but if the comment contains a dash, as in “Taylor Swift – Shake it Off” or “ACDC – Back in Black”, it writes it to the file. Otherwise it discards it. There is a chance it means discarding some submissions, but this isn’t precision work so I’m OK with that to filter out the chaff. If I were looking for URLs or something, I might look for “http” in the comment. I could also eliminate the “if” statement and just have it write all the comments to a file.

100 Days of Python, Projects 66-70 #100DaysofCode

Whew, I didn’t really think I’d get to 9 parts in this series, and I am only around 2/3rds of the way through even.  I actually may change up the format later with the last 20 projects that are listed as “Professional”.  Maybe one post each.

The bulk of this round is wrapping up the Flask projects and building a simple blog that runs on Python.  It’s been fun.  I’ve been a bit busier than normal slow my pace has slowed, but that’s ok too.  Day 66 in particular felt like it took longer than it really should have, given how little it felt like it was doing.

Day 66 – Build RESTful APIs

Kind of a different sort of project.  It’s building something, but not really anything, with any sort of interface.  All of the interaction is done through Postman (or the URL if one wants), and the responses are all JSON of some sort.  

We built 6 API interfaces to work with a database of Coffee shops.

  • One returned all of the shops.
  • One returned all of them in a particular city.
  • One returned a specific shop only.
  • One updated the price of coffee.
  • One added a new Coffee Shop.
  • One deleted a Coffee Shop.

Including built in error handling for if a Coffee Shop didn’t exist etc.  It all feels like it could be useful later in the Blog Project.  Not too useful on it’s own accord.

Day 67 – Blog Capstone Project Part 3

Back to the Blog Project, and after this round it’s a LOT more Blog Like, though incredibly insecure.  The Security Part looks to be the subject of the next couple of lessons though, so it’s all good.  We basically started with the last Blog Project from Day 59, and added the ability to add, edit, and delete posts.  Also with a Database back end.

One tricky moment here.  The idea was to use the same form page for New Post and Edit Post.  Originally I had nested these together, and had some if/else cases to check if it was Editing or Adding, and kind of got stuck on how to handle the date, since the date doesn’t get changed on updates.

But then I had a bit of an epiphany moment, the kind you get often with coding, where you band against a problem, convinced it’s the way to solve things, only to realize there is an obvious, easy method.

I split them up.  No more if/else, and no more worrying about what the DB is doing at the end.  Because two separate @app.route and related functions, can route to the same HTML template (the Edit Form).  The only thing needed was to pass a header variable along to change the title.  

It was a total “Duuuuuuh” moment.

Anyway, no security though, because there are no users or security keys and anyone can just come in and post anything and delete anything and it would be total chaos as an actual production Blog platform.

Day 68 – Authentication with Flask

Oops, I spoiled what this topic was going to be in the last project write up.  This whole section is working with Authentication using a Database for persistence.  It’s a simple website, a home page, a user creation page, a log in page, and a secrets page.  Users can create an account, and once registered, they can access the Secrets page and download a PDF.  

Part of this exercise is also restricting access for not logged in users and another part is proper security and handling passwords with hashing and salting.

Of all of the projects so far, I feel like this may be one of the most important ones, despite it’s basic simplicity.  I’ve been planning to work out the best way to share some of the projects built so far, and using Flask was a good “first step”, but being security conscious, I would rather not throw a bunch of web based Flask pages up with no restrictions on access.  Like last session, there was the Library App or the Top Ten Movies page.  Sure, I could put them up on the web-server, map some sub domain to them, but anyone can edit them, without proper, persistent security and restricted access.  

Now, I can make that happen.  Plus, with the modulator of routing and such, I can easily slip them into parts of the overall Blog Project, in time.

Day 69 – Blog Capstone Project Part 4

And here it is, the culmination of the Python based Flask Blog.  It’s neat.  I like it, well, I like the basic functionality of it.  The layout is a bit plain but that’s fixed.  I doubt it ever replaces WordPress, I love WordPress, but I could definitely see uses for this finished product.  I may actually modify it to work as a sort of “Twitter Replacement” since Twitter is currently burning down.  Using what I have learned, I could easily set this up to take a post, then “Syndicate” it out via API calls to Twitter, Facebook, Mastodon, or anywhere.  While keeping my own archive of posts.

My current, next up To Do Items:

  • Create RSS Feed Page  
  • Create Admin Page  
  • Enable/Disable Comments for Posts  
  • Allow Admin to delete Comments  
  • Allow Users to delete their own comments  
  • Add Mailer.class and add email notifications  
  • Add Pagination to Home Page  
  • Add Tags and Category Options to Posts

Also, on a side note, this actually isn’t the first time I’ve built a “Blog Platform”. I built a basic one a few years ago in HTML and PHP. Heck, the system I used int he early 2000s with SHTML Pages was sort of a “Blog Platform”.

I’d recommend, for anyone going through this course, check the notes on the Blog courses.  There are a lot of good suggestions for improvements, especially in stopping things like Java Script injection in the comments.  When you go to Angela’s (The Instructor) Blog link, it immediately throws out a JavaScript pop up that someone dropped in the comments.

Day 70 – Hosting with Heroku

So, there isn’t really a Day 70 project.  I did go through the section, and the most useful part for me was the more robust SQL solution mentioned in the last section.  The first few parts were about GIT and Github, which I am already using.  The middle part was about using Heroku, which I have heard of and used a bit before but for the long term, I don’t need to use a freemium hosting service in a jankey way.  I have a VPS, and later I will figure out how to get Apache to play nice with Flask.

For now, I’ve split the Blog Project into it’s own repository and worked up my initial planned ToDo List.  But I need to keep my focus on the course for now.

FreshRSS and RSS Feed Posts

Keen observers (ha ha ha no one reads this), might have noticed that a few posts of links showed up in the feed.  These are basically, stories I read in my RSS reader that I found interesting, and wanted to share, or at least, keep track of.  The posts as of now are a little ugly, and I’ll probably clean up the formatting over time, but I wanted to go ahead and write a bit about the process.  I’ll have the Code on Github at some point.

As for the factors, firstly, this is something I’ve wanted to have on my blog for a while.  Like a long while.  I might even try to see if there are ways to better slit up the links by topic later.  A fair number of blogs I subscribe to have these sort of link digest posts, and I’ve always just liked the idea.  It’s also good for personal reference to when I may have read something.  It is limited as it only comes from y RSS Reader.

Speaking of my RSS Reader.  I’ve moved on from TinyTinyRSS, for a few reasons.  One, the interface is a little meh, honestly.  Maybe the newer version is better but it’s only available in Docker, and Docker is such a PItA to use.  Also, while looking for alternatives, it sounds like the folks who make TTRSS are kind of a bunch of gatekeeping jerk types, and I’d rather not support that.  I also find the need to keep the update daemon running with Screen to be a pain.  So I’ve moved over to FreshRSS, which I just run locally on a Raspberry Pi.  I may move it to a publicly accessibly machine at some point, but I am not entirely convinced that TT-RSS wasn’t the entry point for my previous server malware woes.

So, like TT-RSS, Fresh RSS has a way to get an RSS feed out of your Favorited posts.  In the past I’ve used tools like IFTTT to automate posting these links around, but I don’t use IFTTT anymore for reasons I’m not going into.  Fortunately, I’ve been working to become a pretty good Python coder for the last month or so.  So instead I wrote a script.  

It’s not even a particularly complicated script.  There are only two things it really needs to do, get new articles, and then post them to WordPress. Since the script runs locally, on the same Raspberry Pi even, it easily can reach and pull the RSS feed.  One nice thing I noticed with Fresh RSS, the feed included a time interval, so just getting new posts was super simple, because the interval is just “24” for “24 hours”.  The script eventually will run on a cronjob at the exact same time daily.  Anyway, after pulling the RSS, the entries are already in an easily usable Dictionary.  which gets fed into the construction of the WordPress Post.

def get_feed(feed_url):
    NewsFeed = feedparser.parse(feed_url)
    return NewsFeed

The posting part was pretty easy as well, WordPress has an API, and Python also has a library that can use that API.  It just needs some log in information and a post payload to send.  

def make_post(NewsFeed):
    wp = Client(f'https://{wp_url}/xmlrpc.php', wp_user, wp_pass)
    post = WordPressPost()
    post.title = f"{cur_date} - Link List"
    post.terms_names = {'category': ['Link List'], 'post_tag': ['links', 'FreshRSS']}
    post.content = f"<p>Blogging Intensifies Link List for {cur_date}</p>"
    for each in NewsFeed.entries:
        post.content += f'{each.published[5:-15].replace(" ", "-")} - <a href="{each.links[0].href}">{each.title}</a></p>'

The trickiest part was formatting the date a bit prettier.  I mentioned cleaning up the formatting a bit, I’m thinking maybe a simple invisible table, so the date and the links don’t wrap oddly like they do now.   i also added a check that if there are no new favorited posts, it will skip making a post.  Otherwise I’ll end up with empty posts on days I forget to check my feed reader

While writing the script, at first I was just outputting a text copy of the post to the console until satisfied.  Eventually, I pushed out a real post, then verified that things worked.  The next day, was just a straight test by opening the project, then running it again.  The third day, I copied the files and installed the lobraries needed, then posted from the Pi.  Phase 4 of this will be to set up Cron to run it automatically.  If that works then it will certainly, “just run” for the foreseeable future.

100 Days of Python, Projects 15-22 #100DaysofCode

So, the first set of projects for were all fairly simple.  Basic Text based programs that run in a terminal and run through simple loops.  The Intermediate Section starting on day 15 of the course is were things started to get quite a bit more interesting, though the basic code isn’t really all that complex yet.

There are two main topics covered during the Intermediate portion of the course, creating GUI interfaces with Turtle Graphics, and some introductory data analysis.  These two topics don’t particularly overlap, but both seem to be the primary focus here.  I’m rather enjoying the use of graphics over just terminal applications myself, though int he long run, handling CSV and other data feeds will probably be a lot more useful.

In the interest of brevity, I’m going to split the intermediate section up into a couple of posts.  I expect this to become more common as the projects become more complex and frankly, by the end, each project may even get it’s own post.

Anyway, on with the projects…  As before, all of the code is stored in this GitHub Repository.

Project 15 – The Coffee Machine Project

The actual focus for Day 15 was to set up PyCharm, and get away from using Replit for the code projects.  I’d already been using VS Code half the time anyway, but I switched over to PyCharm on this day.  A few reasons, one, it’s what the course is using, so it’s easier to follow if needed, two, it was easier to import libraries into projects than with VS Code.

The Coffee Machine project is a simple project that takes an order for a coffee, takes some coins, then spits out change and a coffee.  It doesn’t ACTUALLY do any of this physically, but if it did, that would be super impressive, manifesting physical objects with a laptop.

Project 16 – The Coffee Machine Project

Nope, you aren’t reading that wrong, the next day was the same project.  The difference was, Day 16 was also an introduction to Object Oriented Programming.  So for this project the students get some files with pre-made functions in them, and we build the same Coffee machine using these functions in an Object Oriented Programming way.

This was the first time I’ve actually learned something new in this course.  All of my code before has essentially just been “one file”.  The concept of breaking things into files and classes that do specific tasks is pretty neat, and I’ve gotten pretty good at it.  That said, I can also see where it’s still a good idea fo make a judgement of if something should be it’s own class or just be part of the base program.

There was also a brief introduction to using Pretty Tables to display data and Turtle Graphics, though the final program didn’t use any graphics.

Project 17 – Quiz Game Project

Another Object Oriented project, though instead, we make everything this time.  The questions were provided, but they are set up in a way that it’s simple to replace them with new questions from Open Trivia.  This also really helps push how useful it can be to break a program apart like this, across files.  The questions aren’t a class, they are just a dictionary you import, but they can easily be quickly replaced and the same code will run on any set of questions.

Project 18 – Hirst Painting Project

This one was a more in depth and proper look at Turtle Graphics.  Turtle is Python’s built in method for creating graphics and windows.  I’m sure there are probably others out there, but this works pretty well for a simple interface.  

The first practice was making the Turtle do some things, one of which was a pretty neat Spirograph drawer, which I may revisit in the future to make it useful.  Maybe have a pop up for how large or how many spirals to make.

The Hirst Painting project was inspired by the artist Damien Hirst, who apparently once sold some paintings consisting only of regularly placed dots.  The program takes an input image, extracts a color pallet from it, then creates a similar dot based image.  The instructor used a Hirst painting as the input, I opted to use the cover of the CHVRCHES album Every Open Eye.  It’s a pretty neat result.  It’s another that could be made more robust by prompting for an input image, how many colors to extract and how many dots to draw.

Project 19 – Turtle Racing and Etch-A-Sketch

This day was essentially two projects.  One was a simple Etch-A-Sketch style drawing program, that served as an introduction to actually controlling the Turtle with keyboard inputs.  

The second I particularly enjoyed, though it wasn’t a very complex game.  The second was Turtle Racing.  The purpose was to demonstrate how you can reuse classes to manage several variables of the same class type.  It spawns 6 turtles, you guess which will win, then they race across the screen to the finish line.

Now why is this exciting?  Waaaaaay back in the year 2000, I had a semester to kill between Community College and University, so I took a Computer Science 101 course where I learned some C++ programming.  One of the projects for that class, was a Horse Racing game, that is essentially the same concept.   The Horse Racing Game can be found here, and there is even an exe that lets you play it.  https://github.com/RamenJunkie/C_and_CPP_Code_Snippets/tree/main/CPP%20Code/Horse%20Race

Project 20 and 21 – Snake Game

The first multi day project of the course, well, ok, the Coffee Machine was SORT OF Multi Day, but not really.  This project recreates the classic game Snake.  I’m sure it existed before, but it was popularized by being included on Nokia Phones back in the 90s and early 2000s.

You control the snake, eating food to get longer, and avoiding running into the wall or yourself.  I am pretty happy with the result other than the controls feel like they could be a bit more responsive.

One thing I am proud of with the Snake Game though, I noticed in the instructor’s examples, she was constantly missing the food just barely. In her code, the food just randomly spawns. I changed the code a bit so the food always spawns on the same grid the Snake runs on, which makes it way more reliable to pick up. I also added a border, but I’m less satisfied with that because it seems Python renders things a little funny and slightly off center. I suspect that Python Turtle things are drawn from one corner of the coordinates, and not centered on the coordinates.  

Project 22 – Pong

I’m rather proud of this one, so we’ll cap this round of write ups off with it.  the Day 22 project was to recreate Pong, often cited as the “first video game”.  It’s simple, two players each have a paddle on opposite sides of the screen, the ball bounces back and forth, missing the ball grants score to the other player.

This project combines quite a few things leading up to this point from creating custom classes to taking inputs, drawing graphics. Ok, so technically the Snake Game did all of this. My real win here is doing it, entirely on my own, before watching ANY of the class videos for the day. (Ok, I may have watched the intro just to get the scope of the project, sometimes the instructor throws some curve-ball concepts in.)

Now, don’t get me wrong, in most cases, I do the coding on my own, based on what’s presented, but I usually follow along with the class and do each bit as it’s asked.  I also will occasionally “correct” my code to better align with what’s presented, not because I think my methods are wrong, but more because the instructor may later introduce a concept that needs the code to be set up a particular way.

For Pong, I did it all.  I am plenty familiar with how Pong works, I viewed the 2 minute “What we are making” first video to get the scope, then went off on my own.  And it all worked out.  Getting the ball bounce just right was the trickiest part, mostly because I started seriously over thinking the physics of it.

It really helps that I already know how to code, and have a “programmer mindset” on how to step through things.

  • Make a Paddle Class
  • Spawn a Paddle
  • Make the Paddle Move
  • Keep the Paddle within the screen constraints.
  • Spawn a second Paddle
  • Make the second paddle move with different keys
  • Draw the net
  • Make a ball
  • Make the ball move
  • Make the ball bounce withing the screen
  • Make the score board
  • Make the Scoreboard increment when the ball bounces off of left and right
  • Make the ball re-spawn when it hits left or right instead of bounce
  • Make the ball bounce off of a paddle

That’s pretty much it, the steps to making Pong.   Most of them are super easy.  I even added an additional class that would draw a “net” across the center of the screen that wasn’t required (not that anyone is grading this code).

The point was, I did it all, then watched the videos, just to see if there was any better way to do some of the things I had done.  Felt like a pretty cool accomplishment.

Git Gud at Git

Something that I’ve have tried and fail to quite get a hang of is how to use Git, and more specifically, GitHub. I’ve had a GitHub account for a good while, and at one point I “had” a ton of repositories because instead of simply following something, I would fork it to make a copy of it.  I feel like that’s not quite the proper way to go about using things.  At some point I went through and removed all of these forks and was left with only my code projects.  

I still wanted to get a better feel for the proper flow, which I probably still don’t have, and won’t really ever get until I (if ever) have the chance to actually work with a team using Git.  Most of my projects are just single upload drops.  Almost everything I do I just revise and edit locally until it works, then if I want to “show it off” I scrub out any static IPs, usernames, and passwords, and upload it to a repository.  I suppose there really isn’t anything inherently wrong with this method.  I have, more recently been pushing myself to use Visual Studio Code (VSCode) lately and a few other code apps, and thus I’ve been pushing some repositories with those various program’s interfaces.  

I did get to go through the proper, Fork, Branch, Edit, Push process a few times while doing some lessons through Twilioquest, and a follow up afterwards.  

Still, I think my main take away is… it doesn’t really matter how the code gets there.   Especially for single person projects.  I don’t really have any overall “goal” here.  It feels like the common goal is to try to get a programming job, but I am not really sure I would want that, at least not anytime soon.  I mostly just want to show off some mediocre code to the nobody who would look at it.  

Speaking of “showing off”, I also found that the little games I had made last year with Microsoft’s MakeCode Arcade are all browser based, so I set up a little page to specifically show off Code Projects.  Right now, it’s all just MakeCode games, but I plan to add more if it seems like a good idea.

Another motivator for this push to tidy up my GitHub is it’s an excuse to practice using Markdown more.  Like Git, it feels like a skill I should have practiced more much sooner than now.  The main page for each repository is just a specific Markdown file, so having more repositories means I can make many pages of different types.  I’ve also started using Markdown to draft out blog posts as well as for my archive of old writings.

So anyway, if you want to see some random bits of code, some from 20+ years ago, feel free to poke around my growing pile of repositories.   Actual Repositories, where I did some level of work on them.

Lastly, this whole post is annoying full of things I wanted to blog about but have not.  Markdown, MakeCode, Twilioquest, my huge writing archive, to name a few.