[Blogging Intensifies]

Technology, Projects, Linux, Coding, Internet of Things, Music, Books, Life...

Projects

Code Project: Automated List From Reddit Comments

January 16, 2023

This is one of those quick and kind of dirty projects I’ve been meaning to do for a while. Basically, I wanted a script that would scrape all of the top level comments from a Reddit post and push them out to a list. It’s mostly meant for /r/AskReddit style threads like, well, for this example, “What is a song from the 90s that young people should listen to.”

Basically, threads that ask for opinions that are useful as a list. Sometimes it’s lists of websites or something. Often it’s music. The script here is made for music but could be adjusted for any thread. Here is the script; I’ll touch on it in a bit more detail after.

## Create an app for secrets here:
## https://www.reddit.com/prefs/apps

import praw

## Thread to scrape goes here, replace the one below
url = "https://www.reddit.com/r/Music/comments/10c4ki0/name_one_90s_song_kids_born_after_2000_should_add/"

## Fill in API information here
reddit = praw.Reddit(
    client_id="",
    client_secret="",
    user_agent="script by u/",  # Your username, not really required though
    redirect_uri="http://localhost:8080",
)

submission = reddit.submission(url=url)
## Drop the "load more comments" placeholders so only real comments remain
submission.comments.replace_more(limit=0)

## Open the output file once, then append each matching top level comment
with open("output.txt", mode="a", encoding="UTF-8") as file:
    for x in submission.comments:
        if "-" in x.body:
            file.write(x.body + "\n")
            # print(x.body)

The script uses praw, the Python Reddit API Wrapper, a library for working with the Reddit API from Python. It requires free API keys, which you can get here: https://www.reddit.com/prefs/apps. Just create an app; the Client ID is the jumble of letters under the app name, and the secret is labeled. The User Agent can be whatever really, but it’s meant to be informative.

The thread URL also needs to be filled in.

The script then pulls the thread data and pulls the top level comments.

I’m interested in text file lists mostly, though for the sake of music based lists, if I used Spotify, I might combine it with the Spotify Playlist maker from my 100 Days of Python course. Like I said before though, this script is made for pulling music suggestions, with this bit of code:

        if "-" in x.body:
            file.write(str(x.body)+"\n")
            # print(x.body)

It’s simple, but if the comment contains a dash, as in “Taylor Swift – Shake it Off” or “ACDC – Back in Black”, it writes it to the file. Otherwise it discards it. There is a chance it means discarding some submissions, but this isn’t precision work so I’m OK with that to filter out the chaff. If I were looking for URLs or something, I might look for “http” in the comment. I could also eliminate the “if” statement and just have it write all the comments to a file.
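As a rough sketch of those variations, the filter can be pulled out into a small function that takes the marker text as a parameter. The `filter_comments` name and the sample comments are made up for illustration; the list of strings stands in for the `x.body` values praw returns. It also accepts an en dash alongside the plain hyphen, which is an addition beyond the original script.

```python
def filter_comments(bodies, markers=("-", "\u2013")):
    """Return only the comment bodies that contain at least one marker.

    Pass markers=("http",) to keep comments with links, or markers=None
    to keep every comment.
    """
    if markers is None:
        return list(bodies)
    return [body for body in bodies if any(m in body for m in markers)]


# Hypothetical sample comments standing in for x.body values
sample = [
    "Nirvana - Smells Like Teen Spirit",
    "All of them, honestly",
    "Listen here: https://example.com/playlist",
]

songs = filter_comments(sample)                      # default: hyphen or en dash
links = filter_comments(sample, markers=("http",))   # link hunting instead
everything = filter_comments(sample, markers=None)   # no filtering at all
```

In the real script, each kept body would then be written to the output file the same way as before.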

Posted in: Programming Projects Tagged: music, Programming, Projects, Python, Reddit

100 Days of Python, Projects 66-70 #100DaysofCode

November 16, 2022

Whew, I didn’t really think I’d get to 9 parts in this series, and I am only around 2/3rds of the way through even.  I actually may change up the format later with the last 20 projects that are listed as “Professional”.  Maybe one post each.

The bulk of this round is wrapping up the Flask projects and building a simple blog that runs on Python. It’s been fun. I’ve been a bit busier than normal, so my pace has slowed, but that’s ok too. Day 66 in particular felt like it took longer than it really should have, given how little it felt like it was doing.

Day 66 – Build RESTful APIs

Kind of a different sort of project. It’s building something, but not really anything with any sort of interface. All of the interaction is done through Postman (or the URL if one wants), and the responses are all JSON of some sort.

We built 6 API endpoints to work with a database of Coffee shops.

  • One returned all of the shops.
  • One returned all of them in a particular city.
  • One returned a specific shop only.
  • One updated the price of coffee.
  • One added a new Coffee Shop.
  • One deleted a Coffee Shop.

There was also built-in error handling for cases like a Coffee Shop not existing. It all feels like it could be useful later in the Blog Project, though it’s not too useful on its own.
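For a feel of what those endpoints look like, here is a minimal sketch in Flask. It is not the course code: the route paths, field names, and sample shops are made up, and an in-memory dictionary stands in for the database. Listing all shops and filtering by city share one route here, using an optional `city` query parameter.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Assumption: an in-memory dict stands in for the course's database
shops = {
    1: {"name": "Bean There", "city": "Springfield", "coffee_price": "2.50"},
    2: {"name": "Daily Grind", "city": "Shelbyville", "coffee_price": "3.00"},
}


@app.route("/shops", methods=["GET"])
def all_shops():
    # The optional ?city= filter covers the "particular city" case
    city = request.args.get("city")
    result = [
        {"id": shop_id, **shop}
        for shop_id, shop in shops.items()
        if city is None or shop["city"] == city
    ]
    return jsonify(shops=result)


@app.route("/shops/<int:shop_id>", methods=["GET"])
def one_shop(shop_id):
    if shop_id not in shops:
        return jsonify(error="Coffee shop not found"), 404
    return jsonify(shop={"id": shop_id, **shops[shop_id]})


@app.route("/shops", methods=["POST"])
def add_shop():
    data = request.get_json()
    new_id = max(shops, default=0) + 1
    shops[new_id] = {
        "name": data.get("name", ""),
        "city": data.get("city", ""),
        "coffee_price": data.get("coffee_price", ""),
    }
    return jsonify(success="Coffee shop added", id=new_id), 201


@app.route("/shops/<int:shop_id>/price", methods=["PATCH"])
def update_price(shop_id):
    if shop_id not in shops:
        return jsonify(error="Coffee shop not found"), 404
    new_price = request.args.get("new_price")
    if new_price is None:
        return jsonify(error="new_price is required"), 400
    shops[shop_id]["coffee_price"] = new_price
    return jsonify(success="Price updated")


@app.route("/shops/<int:shop_id>", methods=["DELETE"])
def delete_shop(shop_id):
    if shops.pop(shop_id, None) is None:
        return jsonify(error="Coffee shop not found"), 404
    return jsonify(success="Coffee shop deleted")
```

Every response is JSON, including the errors, which is what makes it easy to poke at from Postman.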

Day 67 – Blog Capstone Project Part 3

Back to the Blog Project, and after this round it’s a LOT more Blog Like, though incredibly insecure.  The Security Part looks to be the subject of the next couple of lessons though, so it’s all good.  We basically started with the last Blog Project from Day 59, and added the ability to add, edit, and delete posts.  Also with a Database back end.

One tricky moment here.  The idea was to use the same form page for New Post and Edit Post.  Originally I had nested these together, and had some if/else cases to check if it was Editing or Adding, and kind of got stuck on how to handle the date, since the date doesn’t get changed on updates.

But then I had a bit of an epiphany moment, the kind you get often with coding, where you bang against a problem, convinced it’s the way to solve things, only to realize there is an obvious, easy method.

I split them up. No more if/else, and no more worrying about what the DB is doing at the end. Two separate @app.route decorators, each with its own function, can render the same HTML template (the Edit Form). The only thing needed was to pass a header variable along to change the title.
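A stripped down sketch of that idea, with the caveat that none of it is the actual course code: the route paths, the `FORM` template string, and the `posts` dictionary (standing in for the database) are all made up for illustration. Both routes render the same form, and only the `heading` variable differs.

```python
from datetime import date

from flask import Flask, abort, redirect, render_template_string, request, url_for

app = Flask(__name__)

# Assumption: an in-memory dict stands in for the blog database
posts = {1: {"title": "Hello", "body": "First post", "date": "November 01, 2022"}}

# One form shared by both routes; only the heading changes
FORM = """
<h1>{{ heading }}</h1>
<form method="post">
  <input name="title" value="{{ post.title }}">
  <textarea name="body">{{ post.body }}</textarea>
  <button type="submit">Save</button>
</form>
"""


@app.route("/new-post", methods=["GET", "POST"])
def new_post():
    if request.method == "POST":
        new_id = max(posts, default=0) + 1
        posts[new_id] = {
            "title": request.form["title"],
            "body": request.form["body"],
            "date": date.today().strftime("%B %d, %Y"),  # set once, on creation
        }
        return redirect(url_for("edit_post", post_id=new_id))
    blank = {"title": "", "body": ""}
    return render_template_string(FORM, heading="New Post", post=blank)


@app.route("/edit-post/<int:post_id>", methods=["GET", "POST"])
def edit_post(post_id):
    post = posts.get(post_id)
    if post is None:
        abort(404)
    if request.method == "POST":
        # Only the title and body change; the original date is left alone
        post["title"] = request.form["title"]
        post["body"] = request.form["body"]
        return redirect(url_for("edit_post", post_id=post_id))
    return render_template_string(FORM, heading="Edit Post", post=post)
```

Because the edit function never touches the date, the "what do I do with the date on update" question goes away.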

It was a total “Duuuuuuh” moment.

Anyway, no security though, because there are no users or security keys and anyone can just come in and post anything and delete anything and it would be total chaos as an actual production Blog platform.

Day 68 – Authentication with Flask

Oops, I spoiled what this topic was going to be in the last project write up.  This whole section is working with Authentication using a Database for persistence.  It’s a simple website, a home page, a user creation page, a log in page, and a secrets page.  Users can create an account, and once registered, they can access the Secrets page and download a PDF.  

Part of this exercise is also restricting access for not logged in users and another part is proper security and handling passwords with hashing and salting.
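To make the hashing and salting part concrete, here is a minimal sketch using only Python's standard library. Flask projects commonly lean on Werkzeug's `generate_password_hash` and `check_password_hash` instead, so treat this as an illustration of the idea, not what the course project does. The function names, the `salt$hash` storage format, and the iteration count are all assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # work factor; an illustrative value, not a recommendation


def hash_password(password: str) -> str:
    """Return 'salt$hash' (both hex encoded) for storing in the user table."""
    salt = secrets.token_bytes(16)  # a fresh random salt for every user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def check_password(stored: str, candidate: str) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), digest_hex)


stored = hash_password("correct horse")
login_ok = check_password(stored, "correct horse")
login_bad = check_password(stored, "wrong guess")
```

The salt means two users with the same password end up with different stored values, and the plain text password is never saved anywhere.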

Of all of the projects so far, I feel like this may be one of the most important ones, despite its basic simplicity. I’ve been planning to work out the best way to share some of the projects built so far, and using Flask was a good “first step”, but being security conscious, I would rather not throw a bunch of web based Flask pages up with no restrictions on access. Like last session, there was the Library App or the Top Ten Movies page. Sure, I could put them up on the web-server, map some sub domain to them, but anyone could edit them without proper, persistent security and restricted access.

Now, I can make that happen. Plus, with the modularity of routing and such, I can easily slip them into parts of the overall Blog Project, in time.

Day 69 – Blog Capstone Project Part 4

And here it is, the culmination of the Python based Flask Blog. It’s neat. I like it, well, I like the basic functionality of it. The layout is a bit plain but that’s fixable. I doubt it ever replaces WordPress, I love WordPress, but I could definitely see uses for this finished product. I may actually modify it to work as a sort of “Twitter Replacement” since Twitter is currently burning down. Using what I have learned, I could easily set this up to take a post, then “Syndicate” it out via API calls to Twitter, Facebook, Mastodon, or anywhere, while keeping my own archive of posts.

My current, next up To Do Items:

  • Create RSS Feed Page  
  • Create Admin Page  
  • Enable/Disable Comments for Posts  
  • Allow Admin to delete Comments  
  • Allow Users to delete their own comments  
  • Add Mailer.class and add email notifications  
  • Add Pagination to Home Page  
  • Add Tags and Category Options to Posts

Also, on a side note, this actually isn’t the first time I’ve built a “Blog Platform”. I built a basic one a few years ago in HTML and PHP. Heck, the system I used in the early 2000s with SHTML Pages was sort of a “Blog Platform”.

I’d recommend, for anyone going through this course, check the notes on the Blog courses. There are a lot of good suggestions for improvements, especially in stopping things like JavaScript injection in the comments. When you go to Angela’s (The Instructor) Blog link, it immediately throws out a JavaScript pop up that someone dropped in the comments.
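As a small illustration of the injection problem, the sketch below escapes user supplied text before it goes into the page, using the standard library's `html.escape`. The `render_comment` function and the sample comment are made up. In a Flask template, Jinja normally escapes variables on its own, so the usual culprit is output that has been explicitly marked as safe.

```python
from html import escape


def render_comment(author: str, text: str) -> str:
    """Build the HTML for one comment with all user input escaped."""
    return f"<p><strong>{escape(author)}</strong>: {escape(text)}</p>"


# A hypothetical malicious comment, like the pop up mentioned above
malicious = '<script>alert("gotcha")</script>'
safe_html = render_comment("visitor", malicious)
```

With the angle brackets turned into `&lt;` and `&gt;`, the browser shows the script as text instead of running it.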

Day 70 – Hosting with Heroku

So, there isn’t really a Day 70 project. I did go through the section, and the most useful part for me was the more robust SQL solution mentioned in the last section. The first few parts were about Git and GitHub, which I am already using. The middle part was about using Heroku, which I have heard of and used a bit before, but for the long term, I don’t need to use a freemium hosting service in a janky way. I have a VPS, and later I will figure out how to get Apache to play nice with Flask.

For now, I’ve split the Blog Project into its own repository and worked up my initial planned ToDo List. But I need to keep my focus on the course for now.

Posted in: 100DaysOfCode Tagged: #100DaysOfCode, Coding, Projects, Python

FreshRSS and RSS Feed Posts

November 3, 2022

Keen observers (ha ha ha no one reads this) might have noticed that a few posts of links showed up in the feed. These are basically stories I read in my RSS reader that I found interesting and wanted to share, or at least keep track of. The posts as of now are a little ugly, and I’ll probably clean up the formatting over time, but I wanted to go ahead and write a bit about the process. I’ll have the Code on GitHub at some point.

As for the reasons, firstly, this is something I’ve wanted to have on my blog for a while. Like a long while. I might even try to see if there are ways to better split up the links by topic later. A fair number of blogs I subscribe to have these sort of link digest posts, and I’ve always just liked the idea. It’s also good for personal reference to when I may have read something. It is limited, as it only comes from my RSS Reader.

Speaking of my RSS Reader. I’ve moved on from Tiny Tiny RSS (TT-RSS), for a few reasons. One, the interface is a little meh, honestly. Maybe the newer version is better but it’s only available in Docker, and Docker is such a PITA to use. Also, while looking for alternatives, it sounds like the folks who make TT-RSS are kind of a bunch of gatekeeping jerk types, and I’d rather not support that. I also find the need to keep the update daemon running with Screen to be a pain. So I’ve moved over to FreshRSS, which I just run locally on a Raspberry Pi. I may move it to a publicly accessible machine at some point, but I am not entirely convinced that TT-RSS wasn’t the entry point for my previous server malware woes.

So, like TT-RSS, FreshRSS has a way to get an RSS feed out of your Favorited posts. In the past I’ve used tools like IFTTT to automate posting these links around, but I don’t use IFTTT anymore for reasons I’m not going into. Fortunately, I’ve been working to become a pretty good Python coder for the last month or so. So instead I wrote a script.

It’s not even a particularly complicated script. There are only two things it really needs to do: get new articles, and then post them to WordPress. Since the script runs locally, on the same Raspberry Pi even, it can easily reach and pull the RSS feed. One nice thing I noticed with FreshRSS is that the feed included a time interval, so just getting new posts was super simple, because the interval is just “24” for “24 hours”. The script eventually will run on a cronjob at the exact same time daily. Anyway, after pulling the RSS, the entries are already in an easily usable dictionary, which gets fed into the construction of the WordPress post.

import feedparser

def get_feed(feed_url):
    # Download and parse the feed; the articles end up in NewsFeed.entries
    NewsFeed = feedparser.parse(feed_url)
    return NewsFeed

The posting part was pretty easy as well, WordPress has an API, and Python also has a library that can use that API.  It just needs some log in information and a post payload to send.  

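from wordpress_xmlrpc import Client, WordPressPost
from wordpress_xmlrpc.methods.posts import NewPost

# wp_url, wp_user, wp_pass and cur_date are set elsewhere in the script
def make_post(NewsFeed):
    wp = Client(f'https://{wp_url}/xmlrpc.php', wp_user, wp_pass)
    post = WordPressPost()
    post.title = f"{cur_date} - Link List"
    post.terms_names = {'category': ['Link List'], 'post_tag': ['links', 'FreshRSS']}
    post.content = f"<p>Blogging Intensifies Link List for {cur_date}</p>"
    for each in NewsFeed.entries:
        # For a date like "Thu, 03 Nov 2022 12:34:56 +0000", [5:-15] keeps "03 Nov 2022"
        post.content += f'<p>{each.published[5:-15].replace(" ", "-")} - <a href="{each.links[0].href}">{each.title}</a></p>'
    # Assumption: the finished post is sent to WordPress and published right away
    post.post_status = 'publish'
    wp.call(NewPost(post))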

The trickiest part was formatting the date a bit prettier. I mentioned cleaning up the formatting a bit; I’m thinking maybe a simple invisible table, so the date and the links don’t wrap oddly like they do now. I also added a check so that if there are no new favorited posts, it will skip making a post. Otherwise I’ll end up with empty posts on days I forget to check my feed reader.

While writing the script, at first I was just outputting a text copy of the post to the console until satisfied. Eventually, I pushed out a real post, then verified that things worked. The next day was just a straight test by opening the project, then running it again. The third day, I copied the files and installed the libraries needed, then posted from the Pi. Phase 4 of this will be to set up Cron to run it automatically. If that works, then it should “just run” for the foreseeable future.
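Both of those pieces can be sketched without WordPress in the loop. This is not the script's actual code: it assumes the feed dates look like "Thu, 03 Nov 2022 12:34:56 +0000", the function names are made up, and the sample entry is a plain dictionary shaped like a feedparser entry. Parsing the date properly avoids depending on character positions the way the slice does.

```python
from email.utils import parsedate_to_datetime


def pretty_date(published: str) -> str:
    """Turn an RFC 822 style feed date into DD-Mon-YYYY."""
    return parsedate_to_datetime(published).strftime("%d-%b-%Y")


def build_link_list(entries):
    """Return the HTML for the post body, or None when there is nothing new."""
    if not entries:
        return None  # the caller skips posting on empty days
    lines = [
        f'<p>{pretty_date(e["published"])} - <a href="{e["link"]}">{e["title"]}</a></p>'
        for e in entries
    ]
    return "\n".join(lines)


# Hypothetical entry shaped like what feedparser returns
sample = [{
    "published": "Thu, 03 Nov 2022 12:34:56 +0000",
    "link": "https://example.com/story",
    "title": "An Interesting Story",
}]

body = build_link_list(sample)
nothing_new = build_link_list([])
```

The calling code only needs to check for `None` before building and sending the post.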

Posted in: Coding Tagged: Coding, Projects, Python, RSS

Copyright © 2023 [Blogging Intensifies].
