It’s kind of funny how one post can lead to another sometimes. This one is pretty basic but it also just shows a bit how useful I find knowing my way around computer systems to be. Yesterday I posted about my little annual music playlists. And as part of that, I wanted to actually post the playlist. I am pretty sure there is a fairly universal “playlist file type” out there and being open source, I had assumed that VLC on my phone stored the playlists somewhere in playlist files.
That assumption was wrong, it uses a .db file. A little portable database. There is an option to dump this file to the root of the phone, presumably for backup purposes, but it’s also useful to just browse it like I am doing here. The file itself can be opened and browsed with SQL Lite’s DB manager. It’s standard databases inside for tracks and artists and playlists.
Fortunately, I have had some experience dealing with database queries, so I set about building what was needed tog et the data I wanted. Pull the Playlist I want, in this case “2023 Best” but I could change that to do any available Playlist. This gives the tracks by id, but the tracks themselves are stored in a separate table for media. So that needs joined in. The media table stores track names, but not artist names, so an additional join is needed to get the artist names. This complicated things a bit because both the playlist table and artist table have a column “name” so more clarity needed to be added.
The result was this little query that dumps out a basic table of Artist and Song title.
SELECT Artist.name, Media.title
Inner Join playlistmediarelation ON playlist_id=id_playlist
Inner Join Media ON id_media=media_id
Inner Join Artist ON media.artist_id=Artist.id_artist
WHERE Playlist.name = '2023 Best'
ORDER BY Artist.name
Now, I could have done some cute clever trick now to merge the two into a new column and add in a ” – ” between but it was easier to drop it all into a notepad file and do a fine/replace on the weird space character that it stick in between the Artist and track title.
The added bonus here is I can easily use this query again anytime I want to dump a Playlist to text.
This is one of those quick and kind of dirty projects I’ve been meaning to do for a while. Basically, I wanted a script that would scrape all of the top level comments from a Reddit post and push them out to a list. Most commonly, to use on /r/AskReddit style threads like, well, for this example, “What is a song from the 90s that young people should listen to.”
Basically, threads that ask for useful opinions on list. Sometimes it’s lists of websites or something. Often it’s music. The script here is made for music but could be adjusted for any thread. Here is the script, I’ll touch on it a bit in more detail after.
## Create an APP for Secrets here:
## Thread to scrape goes here, replace the one below
url = "https://www.reddit.com/r/Music/comments/10c4ki0/name_one_90s_song_kids_born_after_2000_should_add/"
## Fill in API Information here
reddit = praw.Reddit(
user_agent= "script by u/", # Your Username, not really required though
submission = reddit.submission(url=url)
submission.comment_limit = 1
for x in submission.comments:
with open("output.txt", mode="a", encoding="UTF-8") as file:
if "-" in x.body:
The script uses praw, Python Reddit API Wrapper. A Library made for use in Python and the Reddit API. It requires free keys which can be gotten here: https://www.reddit.com/prefs/apps. Just create an app, the Client ID is a jumble of letters under the name, the secret is labeled. User Agent can be whatever really, but it’s meant to be informative.
The thread URL also needs filled in.
The script then pulls the thread data and pulls the top level comments.
I’m interested in text file lists mostly, though for the sake of music based lists, if I used Spotify, I might combine it with the Spotify Playlist maker from my 100 Days of Python course. Like I said before though, this script is made for pulling music suggestions, with this but of code:
if "-" in x.body:
It’s simple, but if the comment contains a dash, as in “Taylor Swift – Shake it Off” or “ACDC – Back in Black”, it writes it to the file. Otherwise it discards it. There is a chance it means discarding some submissions, but this isn’t precision work so I’m OK with that to filter out the chaff. If I were looking for URLs or something, I might look for “http” in the comment. I could also eliminate the “if” statement and just have it write all the comments to a file.
I want to start off by saying, there isn’t going to be a ton of code here, and if there is code, it’s going to be super dirty. I’m fairly good at making code for “private use” that is pretty insecure, and not so great at code that’s scrubbed up and user friendly to distribute to others.
I’ve been working a bit on some local code projects, specifically for my little private “Dashboard” that runs on my file server. One project I’ve wanted to try for a while is a dynamic network tracker tool. I’ve looked into some options available, and they all seem to run as a plug in for some complicated 3rd party analytics software that often has some goofy complicated set up procedure that’s beyond “apt-get” or even just dumping a bunch of files in a web server directory.
This project is both kind of simple and not. It was fairly simple in set up and execution, but it’s somewhat complex in design. The first job was getting a list of currently connected devices on the network. This is easily done via the command line with an arp-scan request.
sudo arp-scan --localnet
The output of which looks something like this:
Using a pipe, I can shove all of this into a text file, which contains everything above.
sudo arp-scan --localnet | scan.txt
The trick is, how to display this output on a webpage. One way would be to pull it from a database. Pulling data from MySQL is pretty easy, dumping it to a pretty looking table is also easy. The harder part is getting the output of arp-scan to MySQL in a useful manner.
This is where Python comes into play. I am sure there are other methods or languages available, but like Python, and mostly know how to use Python. Essentially, I wrote a script that would open the file, scan.txt, that was created above. I am only concerned with lines that contain IP addresses, so I used the function “is_number()” to check if the first character of each line is numeric, if it is, it runs through a couple of operations.
Firstly, the output of arp-scan is tab delimited, so I can use the “split” function on “\t”, and dump the result into an array. This gives me an array of the IP address, MAC address, and Manufacturer. This sticks a new line in with the Manufacturer, so I did a “replace” on \n in the third item of the list. Lastly, I wanted the IPs to be uniformly formatted, so I write a little function that would add in leading zeros to the IP octets.
Finally, the Python builds an SQL statement from the line’s list, and make a call to the server to insert the values. A modified version of this code that just displays the resulting SQL commands instead of executing them is below.
# Open a file
return True except ValueError:
octets = ipstring.split(".")
for i in octets:
def add_zero(shortstring): return "0"+shortstring
mydb = MySQLdb.connect(
mycursor = mydb.cursor()
fo = open("scan.txt", "r")
#print ("Name of the file: ", fo.name)
# read each line of the list
for line in fo:
#check for lines that contain IP addresses
if is_number(line): #Convert lines into list
line_list = line.split("\t")
#remove line delimitors
#Make IP Octets 3 digits
line_list = format_ip(line_list)
SQL = "INSERT INTO arpscans (ip, mac, mfg) VALUES ("+line_l$ print SQL
It’s not super pretty, but it was a quick way to make sure everything came out looking correct. The table I used is called “arpscans” and contains columns called, “ip”, “mac”, “mfg”, and “last_seen”. The time stamp is an automatically generated time stamp.
I then created a shell script that would run the arp-scan piped into scan.txt then runt he python script. I set up this script in the root crontab to run once every half hour. Root is required to run the arp-scan command, so my user crontab wouldn’t cut it. Everything ran fine when I manually did a run of the script using sudo. The PHP on the other end out output the latest values based on the time stamp to a webpage.
This is where I ran into my first major hurdle. The script wasn’t running in cron. After a lot of digging and futzing, I found that basically, when cron runs the script, it works off of different environmental variables. I had to specify in ,y bash file, specifically where each command existed. The end result looks something like this:
Eventually the scan was running and posting data automatically as expected. After a few days, I ran into my second major issue. There was simply put, way too much data for my crappy old “server” to handle. The webpage slowed to a crawl as the table contained something like 9000+ entries. It’s possible and likely that my query was also rubbish, but rather than stress more figuring it out, I modified all of the code again.
Instead of adding a new entry for every MAC address every scan, I changed it to check if there already was an entry, and simply update the last_seen time. I had originally designed the system with the idea of getting legacy data for attached devices, but decided I only really cared about a generic history.
The new webpage table now displays all devices, current and previously seen, with the last seen date.
A few issues came up on the output end as well, though none of them were super hard to correct. One, I wanted a way to sort the table by clicking the headers. There are several scripts you can toss in your code to do this online.
I also wanted more data about each device, so I added a form where I could fill in more data about each device. Specifically, the network name, if there was one, a description of what the device is, the User of the device (which family member or if it’s just a network device). This also checks and updates based on MAC address.
I also ran into an issue with MAC addresses and my Network extender. When devices are connected to the Network Extender, the first half of the MAC is replaced with the first part of the Extender’s MAC, though they retain the last half. I may eventually write some code to detect and merge these entries, but for now, I’ve simply been labeling them in the description as “(Extender)”, so I know it’s the same device on the other connection.
The final end result looks something like this:
I used to have the network super organized before I moved, but the new router doesn’t work nicely with my Pi DHCP server, so I have not gotten things quite as nicely sorted as I would like. Everything in the picture is sorted, but above .100, it’s a mess. I also can’t assign IPs to some devices at all, like the DirecTV gear or my Amazon Echos, which is really annoying.
One of my future projects will hopefully correct this, as I want to put a second router on the network with DD-WRT, between the ISP gateway and everything else.
Overall, it’s been a fun little exercise in coding that combined a lot different techniques together in a fun way.