Ramen Junkie

Tracking Covid-19 into a database using Python

At some point I need to do a little write up on my Home Dashboard Project, it’s inspired quite a few minor projects such as this one to make little web widgets. The dashboard is the simple part, it’s just dumping a database query into a table. Honestly, the script was easy too, because I adapted it from another script I built recently.

With COVID-19 all over the news, I wanted to add some stats to my dashboard for my state. Not so much because there aren’t already 1000 other places to get the numbers, but more to see if I could do it. The hardest part was finding a feed to stats. Then I found CovidTracking.com. Which has a nice little API. I then set to work adapting another script to pull from this API to dump stats for Illinois into the database. I am only interested in Illinois, but the script is built so the user can put a list of states into an array, and then it will loop through and add them all to the database.

The script is below, but this also requires some set up in SQL. Nothing complicated, mostly INT fields. an id as an int and primary key, negative_cases, positive_cases, and deaths, all as INT, state as a varchar with a length of 2, though technically the length is optional, then finally date_stamp as a DATETIME field with a default value of the current timestamp. The DATETIME isn’t directly touched here, but it makes it easier to manipulate the data later.

The code also requires you enter your database credentials. I’ve nammed my table “il_covid_stats, but you can change that to whatever you want down below in the “SQL = “INSERT….” line. I’ll leave it up to you what to do with the data, I pull mine into a PHP page.

Anyway, here is the python code:

# Python Covid Star Tracking to SQL
# use of json package
# Sample URL: https://covidtracking.com/api/states?state=IL

import json
import requests
import time
import MySQLdb

mydb = MySQLdb.connect(
  host="localhost",
  user="YOUR_DB_USERNAME",
  passwd="YOUR_DB_PASSWORD",
  database="YOUR_DB_NAME"
)
mycursor = mydb.cursor()
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'

#States to check as an Array, two letter abbreviations
states = ['IL']

def data_getter(statename):
  ####when reading from remote URL
  url = 'https://covidtracking.com/api/states?state='+statename

  user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
  headers = {'User-Agent': user_agent}
  response = requests.get(url,headers=headers)
  html = response.content
  statedata = json.loads(html)

  pos_cases = (statedata['positive'])
  neg_cases = (statedata['negative'])
  deaths = (statedata['death'])

  vals = (pos_cases,neg_cases,deaths,statename)

  mysqlinsert(vals)

def mysqlinsert(vals):
  ## This database name and columns can be changed but should be pre made in your database
  SQL = "INSERT INTO il_covid_stats (positive_cases, negative_cases, deaths, state) VALUES (%s, %s, %s, %s)"
  mycursor.execute(SQL, vals)
  mydb.commit()

# Loop through URLs for each state
for i in states:
  data_getter(i)

Code Project: Network Map Webpage, Making it Better

I wrote a bit about my Network Map Webpage recently. It’s part of a larger home dashboard project I’m working on, but as part of that I’ve updated things a bit to make them more streamlined and easier to use. The biggest problem with the page as it was originally coded is that it shows everything. I’ve cycled most of my regularly used electronics onto the network so they could be captured by an arp scan, though not all of them are on all the time. For example, I still have a Raspbery Pi and Arduino set up to capture temperature data. I also have several Next Thing CHIP devices, though Next Thing has gone out of business. In total, between my IOT stuff and laptops, phones and tablets and the duplicate IPs from the network extender, I have 55 devices in the raw table.

So I set out to make this more manageable at a glance. My original query in my PHP code looked something like this:

SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner ORDER BY ip

By slipping in “WHERE last_seen >= NOW() – INTERVAL 5 MINUTE” just before ORDER BY, I can make the code return only currently connected devices. The ARP scan runs every 5 minutes, anything that has a last seen time stamp within 5 minutes is assumed to still be attached. This interval could be shorted to almost real time, but I don’t really need that much of a check.

I can also view all disconnected devices with a simple change of the above command, making it “WHERE last_seen <= NOW() – INTERVAL 5 MINUTE”. This wouldn’t work if I were still keeping historical data, but I essentially only capture the last seen data for any device. Essentially what this does is return everything not seen in the last 5 minutes.

I also broke out my PHP code that builds my table from my query into it’s own PHP function. This was I could set the variable $SQL for the active devices, call the function to build the table, then set $SQL for inactive devices and build a second table, under the first.

I immediately scrapped this, because it was ugly. Plus, sometimes I do want to see “everything”.

Enter some GET calls and an if/else statement.

	if($_GET['show'] == "active") {
	// SQL for selecting active devices
	$tabletitle="Active Devices";
	$sql = "SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner WHERE last_seen >= NOW() - INTERVAL 5 MINUTE ORDER BY ip";
	}
	elseif($_GET['show'] == "inactive") {
	// SQL for selecting active devices
	$tabletitle="Inactive Devices";
	$sql = "SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner WHERE last_seen <= NOW() - INTERVAL 5 MINUTE ORDER BY ip";
	}
	else {
	// SQL for Selecting all devices
	$tabletitle="All Devices";
	$sql = "SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner ORDER BY ip";
	}

Basically, if nothing, or a random string is passed by the URL variable “show”, then it goes to the end, and displays everything when accessing the page at index.php. If it passes index.php?show=active, it sets $SQL for showing active devices and if it gets index.php?show=inactive, it shows inactive devices. It also sets a variable called $tabletitle which is just echoed out into some header tags. I then added links across the top of the page to each of these filters.

This allows for a quick and easy toggle of which data is pulled and displayed.

Additionally, I updated the way the Add Device form works. Previously, the form would fill in the MAC, a Device Name and a Device Description, then it would POST to another PHP page that would insert the data into the table, then forward on back to the index page with a header redirect. I’m not going to get into too much detail on it here, but I also integrated the Network Map into my dashboard framework with a header, navigation, sidebar, and footer. It also uses a table based navigation system, so in order to view the network map, I am hitting “index.php?page=4”. Pages basically all need to be wrapped in this structure to work properly, so in order to make things flow better, the Add Device form now POSTs back to the Network Map page itself, which checks to see if the POST variables are set, and if they are, it inserts the new information, before pulling the table.

This also meant slightly altering my page calls to look for “index.php?page=4&show=active” and “index.php?page=4&show=inactive”.

Eventually I want to move the Add Device form to appear at the top of the page, so the whole thing is all handled in one single page.

Lastly, I made up a quick block of code in it’s own page, that simple counts and displays the number of currently connected devices on the network. This block is embedded on the front page of my Dashboard Framework and links to the full Network Map page. The general idea on the Dashboard is to have widgets like this that show quick glance information, with links to detailed information.

I have not built a lot of them yet, but one of the others I have built works somewhat similar to the ARP scanning system. A script makes a call to my TT-RSS instance for each of the segmented accounts I have, then dumps the unread count into a table on the server. The widget shows how many unread articles each topic/account has. I am still really bad about only actually reading the Basic feed (mostly Toys and Video Games).

But I will get into the Dashboard Widgets thing a bit more in a future post probably.

My Music Listening Habits for February 2020

Not a whole ton that’s overly exciting this month. I did listen to quite a bit of BT, whom is one of my all time favorite artists, even if my time listening to his stuff comes and goes. I think there was a bit of earlier stuff that I didn’t associate as being BT that I liked, but I remember that Never Gonna Come Back Down was the track that really got me into listening to more of BT’s music. It was on the soundtrack for Gone in 60 Seconds. In the actual movie, they play this really neat super low key remix of Never Gonna Come Back Down that I have never been able to find a copy of anywhere. It occurs when they are stealing the Ferraris from the garage.

Another older album I used to listen to on repeat that came back briefly this month was Rollergirl. Last.fm suggests Rollergirl hasn’t had a release since 2002, which is probably around the time I was listening to this music. Also, apparently Roller girl is some German dude.

Still plenty of Sigrid and Tessa Violet going on. Plus a bit more CHVRCHES this month. I have to say, the more I listen to CHVRCHES, the more I like their stuff. Not that I didn’t like it before, I just, like it more.

Anyway, I’m just gonna close out here with a little Spotify Playlist I made up that collects together the more low key Sigrid tracks available on the service.

Code Project: Network Map Webpage

I want to start off by saying, there isn’t going to be a ton of code here, and if there is code, it’s going to be super dirty. I’m fairly good at making code for “private use” that is pretty insecure, and not so great at code that’s scrubbed up and user friendly to distribute to others.

I’ve been working a bit on some local code projects, specifically for my little private “Dashboard” that runs on my file server. One project I’ve wanted to try for a while is a dynamic network tracker tool. I’ve looked into some options available, and they all seem to run as a plug in for some complicated 3rd party analytics software that often has some goofy complicated set up procedure that’s beyond “apt-get” or even just dumping a bunch of files in a web server directory.

This project is both kind of simple and not. It was fairly simple in set up and execution, but it’s somewhat complex in design. The first job was getting a list of currently connected devices on the network. This is easily done via the command line with an arp-scan request.

sudo arp-scan --localnet

The output of which looks something like this:

Using a pipe, I can shove all of this into a text file, which contains everything above.

sudo arp-scan --localnet | scan.txt

The trick is, how to display this output on a webpage. One way would be to pull it from a database. Pulling data from MySQL is pretty easy, dumping it to a pretty looking table is also easy. The harder part is getting the output of arp-scan to MySQL in a useful manner.

This is where Python comes into play. I am sure there are other methods or languages available, but like Python, and mostly know how to use Python. Essentially, I wrote a script that would open the file, scan.txt, that was created above. I am only concerned with lines that contain IP addresses, so I used the function “is_number()” to check if the first character of each line is numeric, if it is, it runs through a couple of operations.

Firstly, the output of arp-scan is tab delimited, so I can use the “split” function on “\t”, and dump the result into an array. This gives me an array of the IP address, MAC address, and Manufacturer. This sticks a new line in with the Manufacturer, so I did a “replace” on \n in the third item of the list. Lastly, I wanted the IPs to be uniformly formatted, so I write a little function that would add in leading zeros to the IP octets.

Finally, the Python builds an SQL statement from the line’s list, and make a call to the server to insert the values. A modified version of this code that just displays the resulting SQL commands instead of executing them is below.

#!/usr/bin/python

# Open a file

def is_number(s):
        try:
                float(s)
                return True                                                         except ValueError:
                return False

def format_ip(ipstring):
        octets = ipstring.split(".")
        n=0
        for i in octets:
                while(len(i)<3):
                        i=add_zero(i)
                octets[n]=i
                n=n+1
        return octets[0]+"."+octets[1]+"."+octets[2]+"."+octets[3]
        #return ipstring

def add_zero(shortstring):                                                          return "0"+shortstring


import MySQLdb

mydb = MySQLdb.connect(
  host="localhost",
  user="YOURSQLUSERNAME",
  passwd="YOURSQLPASSWORD",
  database="YOURTARGETDATABASE"
)

mycursor = mydb.cursor()

fo = open("scan.txt", "r")
#print ("Name of the file: ", fo.name)

fo.seek(0)

# read each line of the list
for line in fo:
        #check for lines that contain IP addresses
        if is_number(line[0]):                                                              #Convert lines into list
                line_list = line.split("\t")
                #remove line delimitors
                line_list[2]=line_list[2].replace("\n","")
                #Make IP Octets 3 digits
                line_list[0] = format_ip(line_list[0])
                SQL = "INSERT INTO arpscans (ip, mac, mfg) VALUES ("+line_l$                print SQL                                                   
fo.close()

It’s not super pretty, but it was a quick way to make sure everything came out looking correct. The table I used is called “arpscans” and contains columns called, “ip”, “mac”, “mfg”, and “last_seen”. The time stamp is an automatically generated time stamp.

I then created a shell script that would run the arp-scan piped into scan.txt then runt he python script. I set up this script in the root crontab to run once every half hour. Root is required to run the arp-scan command, so my user crontab wouldn’t cut it. Everything ran fine when I manually did a run of the script using sudo. The PHP on the other end out output the latest values based on the time stamp to a webpage.

This is where I ran into my first major hurdle. The script wasn’t running in cron. After a lot of digging and futzing, I found that basically, when cron runs the script, it works off of different environmental variables. I had to specify in ,y bash file, specifically where each command existed. The end result looks something like this:

#!/usr/bin/env bash
/usr/sbin/arp-scan --localnet > /home/ramen/scripts/arp_sql/scan.txt
/usr/bin/python /home/ramen/scripts/arp_sql/arp_post.py

Eventually the scan was running and posting data automatically as expected. After a few days, I ran into my second major issue. There was simply put, way too much data for my crappy old “server” to handle. The webpage slowed to a crawl as the table contained something like 9000+ entries. It’s possible and likely that my query was also rubbish, but rather than stress more figuring it out, I modified all of the code again.

Instead of adding a new entry for every MAC address every scan, I changed it to check if there already was an entry, and simply update the last_seen time. I had originally designed the system with the idea of getting legacy data for attached devices, but decided I only really cared about a generic history.

The new webpage table now displays all devices, current and previously seen, with the last seen date.

A few issues came up on the output end as well, though none of them were super hard to correct. One, I wanted a way to sort the table by clicking the headers. There are several scripts you can toss in your code to do this online.

I also wanted more data about each device, so I added a form where I could fill in more data about each device. Specifically, the network name, if there was one, a description of what the device is, the User of the device (which family member or if it’s just a network device). This also checks and updates based on MAC address.

I also ran into an issue with MAC addresses and my Network extender. When devices are connected to the Network Extender, the first half of the MAC is replaced with the first part of the Extender’s MAC, though they retain the last half. I may eventually write some code to detect and merge these entries, but for now, I’ve simply been labeling them in the description as “(Extender)”, so I know it’s the same device on the other connection.

The final end result looks something like this:

I used to have the network super organized before I moved, but the new router doesn’t work nicely with my Pi DHCP server, so I have not gotten things quite as nicely sorted as I would like. Everything in the picture is sorted, but above .100, it’s a mess. I also can’t assign IPs to some devices at all, like the DirecTV gear or my Amazon Echos, which is really annoying.

One of my future projects will hopefully correct this, as I want to put a second router on the network with DD-WRT, between the ISP gateway and everything else.

Overall, it’s been a fun little exercise in coding that combined a lot different techniques together in a fun way.

A Pile of Used Tech

I recently had an idea occur to me that I might be able to pick up used Raspberry Pis off of eBay more affordably than buying them new. I didn’t really find a ton of savings, but I did pick up an auction for a lot of various parts for fairly cheap.

I am not sure what I’m going to do with all of this, but it seemed like a deal for around $50. I was worried that it wouldn’t all be included, but it was. Not everything is what I had hoped though. Two of the three Raspberry Pi 2s seem to be dead. I’ve tried several trouble shooting methods so far. They turn on, but don’t seem to real SD cards at all.

The arduino is a genuine Arduino, which is nice, but its a fairly older model. Not a huge issue, but it is what it is. The screens were a nice bonus. I’ve been looking into getting a screen of some sort of my Pis, possibly for a RetroPi handheld build. I have not tested the larger screen yet, it seems to work off of a funky daisy chain of an extra board and some cables. I did get the smaller screen working… ish.

It’s a nice little touch screen that fits nicely on top of the Pi. I have not had a chance to properly troubleshoot it, but the touch works kind of funky on it. For one, it seems to function more like a track pad than a straight touch. Two, the mouse cursor only wants to move along a diagonal axis across the screen.

This all kind of feels like a configuration issue however, so there is some hope. Plus I am not sure I really need a touch interface for a RetroPi handheld build.

There’s some other fun stuff that I have not had a chance to mess with yet. There were a ton of ultra sonic sensors. I’m not sure what exactly these could be useful for, but I am wondering if they would be able to do 3D imaging of an object or a space.

There’s some funky board with a digital display on it that seems to be some sort of power board. I am not sure I’m going to have a use for this at all.

Lastly, there is a Raspberry Pi camera module. I have not had a chance to test it out yet, but like the screen, this was something I’ve been wanting to try out.