I want to start off by saying, there isn’t going to be a ton of code here, and if there is code, it’s going to be super dirty. I’m fairly good at making code for “private use” that is pretty insecure, and not so great at code that’s scrubbed up and user friendly to distribute to others.

I’ve been working a bit on some local code projects, specifically for my little private “Dashboard” that runs on my file server. One project I’ve wanted to try for a while is a dynamic network tracker tool. I’ve looked into some options available, and they all seem to run as a plug in for some complicated 3rd party analytics software that often has some goofy complicated set up procedure that’s beyond “apt-get” or even just dumping a bunch of files in a web server directory.

This project is both kind of simple and not. It was fairly simple in set up and execution, but it’s somewhat complex in design. The first job was getting a list of currently connected devices on the network. This is easily done via the command line with an arp-scan request.

sudo arp-scan --localnet

The output of which looks something like this:

Using a pipe, I can shove all of this into a text file, which contains everything above.

sudo arp-scan --localnet | scan.txt

The trick is, how to display this output on a webpage. One way would be to pull it from a database. Pulling data from MySQL is pretty easy, dumping it to a pretty looking table is also easy. The harder part is getting the output of arp-scan to MySQL in a useful manner.

This is where Python comes into play. I am sure there are other methods or languages available, but like Python, and mostly know how to use Python. Essentially, I wrote a script that would open the file, scan.txt, that was created above. I am only concerned with lines that contain IP addresses, so I used the function “is_number()” to check if the first character of each line is numeric, if it is, it runs through a couple of operations.

Firstly, the output of arp-scan is tab delimited, so I can use the “split” function on “\t”, and dump the result into an array. This gives me an array of the IP address, MAC address, and Manufacturer. This sticks a new line in with the Manufacturer, so I did a “replace” on \n in the third item of the list. Lastly, I wanted the IPs to be uniformly formatted, so I write a little function that would add in leading zeros to the IP octets.

Finally, the Python builds an SQL statement from the line’s list, and make a call to the server to insert the values. A modified version of this code that just displays the resulting SQL commands instead of executing them is below.

#!/usr/bin/python

# Open a file

def is_number(s):
        try:
                float(s)
                return True                                                         except ValueError:
                return False

def format_ip(ipstring):
        octets = ipstring.split(".")
        n=0
        for i in octets:
                while(len(i)<3):
                        i=add_zero(i)
                octets[n]=i
                n=n+1
        return octets[0]+"."+octets[1]+"."+octets[2]+"."+octets[3]
        #return ipstring

def add_zero(shortstring):                                                          return "0"+shortstring


import MySQLdb

mydb = MySQLdb.connect(
  host="localhost",
  user="YOURSQLUSERNAME",
  passwd="YOURSQLPASSWORD",
  database="YOURTARGETDATABASE"
)

mycursor = mydb.cursor()

fo = open("scan.txt", "r")
#print ("Name of the file: ", fo.name)

fo.seek(0)

# read each line of the list
for line in fo:
        #check for lines that contain IP addresses
        if is_number(line[0]):                                                              #Convert lines into list
                line_list = line.split("\t")
                #remove line delimitors
                line_list[2]=line_list[2].replace("\n","")
                #Make IP Octets 3 digits
                line_list[0] = format_ip(line_list[0])
                SQL = "INSERT INTO arpscans (ip, mac, mfg) VALUES ("+line_l$                print SQL                                                   
fo.close()

It’s not super pretty, but it was a quick way to make sure everything came out looking correct. The table I used is called “arpscans” and contains columns called, “ip”, “mac”, “mfg”, and “last_seen”. The time stamp is an automatically generated time stamp.

I then created a shell script that would run the arp-scan piped into scan.txt then runt he python script. I set up this script in the root crontab to run once every half hour. Root is required to run the arp-scan command, so my user crontab wouldn’t cut it. Everything ran fine when I manually did a run of the script using sudo. The PHP on the other end out output the latest values based on the time stamp to a webpage.

This is where I ran into my first major hurdle. The script wasn’t running in cron. After a lot of digging and futzing, I found that basically, when cron runs the script, it works off of different environmental variables. I had to specify in ,y bash file, specifically where each command existed. The end result looks something like this:

#!/usr/bin/env bash
/usr/sbin/arp-scan --localnet > /home/ramen/scripts/arp_sql/scan.txt
/usr/bin/python /home/ramen/scripts/arp_sql/arp_post.py

Eventually the scan was running and posting data automatically as expected. After a few days, I ran into my second major issue. There was simply put, way too much data for my crappy old “server” to handle. The webpage slowed to a crawl as the table contained something like 9000+ entries. It’s possible and likely that my query was also rubbish, but rather than stress more figuring it out, I modified all of the code again.

Instead of adding a new entry for every MAC address every scan, I changed it to check if there already was an entry, and simply update the last_seen time. I had originally designed the system with the idea of getting legacy data for attached devices, but decided I only really cared about a generic history.

The new webpage table now displays all devices, current and previously seen, with the last seen date.

A few issues came up on the output end as well, though none of them were super hard to correct. One, I wanted a way to sort the table by clicking the headers. There are several scripts you can toss in your code to do this online.

I also wanted more data about each device, so I added a form where I could fill in more data about each device. Specifically, the network name, if there was one, a description of what the device is, the User of the device (which family member or if it’s just a network device). This also checks and updates based on MAC address.

I also ran into an issue with MAC addresses and my Network extender. When devices are connected to the Network Extender, the first half of the MAC is replaced with the first part of the Extender’s MAC, though they retain the last half. I may eventually write some code to detect and merge these entries, but for now, I’ve simply been labeling them in the description as “(Extender)”, so I know it’s the same device on the other connection.

The final end result looks something like this:

I used to have the network super organized before I moved, but the new router doesn’t work nicely with my Pi DHCP server, so I have not gotten things quite as nicely sorted as I would like. Everything in the picture is sorted, but above .100, it’s a mess. I also can’t assign IPs to some devices at all, like the DirecTV gear or my Amazon Echos, which is really annoying.

One of my future projects will hopefully correct this, as I want to put a second router on the network with DD-WRT, between the ISP gateway and everything else.

Overall, it’s been a fun little exercise in coding that combined a lot different techniques together in a fun way.