Code Project: Network Map Webpage, Making it Better

I wrote a bit about my Network Map Webpage recently. It’s part of a larger home dashboard project I’m working on, but as part of that I’ve updated things a bit to make them more streamlined and easier to use. The biggest problem with the page as it was originally coded is that it shows everything. I’ve cycled most of my regularly used electronics onto the network so they could be captured by an arp scan, though not all of them are on all the time. For example, I still have a Raspbery Pi and Arduino set up to capture temperature data. I also have several Next Thing CHIP devices, though Next Thing has gone out of business. In total, between my IOT stuff and laptops, phones and tablets and the duplicate IPs from the network extender, I have 55 devices in the raw table.

So I set out to make this more manageable at a glance. My original query in my PHP code looked something like this:

SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner ORDER BY ip

By slipping in “WHERE last_seen >= NOW() – INTERVAL 5 MINUTE” just before ORDER BY, I can make the code return only currently connected devices. The ARP scan runs every 5 minutes, anything that has a last seen time stamp within 5 minutes is assumed to still be attached. This interval could be shorted to almost real time, but I don’t really need that much of a check.

I can also view all disconnected devices with a simple change of the above command, making it “WHERE last_seen <= NOW() – INTERVAL 5 MINUTE”. This wouldn’t work if I were still keeping historical data, but I essentially only capture the last seen data for any device. Essentially what this does is return everything not seen in the last 5 minutes.

I also broke out my PHP code that builds my table from my query into it’s own PHP function. This was I could set the variable $SQL for the active devices, call the function to build the table, then set $SQL for inactive devices and build a second table, under the first.

I immediately scrapped this, because it was ugly. Plus, sometimes I do want to see “everything”.

Enter some GET calls and an if/else statement.

	if($_GET['show'] == "active") {
	// SQL for selecting active devices
	$tabletitle="Active Devices";
	$sql = "SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner WHERE last_seen >= NOW() - INTERVAL 5 MINUTE ORDER BY ip";
	}
	elseif($_GET['show'] == "inactive") {
	// SQL for selecting active devices
	$tabletitle="Inactive Devices";
	$sql = "SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner WHERE last_seen <= NOW() - INTERVAL 5 MINUTE ORDER BY ip";
	}
	else {
	// SQL for Selecting all devices
	$tabletitle="All Devices";
	$sql = "SELECT ip, arpscans.mac, arpscans_known_macs.device_name, arpscans_known_macs.device_description, last_seen, device_owners.user_name FROM arpscans LEFT JOIN arpscans_known_macs on arpscans_known_macs.mac = arpscans.mac LEFT JOIN device_owners on device_owners.id = arpscans_known_macs.device_owner ORDER BY ip";
	}

Basically, if nothing, or a random string is passed by the URL variable “show”, then it goes to the end, and displays everything when accessing the page at index.php. If it passes index.php?show=active, it sets $SQL for showing active devices and if it gets index.php?show=inactive, it shows inactive devices. It also sets a variable called $tabletitle which is just echoed out into some header tags. I then added links across the top of the page to each of these filters.

This allows for a quick and easy toggle of which data is pulled and displayed.

Additionally, I updated the way the Add Device form works. Previously, the form would fill in the MAC, a Device Name and a Device Description, then it would POST to another PHP page that would insert the data into the table, then forward on back to the index page with a header redirect. I’m not going to get into too much detail on it here, but I also integrated the Network Map into my dashboard framework with a header, navigation, sidebar, and footer. It also uses a table based navigation system, so in order to view the network map, I am hitting “index.php?page=4”. Pages basically all need to be wrapped in this structure to work properly, so in order to make things flow better, the Add Device form now POSTs back to the Network Map page itself, which checks to see if the POST variables are set, and if they are, it inserts the new information, before pulling the table.

This also meant slightly altering my page calls to look for “index.php?page=4&show=active” and “index.php?page=4&show=inactive”.

Eventually I want to move the Add Device form to appear at the top of the page, so the whole thing is all handled in one single page.

Lastly, I made up a quick block of code in it’s own page, that simple counts and displays the number of currently connected devices on the network. This block is embedded on the front page of my Dashboard Framework and links to the full Network Map page. The general idea on the Dashboard is to have widgets like this that show quick glance information, with links to detailed information.

I have not built a lot of them yet, but one of the others I have built works somewhat similar to the ARP scanning system. A script makes a call to my TT-RSS instance for each of the segmented accounts I have, then dumps the unread count into a table on the server. The widget shows how many unread articles each topic/account has. I am still really bad about only actually reading the Basic feed (mostly Toys and Video Games).

But I will get into the Dashboard Widgets thing a bit more in a future post probably.

Code Project: Network Map Webpage

I want to start off by saying, there isn’t going to be a ton of code here, and if there is code, it’s going to be super dirty. I’m fairly good at making code for “private use” that is pretty insecure, and not so great at code that’s scrubbed up and user friendly to distribute to others.

I’ve been working a bit on some local code projects, specifically for my little private “Dashboard” that runs on my file server. One project I’ve wanted to try for a while is a dynamic network tracker tool. I’ve looked into some options available, and they all seem to run as a plug in for some complicated 3rd party analytics software that often has some goofy complicated set up procedure that’s beyond “apt-get” or even just dumping a bunch of files in a web server directory.

This project is both kind of simple and not. It was fairly simple in set up and execution, but it’s somewhat complex in design. The first job was getting a list of currently connected devices on the network. This is easily done via the command line with an arp-scan request.

sudo arp-scan --localnet

The output of which looks something like this:

Using a pipe, I can shove all of this into a text file, which contains everything above.

sudo arp-scan --localnet | scan.txt

The trick is, how to display this output on a webpage. One way would be to pull it from a database. Pulling data from MySQL is pretty easy, dumping it to a pretty looking table is also easy. The harder part is getting the output of arp-scan to MySQL in a useful manner.

This is where Python comes into play. I am sure there are other methods or languages available, but like Python, and mostly know how to use Python. Essentially, I wrote a script that would open the file, scan.txt, that was created above. I am only concerned with lines that contain IP addresses, so I used the function “is_number()” to check if the first character of each line is numeric, if it is, it runs through a couple of operations.

Firstly, the output of arp-scan is tab delimited, so I can use the “split” function on “\t”, and dump the result into an array. This gives me an array of the IP address, MAC address, and Manufacturer. This sticks a new line in with the Manufacturer, so I did a “replace” on \n in the third item of the list. Lastly, I wanted the IPs to be uniformly formatted, so I write a little function that would add in leading zeros to the IP octets.

Finally, the Python builds an SQL statement from the line’s list, and make a call to the server to insert the values. A modified version of this code that just displays the resulting SQL commands instead of executing them is below.

#!/usr/bin/python

# Open a file

def is_number(s):
        try:
                float(s)
                return True                                                         except ValueError:
                return False

def format_ip(ipstring):
        octets = ipstring.split(".")
        n=0
        for i in octets:
                while(len(i)<3):
                        i=add_zero(i)
                octets[n]=i
                n=n+1
        return octets[0]+"."+octets[1]+"."+octets[2]+"."+octets[3]
        #return ipstring

def add_zero(shortstring):                                                          return "0"+shortstring


import MySQLdb

mydb = MySQLdb.connect(
  host="localhost",
  user="YOURSQLUSERNAME",
  passwd="YOURSQLPASSWORD",
  database="YOURTARGETDATABASE"
)

mycursor = mydb.cursor()

fo = open("scan.txt", "r")
#print ("Name of the file: ", fo.name)

fo.seek(0)

# read each line of the list
for line in fo:
        #check for lines that contain IP addresses
        if is_number(line[0]):                                                              #Convert lines into list
                line_list = line.split("\t")
                #remove line delimitors
                line_list[2]=line_list[2].replace("\n","")
                #Make IP Octets 3 digits
                line_list[0] = format_ip(line_list[0])
                SQL = "INSERT INTO arpscans (ip, mac, mfg) VALUES ("+line_l$                print SQL                                                   
fo.close()

It’s not super pretty, but it was a quick way to make sure everything came out looking correct. The table I used is called “arpscans” and contains columns called, “ip”, “mac”, “mfg”, and “last_seen”. The time stamp is an automatically generated time stamp.

I then created a shell script that would run the arp-scan piped into scan.txt then runt he python script. I set up this script in the root crontab to run once every half hour. Root is required to run the arp-scan command, so my user crontab wouldn’t cut it. Everything ran fine when I manually did a run of the script using sudo. The PHP on the other end out output the latest values based on the time stamp to a webpage.

This is where I ran into my first major hurdle. The script wasn’t running in cron. After a lot of digging and futzing, I found that basically, when cron runs the script, it works off of different environmental variables. I had to specify in ,y bash file, specifically where each command existed. The end result looks something like this:

#!/usr/bin/env bash
/usr/sbin/arp-scan --localnet > /home/ramen/scripts/arp_sql/scan.txt
/usr/bin/python /home/ramen/scripts/arp_sql/arp_post.py

Eventually the scan was running and posting data automatically as expected. After a few days, I ran into my second major issue. There was simply put, way too much data for my crappy old “server” to handle. The webpage slowed to a crawl as the table contained something like 9000+ entries. It’s possible and likely that my query was also rubbish, but rather than stress more figuring it out, I modified all of the code again.

Instead of adding a new entry for every MAC address every scan, I changed it to check if there already was an entry, and simply update the last_seen time. I had originally designed the system with the idea of getting legacy data for attached devices, but decided I only really cared about a generic history.

The new webpage table now displays all devices, current and previously seen, with the last seen date.

A few issues came up on the output end as well, though none of them were super hard to correct. One, I wanted a way to sort the table by clicking the headers. There are several scripts you can toss in your code to do this online.

I also wanted more data about each device, so I added a form where I could fill in more data about each device. Specifically, the network name, if there was one, a description of what the device is, the User of the device (which family member or if it’s just a network device). This also checks and updates based on MAC address.

I also ran into an issue with MAC addresses and my Network extender. When devices are connected to the Network Extender, the first half of the MAC is replaced with the first part of the Extender’s MAC, though they retain the last half. I may eventually write some code to detect and merge these entries, but for now, I’ve simply been labeling them in the description as “(Extender)”, so I know it’s the same device on the other connection.

The final end result looks something like this:

I used to have the network super organized before I moved, but the new router doesn’t work nicely with my Pi DHCP server, so I have not gotten things quite as nicely sorted as I would like. Everything in the picture is sorted, but above .100, it’s a mess. I also can’t assign IPs to some devices at all, like the DirecTV gear or my Amazon Echos, which is really annoying.

One of my future projects will hopefully correct this, as I want to put a second router on the network with DD-WRT, between the ISP gateway and everything else.

Overall, it’s been a fun little exercise in coding that combined a lot different techniques together in a fun way.

Next Thing CHiP as a Twitter Bot

twitter-logoThere was a post that came across on Medium recently, How to Make a Twitter Bot in Under an Hour.  It’s pretty straight forward, though it seems to be pretty geared towards non “techie” types, mostly because it’s geared towards people making the bot on a Mac and it uses something called Heroku to run the bot.  Heroku seems alright, except that this sort of feels like an abuse of their free tier, and it’s not free for any real projects.

I already have a bunch of IOT stuff floating around that’s ideal for running periodic services.  I also have a VPS is I really wanted something dedicated.  So I adapted the article for use in a standard Linux environment.  I used one of my CHiPs but this should work on a Raspberry Pi, an Ubuntu box, a VPS, or pretty much anything running Linux.

The first part of the article is needed, set up a new Twitter account, or use one you already have if you have extras.  Go to apps.twitter.com, create an app and keys, keep it handy.

Install git and python and python’s twitter extension.

sudo apt-get install git

sudo apt-get install python-twitter

This should set up everything we’ll need later.  Once it’s done, close the repository.

git clone https://github.com/tommeagher/heroku_ebooks.git

This should download the repository and it’s files.  Next it’s time to set up the configuration files.

cd heroku_ebooks

cp local_settings_example.py local_settings.py

pico local_settings.py

This should open up an editor with the settings file open.  It’s pretty straight forwards, you’ll need to copy and paste the keys from Twitter into the file, there are 4 of them total, make sure you don’t leave any extra spaces inside the single quotes.  You’ll also need to add one or more accounts for the bot to model itself after.  You’ll also need to change DEBUG = TRUE to DEBUG = FALSE as well as adding your bot’s username to the TWEET_ACCOUNT=” entry at the bottom.

Once that is all done do a Control+O to write out the file and Control+X to exit.  Now it’s time to test out the bot with the following…

python ebooks.py

It may pause for a second while it does it’s magic.  If you get the message ” No, sorry, not this time.” it means the bot decided not to tweet, just run the command again until it tweets, since we’re testing it at the moment.  If it worked, it should print a tweet to the command line and the tweet should show up in the bot’s timeline.  If you get some errors, you may need to do some searching and troubleshooting, and double check the settings file.

Next we need to automate the Twitter Bot Tweets.  This is done using Linux’s built in cron.  But first we need to make our script executable.

 chmod 755 ebooks.py

Next, enter the following….

sudo crontab -e

Then select the default option, which should be nano.  This will open the cron scheduler file.  You’ll want to schedule the bot to run according to whatever schedule you want.  Follow the columns above as a guide.  For example:

# m h  dom mon dow   command

*/15 * * * * python /home/chip/heroku_ebooks/ebooks.py

m = minutes = */15 = every 15 minutes of an hour (0, 15, 30, 45)

h = hour = * (every hour)

dom = day of month = * = every day and so on.  The command to run, in this case, is “python /home/chip/heroku_ebooks/ebooks.py”.  If you’re running this on a Raspberry Pi, or your own server, you will need to change “chip” to be the username who’s directory has the files.  Or, if you want to put the files elsewhere, it just needs to b e the path to the files.  For example, on a Raspberry Pi, it would be “python /home/pi/heroku_ebooks/ebooks.py”.

If everything works out, the bot should tweet on schedule as long as the CHIP is powered on and connected.  Remember, by default the bot only tweets 1/8th of the time when the script is run (this can be adjusted in the settings file), so you may not see it tweet immediately.

This is also a pretty low overhead operation, you could conceivably run several Twitter Bots on one small IOT device, with a staggered schedule even.  Simply copy the heruko_ebooks directory to a new directory, change the keys and account names and set up a new cron job pointing to the new directory.

Learning Python with Udacity

udacity_cs101

Just a note, this is not any sort of advertisement…

So I know some basic programming syntax, generally centered around C and C++ which I learned in college.  The C was through several Engineering based classes and the C++ was from a single Computer Science course I took when I had a semester to fill before transferring schools and didn’t want to completely lapse on the studying, schooling lifestyle.  I also know how to code HTML but that is barely programming by any stretch. 

I have tried various self taught methods to teach myself more C++ and some Java with little success.  I have some books to make Android apps but I have yet to get anywhere with them.  Then, I believe through the Windows Weekly podcast, I found out about this deal called Udacity. The first course offering is to learn how to code a basic Search Engine using Python.  I’ve found it pretty well designed though a handful of the examples were a little too abstract to be meaningful (I’m looking at the one about cost and RAM and memory and compute cycles which I still don’t understand).

Anyway, I’m done three out of the seven modules and I’m rather proud of the fact that I’ve actually managed to stick with it and learn some things.  I’ve got a little script now that I could use to extract links from any webpage or even a number of webpages, though right now all I know how to do is display them.  Presumably we’ll learn how to compile them into some sort of file or database.  My biggest hurdle really is I keep wanting to use C and C++ syntax.  Things like adding ; at the end of lines or 1++ or variable++.

It’s not a terrible problem really.