Python

Advent of Code 2020 – Day 4

The task for Day 4 is to validate a series of passports by checking that they have valid data on them. For the first part, the passports are simply validated by verifying that they have all of the fields that they should have:

  • byr (Birth Year)
  • iyr (Issue Year)
  • eyr (Expiration Year)
  • hgt (Height)
  • hcl (Hair Color)
  • ecl (Eye Color)
  • pid (Passport ID)
  • cid (Country ID), which is optional and is not checked

For my approach, I first loaded the data from the file (this is a common theme in these), and then went about searching line by line. Here is my code for Part 1:

with open('day4data.txt') as f:
    lines = [line.rstrip() for line in f]

buildline = ''
valid_fields = ['byr','iyr','eyr','hgt','hcl','ecl','pid']
passengers = []
num_fields = len(valid_fields)
#print num_fields
valid_passports=0
num_valid=0

# add a trailing blank line so the last passport in the file is processed too
for line in lines + ['']:
  if line != '':
    buildline+=' '+line
  else:
    #print buildline
    linearray=buildline.split(' ')
    linearray=linearray[1:]
    for x in linearray:
      #print x[0:3]
      if(x[0:3] in valid_fields):
        num_valid+=1
    #print linearray
    if num_valid == num_fields:
      valid_passports+=1

    #passengers.append(linearray)
    num_valid = 0
    buildline = ''

#print passengers
print valid_passports

Each passport in the file spans multiple lines, and passports are separated by blank lines. So for each line, the code first checks whether it’s blank. If it’s not blank, the line is appended to a single passport string for processing. Once it hits a blank line, it knows it has captured all of the data for that particular passport, and processes the data.

The processing is done by splitting the data into an array, then processing through each entry in the array, looking for one of the required fields. It verifies that a passport is valid by checking to see if the number of fields counted is equal to the length of the fields array.

While the array building is sloppy, and probably could have been done by adding the lines directly to an array, instead of a string that’s then split, I am rather proud of a few bits of this code. Comparing the valid field count to the valid fields array means that it would be trivial to add additional valid fields.

The check itself is also rather nice. It would have been easy to do a basic search on each line, but that could confuse the program if, say, “Hair Color” contained the invalid data “byr”. The program would think it had found the Birth Year field when it had not. In a valid passport the field name is always the first three characters of an entry, so using “if(x[0:3] in valid_fields):” means it only ever checks those first three characters.
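For comparison, here is a minimal sketch of the same Part 1 idea written another way. It is not the code I ran: it assumes Python 3, splits the text on blank lines instead of building a string, and the names parse_passports and has_required_fields plus the two sample passports are made up for illustration.

```python
REQUIRED_FIELDS = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}

def parse_passports(text):
    """Split the raw text on blank lines and turn each block into a dict."""
    passports = []
    for block in text.strip().split('\n\n'):
        # split() with no argument breaks on spaces and newlines alike
        fields = dict(pair.split(':', 1) for pair in block.split())
        passports.append(fields)
    return passports

def has_required_fields(passport):
    # every required key must be present; extra keys such as cid are ignored
    return REQUIRED_FIELDS.issubset(passport)

# Made-up sample: the second passport is missing fields and has 'byr' as a value
sample = """byr:1980 iyr:2012 eyr:2030 hgt:74in
hcl:#623a2f ecl:grn pid:087499704

byr:1980 iyr:2012 hcl:byr ecl:grn"""

passports = parse_passports(sample)
valid_count = sum(has_required_fields(p) for p in passports)
print(valid_count)
```

Because the field names become dictionary keys, a value such as “byr” in the Hair Color field can never be mistaken for the Birth Year field, which is the same protection the x[0:3] check gives.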

Part 2 added another layer to the validation to make sure the individual fields actually contained properly formatted, valid data. The basic process is the same; it just involved a lot more if statements to check for the various conditions required. My code is below:

with open('day4data.txt') as f:
    lines = [line.rstrip() for line in f]

buildline = ''
valid_fields = ['byr','iyr','eyr','hgt','hcl','ecl','pid']
valid_eyes = ['amb','blu','brn','gry','grn','hzl','oth']
valid_color = ['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f']
passengers = []
num_fields = len(valid_fields)
#print num_fields
valid_passports=0
num_valid=0
is_valid=0

# add a trailing blank line so the last passport in the file is processed too
for line in lines + ['']:
  if line != '':
    buildline+=' '+line
  else:
    #print buildline
    linearray=buildline.split(' ')
    linearray=linearray[1:]
    for x in linearray:
      if(x[0:3] == "byr" and len(x[4:]) == 4):
        if 1920<=int(x[4:])<=2002:
          num_valid+=1

      if(x[0:3] == "iyr" and len(x[4:]) == 4):
        if 2010<=int(x[4:])<=2020:
          num_valid+=1

      if(x[0:3] == "eyr" and len(x[4:]) == 4):
        if 2020<=int(x[4:])<=2030:
          num_valid+=1

      if(x[0:3] == "ecl"):
        if x[4:] in valid_eyes:
          num_valid+=1

      # pid must be exactly nine characters, all of them digits
      if(x[0:3] == "pid") and (len(x[4:]) == 9) and x[4:].isdigit():
          num_valid+=1

      if(x[0:3] == "hgt" and x[-2:] == 'cm'):
          if 150<=int(x[4:-2]) <= 193:
            num_valid+=1
      if(x[0:3] == "hgt" and x[-2:] == 'in'):
          if 59<=int(x[4:-2]) <= 76:
            num_valid+=1

      if(x[0:3] == 'hcl' and len(x[4:]) == 7 and x[4] == '#'):
          # every character after the '#' must be a hex digit
          if all(n in valid_color for n in x[5:]):
            num_valid+=1

    if num_valid == num_fields:
      valid_passports+=1

    #passengers.append(linearray)
    is_valid = 0
    num_valid = 0
    buildline = ''

#print passengers
print valid_passports

Not a lot of fanciness here, but it uses the same basic principle of counting up the valid fields and comparing the count to the length of the fields array. The difference is that if you needed to add a required field, you would also need to add a conditional for that field.
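One way to avoid the growing pile of if statements is to keep one small check per field in a dictionary. This is only a sketch under some assumptions: it is Python 3, it expects each passport to already be a dict of field name to value, the helper names (in_range, valid_height, VALIDATORS, is_valid) are my own, and the hair color and passport ID patterns follow the puzzle rules as I understand them (a # plus six hex digits, and exactly nine digits).

```python
import re

def in_range(text, low, high):
    """True when text is made only of digits and its value is within low..high."""
    return re.fullmatch(r'[0-9]+', text) is not None and low <= int(text) <= high

def valid_height(value):
    # the number must be followed by a unit, and the allowed range depends on the unit
    match = re.fullmatch(r'([0-9]+)(cm|in)', value)
    if not match:
        return False
    number, unit = match.groups()
    if unit == 'cm':
        return in_range(number, 150, 193)
    return in_range(number, 59, 76)

# one check per required field; adding a field means adding one entry here
VALIDATORS = {
    'byr': lambda v: len(v) == 4 and in_range(v, 1920, 2002),
    'iyr': lambda v: len(v) == 4 and in_range(v, 2010, 2020),
    'eyr': lambda v: len(v) == 4 and in_range(v, 2020, 2030),
    'hgt': valid_height,
    'hcl': lambda v: re.fullmatch(r'#[0-9a-f]{6}', v) is not None,
    'ecl': lambda v: v in {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'},
    'pid': lambda v: re.fullmatch(r'[0-9]{9}', v) is not None,
}

def is_valid(passport):
    """Every required field must be present and pass its own check."""
    return all(key in passport and check(passport[key])
               for key, check in VALIDATORS.items())

# Made-up passport used only to exercise the checks
good = {'byr': '1980', 'iyr': '2012', 'eyr': '2030', 'hgt': '74in',
        'hcl': '#623a2f', 'ecl': 'grn', 'pid': '087499704'}
print(is_valid(good))
```

With this layout the "count and compare" step disappears: a passport is valid only when every entry in VALIDATORS passes, so adding a field and adding its rule are the same edit.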

Advent of Code 2020 – Day 3

Day 3 was the first that really presented some problems, and I admit, I did a “cheaty trick” to solve it. Part 1 and 2 use the same code as well, because Part 2 was a simple variation of Part 1.

For Day 3, you get a “map” of a forest with trees. You have to count how many trees will need to be avoided for a particular path through the forest. The path is a straight line along a particular slope (over right some, down some). Here is the sample forest:

..##.......
#...#...#..
.#....#..#.
..#.#...#.#
.#...##..#.
..#.##.....
.#.#.#....#
.#........#
#.##...#...
#...##....#
.#..#...#.#

So, the problem I ran into is that the hash (#) in Python creates comments. I am sure there is a way, but I could not figure out how to escape the hash to compare and count it in an ‘if’ statement. So what I did instead was replace all the #s in my map data with ‘T’s. I can look for ‘T’s all day.

With that, here is my solution:

def split_str(s):
  return [ch for ch in s]

with open('day3datab.txt') as f:
    # skip blank lines so a trailing newline cannot add an empty row
    lines = [line.rstrip() for line in f if line.strip()]

trees = 0
slopex = 1
slopey = 2
posx=0
posy=0
# width of the map; the pattern repeats to the right
line_loop=len(lines[0])

# keep stepping while the next row is still inside the map
while (posy+slopey<len(lines)):
  # move right, wrapping around when running off the right edge
  posx=(posx+slopex)%line_loop
  posy=posy+slopey
  #print str(posx)+","+str(posy)
  whatis=lines[posy][posx]
  # the map file has had every '#' replaced with 'T'
  if (whatis == "T"):
    trees+=1

print trees

The way the problem works is that the forest repeats to the right infinitely. So step one was getting the width of the repeating section, so I could loop my position back around as needed. After that it’s a simple matter of updating the x and y position, checking whether the x position has run past the width (and wrapping it back around if so), then checking for a T(ree).
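As it turns out, a # only starts a comment when it sits outside of quotes; inside a string it is an ordinary character, so the map can be used without swapping in ‘T’s. Here is a minimal sketch of the walk described above, assuming Python 3. The function name count_trees and its arguments are my own, and the forest is the sample map from earlier in this post.

```python
FOREST = [
    '..##.......',
    '#...#...#..',
    '.#....#..#.',
    '..#.#...#.#',
    '.#...##..#.',
    '..#.##.....',
    '.#.#.#....#',
    '.#........#',
    '#.##...#...',
    '#...##....#',
    '.#..#...#.#',
]

def count_trees(rows, right, down):
    """Count the '#' cells hit when moving `right` and `down` each step."""
    width = len(rows[0])
    trees = 0
    col = 0
    # start at the first row reached after one step, stop at the bottom
    for row in range(down, len(rows), down):
        # the modulo wraps the column because the map repeats to the right
        col = (col + right) % width
        if rows[row][col] == '#':
            trees += 1
    return trees

print(count_trees(FOREST, 3, 1))  # 7 on the sample forest
print(count_trees(FOREST, 1, 2))  # 2 on the sample forest
```

The modulo handles any slope in one step, where subtracting the width once only works while a single move is narrower than the map.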

Part 2 was the same problem, except you check several slopes and multiply the results together. To solve this I just adjusted my slope variables and then used a calculator to multiply the results.

I could have changed the code to ask for the slope each run, or even have it loop through several slopes until you tell it to end, when it would multiply the results itself, but frankly, I don’t have that kind of time. This is a fun side project to do while watching TV at night.
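For anyone who does have that kind of time, looping over the slopes is only a few lines. This is a sketch rather than what I ran: it assumes Python 3, the names SLOPES, count_trees and slope_product are made up, the slope list is the one from the Part 2 puzzle text as I remember it, and the forest is the sample map from above.

```python
FOREST = [
    '..##.......',
    '#...#...#..',
    '.#....#..#.',
    '..#.#...#.#',
    '.#...##..#.',
    '..#.##.....',
    '.#.#.#....#',
    '.#........#',
    '#.##...#...',
    '#...##....#',
    '.#..#...#.#',
]

# (right, down) pairs; assumed to match the slopes listed in the puzzle
SLOPES = [(1, 1), (3, 1), (5, 1), (7, 1), (1, 2)]

def count_trees(rows, right, down):
    width = len(rows[0])
    # row r is reached after r // down steps, each moving `right` columns
    return sum(
        rows[r][((r // down) * right) % width] == '#'
        for r in range(down, len(rows), down)
    )

def slope_product(rows, slopes):
    """Multiply together the tree counts for every slope."""
    result = 1
    for right, down in slopes:
        result *= count_trees(rows, right, down)
    return result

print(slope_product(FOREST, SLOPES))  # 336 on the sample forest
```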

Advent of Code 2020 – Day 2

Day two is a little more complicated than Day 1 was. Today’s challenge is to take a blob of passwords, and verify if they are acceptable or not, per the “company standard” at the time of the creation of each password.

For example:

1-3 a: abcde
1-3 b: cdefg
2-9 c: ccccccccc

For part 1, on line 1, the letter a should appear at least 1 time and at most 3 times. For line 2, the letter b should appear at least 1 time and at most 3 times. For line 3, the letter c should appear at least 2 times and at most 9 times.

My solution for Day 2, Part 1 is below:

number_valid = 0
toolow = 0
toohigh = 0
number_of_entries = 0

with open('day2data.txt') as f:
    lines = [line.rstrip() for line in f]

for x in lines:
  dashloc = x.find("-")
  spaceloc = x.find(" ")
  colloc = x.find(":")
  mincount=int(x[0:dashloc])
  maxcount=int(x[int(dashloc)+1:spaceloc])
  limitchar=x[int(spaceloc+1):int(colloc)]
  password = x[int(colloc+2):]
  checkvalue = int(password.count(limitchar))
  if(mincount <= checkvalue <= maxcount):
    number_valid=number_valid+1
  if(checkvalue<mincount):
    toolow+=1
  if(checkvalue>maxcount):
    toohigh+=1
  if(mincount>=maxcount):
    print "Problem?"

print number_valid
#print toolow
#print toohigh
#print number_valid+toolow+toohigh

As a note, I’ve still got some debugging bits in there that are commented out. Before reading the file in, I initialize some variables I’ll be using in the code. Then I start looping through the data.

This is where my janky code really gets to shine, as I am sure there is a better way to handle this. The first step is to locate the special characters separating the data. From the example, “1-3 a: abcde”, these are a dash, a space and a colon. With these located, I’ve extracted the min and max values, the letter I’m tracking, and the password itself.

After getting these core values, it’s simple enough to count the occurrences of the letter, then verify it against the min/max values and count how many are valid.
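A less janky way to pull the pieces apart is to let split() do the locating. This is a minimal sketch, assuming Python 3 and that every line has the exact “min-max letter: password” shape; the names parse_line and valid_by_count are made up, and the three lines are the example from above.

```python
def parse_line(line):
    """Turn '1-3 a: abcde' into (1, 3, 'a', 'abcde')."""
    # splitting on whitespace gives the range, the letter with its colon, and the password
    numbers, letter, password = line.split()
    low, high = numbers.split('-')
    return int(low), int(high), letter.rstrip(':'), password

def valid_by_count(line):
    low, high, letter, password = parse_line(line)
    # the letter must appear at least `low` and at most `high` times
    return low <= password.count(letter) <= high

sample = ['1-3 a: abcde', '1-3 b: cdefg', '2-9 c: ccccccccc']
print(sum(valid_by_count(entry) for entry in sample))  # 2 of the 3 are valid
```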

I had a bit of trouble at first because I had used “password = x[int(colloc+2):-1]” instead of “password = x[int(colloc+2):]”. I’m not a master of a lot of programming languages, but I am familiar with enough of them that I get the syntax confused a lot (I keep forgetting the colon on ifs and loops in Python). I forget which language uses -1 for “go to the end of a range”, but I am pretty sure there is one, because I did this a lot at first.
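The difference between the two slices is easy to see on the first example line. A small sketch, in Python 3 syntax, with variable names of my own:

```python
text = '1-3 a: abcde'
start = text.find(':') + 2   # skip the colon and the space after it

to_the_end = text[start:]      # leaving the end blank runs to the end of the string
one_short = text[start:-1]     # -1 as a slice end stops before the last character

print(to_the_end)  # abcde
print(one_short)   # abcd
```

So in Python, -1 as an index means the last character, but as the end of a slice it leaves that last character off, which is what was quietly trimming my passwords.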

Part 2 changes things up a bit. Instead of the first numbers representing a range of how many, it demands that the special letter appears at one of those two positions, but not both.

My solution for part 2 is below:

number_valid = 0
number_of_entries = 0

with open('day2data.txt') as f:
    lines = [line.rstrip() for line in f]

for x in lines:
  first=0
  second=0
  dashloc = x.find("-")
  spaceloc = x.find(" ")
  colloc = x.find(":")
  mincount=int(x[0:dashloc])-1
  maxcount=int(x[int(dashloc)+1:spaceloc])-1
  limitchar=x[int(spaceloc+1):int(colloc)]
  password = x[int(colloc+2):]
  if(password[mincount]==limitchar):
    first=1
  if(password[maxcount]==limitchar):
    second=1
  if ((first==1) or (second==1)):
    if(first !=second):
      number_valid+=1

print number_valid
#print toolow
#print toohigh
#print number_valid+toolow+toohigh

More sloppy code; I didn’t even change the variable names. Now, instead of counting and comparing, I’m checking for the special character in each position, setting a variable for each position where it’s found, and then making sure it’s not present in both. This is where the new variables “first” and “second” come in.
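The “one position or the other, but not both” test is an exclusive or, and comparing two booleans with != gives exactly that. A minimal sketch, assuming Python 3 and well-formed lines; valid_by_position is a made-up name and the sample is the example from earlier in this post.

```python
def valid_by_position(line):
    numbers, letter, password = line.split()
    # the puzzle counts positions from 1, Python counts from 0
    first, second = (int(n) - 1 for n in numbers.split('-'))
    letter = letter.rstrip(':')
    # != on two booleans is True only when exactly one of them is True
    return (password[first] == letter) != (password[second] == letter)

sample = ['1-3 a: abcde', '1-3 b: cdefg', '2-9 c: ccccccccc']
print(sum(valid_by_position(entry) for entry in sample))  # 1 of the 3 is valid
```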

The biggest challenge on Day 2 was that it required more manipulation of the input data.

Github Repository of my Solutions

Advent of Code 2020 – Day 1

So, I want to say up front, I don’t know if I will finish this, but I plan to try. Also, while I intend to publish these posts on the respective day of each challenge, I may not actually DO the challenge day of. That is to say, some of these, probably most of these, will be back dated.

Advent of Code is a little 25 day advent calendar of code based challenges. I heard it mentioned on TWIT by Leo Laporte. It can be done in any language or system. There are people who solve these using game engines and such. As this is my first go, and I am not a “professional programmer”, I am doing it in sloppy Python. I’ll be posting my solutions in a Github Repo, which means this challenge serves a second purpose of giving me an excuse to learn how to better use Github.

The first day’s challenge is pretty simple. Given a list of numbers, figure out which ones add up to 2020 and then multiply them together. Each day has two challenges, based on the same base data set. The data set for each person seems to be different. Here is my solution for Day 1 Part 1:

with open('day1data.txt') as f:
    lines = [line.rstrip() for line in f]

for x in lines:
  for y in lines:
    if((int(x)+int(y))==2020):
      print int(x)*int(y)

Each day of this puzzle (so far) involves reading in a data file and working with it. My solution loops through each number, adds it to every number in the list, and checks whether the pair sums to 2020, then prints the result of multiplying the two numbers. Because both loops cover the whole list, each matching pair is found twice (once in each order), so the same answer prints twice.

Part two is essentially the same except it involves three numbers instead of two. This pretty much just meant adding another layer of loop to my loops.

with open('day1data.txt') as f:
    lines = [line.rstrip() for line in f]

for x in lines:
  for y in lines:
    for z in lines:
      if((int(x)+int(y)+int(z))==2020):
        print int(x)*int(y)*int(z)
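Both parts can also be written with itertools.combinations, which hands back each group of numbers only once and never pairs a number with itself. This is a sketch and not what I submitted: it assumes Python 3.8 or newer (for math.prod), find_product is a made-up name, and the sample list is a small stand-in rather than my actual puzzle input.

```python
from itertools import combinations
from math import prod

def find_product(numbers, group_size, target=2020):
    """Return the product of the first group of `group_size` numbers that sums to target."""
    for group in combinations(numbers, group_size):
        if sum(group) == target:
            return prod(group)
    return None  # no group adds up to the target

# Stand-in data, not my real input
sample = [1721, 979, 366, 299, 675, 1456]

print(find_product(sample, 2))  # 1721 * 299 = 514579
print(find_product(sample, 3))  # 979 * 366 * 675 = 241861950
```

The group size is just a parameter, so Part 2 becomes a different argument instead of another layer of loop.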

This first day’s challenge was pretty simple. The rest of the days, not so much.

Tracking Covid-19 into a database using Python

At some point I need to do a little write up on my Home Dashboard Project. It has inspired quite a few minor projects, such as this one, to make little web widgets. The dashboard is the simple part; it’s just dumping a database query into a table. Honestly, the script was easy too, because I adapted it from another script I built recently.

With COVID-19 all over the news, I wanted to add some stats to my dashboard for my state. Not so much because there aren’t already 1000 other places to get the numbers, but more to see if I could do it. The hardest part was finding a feed of stats. Then I found CovidTracking.com, which has a nice little API. I then set to work adapting another script to pull from this API and dump stats for Illinois into the database. I am only interested in Illinois, but the script is built so the user can put a list of states into an array, and then it will loop through and add them all to the database.

The script is below, but it also requires some setup in SQL. Nothing complicated, mostly INT fields: an id as an INT and primary key; negative_cases, positive_cases, and deaths, all as INT; state as a VARCHAR with a length of 2 (though technically the length is optional); then finally date_stamp as a DATETIME field with a default value of the current timestamp. The DATETIME isn’t directly touched here, but it makes it easier to manipulate the data later.
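To make the table layout concrete, here is a sketch that builds the same columns and inserts one row. Loudly flagged assumptions: it is Python 3 and uses an in-memory SQLite database as a stand-in for MySQL so it can run anywhere, the numbers are made up, and on MySQL the id column would need something like AUTO_INCREMENT (my assumption, since the insert never supplies an id).

```python
import sqlite3

# Stand-in database: in-memory SQLite instead of the MySQL server used in this post
conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE il_covid_stats (
        id INTEGER PRIMARY KEY,          -- SQLite fills this in automatically
        positive_cases INT,
        negative_cases INT,
        deaths INT,
        state VARCHAR(2),
        date_stamp DATETIME DEFAULT CURRENT_TIMESTAMP
    )
""")

# Made-up numbers, only to show the row shape.
# SQLite uses ? placeholders where MySQLdb uses %s.
conn.execute(
    "INSERT INTO il_covid_stats (positive_cases, negative_cases, deaths, state) "
    "VALUES (?, ?, ?, ?)",
    (100, 900, 5, 'IL'),
)
conn.commit()

row = conn.execute(
    "SELECT id, positive_cases, negative_cases, deaths, state, date_stamp "
    "FROM il_covid_stats"
).fetchone()
print(row)
```

The id and date_stamp values are filled in by the database, which is why the insert only has to pass four values.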

The code also requires you to enter your database credentials. I’ve named my table “il_covid_stats”, but you can change that to whatever you want down below in the “SQL = "INSERT…." line. I’ll leave it up to you what to do with the data; I pull mine into a PHP page.

Anyway, here is the python code:

# Python Covid Star Tracking to SQL
# use of json package
# Sample URL: https://covidtracking.com/api/states?state=IL

import json
import requests
import time
import MySQLdb

mydb = MySQLdb.connect(
  host="localhost",
  user="YOUR_DB_USERNAME",
  passwd="YOUR_DB_PASSWORD",
  database="YOUR_DB_NAME"
)
mycursor = mydb.cursor()
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'

#States to check as an Array, two letter abbreviations
states = ['IL']

def data_getter(statename):
  ####when reading from remote URL
  url = 'https://covidtracking.com/api/states?state='+statename

  user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
  headers = {'User-Agent': user_agent}
  response = requests.get(url,headers=headers)
  html = response.content
  statedata = json.loads(html)

  pos_cases = (statedata['positive'])
  neg_cases = (statedata['negative'])
  deaths = (statedata['death'])

  vals = (pos_cases,neg_cases,deaths,statename)

  mysqlinsert(vals)

def mysqlinsert(vals):
  ## The table name and columns can be changed, but the table must already exist in your database
  SQL = "INSERT INTO il_covid_stats (positive_cases, negative_cases, deaths, state) VALUES (%s, %s, %s, %s)"
  mycursor.execute(SQL, vals)
  mydb.commit()

# Loop through URLs for each state
for i in states:
  data_getter(i)
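If the feed ever changes shape or drops a field, the lookups in data_getter will raise an error. Below is a hedged sketch of a more forgiving way to pull the three numbers out of the response. It assumes Python 3; extract_counts is a made-up helper, the JSON text is invented sample data, and the idea that the API might return either a single object or a list of objects is my assumption, not something I have checked against the API documentation.

```python
import json

def extract_counts(raw_json, statename):
    """Return (positive, negative, deaths, state) from a JSON response body."""
    data = json.loads(raw_json)
    # Assumption: the response is either one object or a list of per-state objects
    if isinstance(data, list):
        data = next((item for item in data if item.get('state') == statename), {})
    # .get() gives None for a missing key, which the database stores as NULL
    return (data.get('positive'), data.get('negative'), data.get('death'), statename)

# Invented sample responses, not real figures
single = '{"state": "IL", "positive": 100, "negative": 900, "death": 5}'
listed = '[{"state": "IN", "positive": 1}, {"state": "IL", "positive": 100, "negative": 900}]'

print(extract_counts(single, 'IL'))
print(extract_counts(listed, 'IL'))
```

The tuple that comes back is in the same order as the vals tuple in the script, so it could be passed straight to mysqlinsert.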