Musings, ramblings, rantings about technology, games, puzzles, and whatever else catches my attention.
Saturday, May 2, 2015
SIFF 2015 Scheduling (Part 1?)
It's time for the 2015 Seattle International Film Festival. Last year, I saw somewhere around 30 movies. This year, I think I might be able to do around 50. I've put some hurdles in my own way, though: I've decided to go into work on most days of the festival, and I probably won't see the midnight movies the way I might have when I was younger.
I spent some time going through SIFF's festival website and got somewhat impatient and frustrated with what it provided me, so I pulled down the data for the movies I was (somewhat?) interested in.
SIFF's website lets you add movies to "MySIFF", which is then presented as a single page of scrapable HTML. I used the Python "Beautiful Soup" library to find the appropriate section of the HTML file, then carved out the individual fields I wanted to track. I bundled those fields back up and wrote everything out to a CSV file, which Google Sheets (or "Google Docs Spreadsheets", as I call it) was happy to import.
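A minimal sketch of that kind of scrape, not the actual script. The MySIFF markup isn't reproduced in this post, so the `my-films` id, the `screening`, `title`, `venue`, and `start` class names, and the sample rows below are all made up for illustration; only the Beautiful Soup and `csv` calls are real.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for the saved MySIFF page (structure and data are assumptions).
SAMPLE_HTML = """
<html><body>
<div id="my-films">
  <div class="screening">
    <span class="title">808</span>
    <span class="venue">Venue A</span>
    <span class="start">2015-05-16 19:00</span>
  </div>
  <div class="screening">
    <span class="title">Example Film</span>
    <span class="venue">Venue B</span>
    <span class="start">2015-05-19 19:00</span>
  </div>
</div>
</body></html>
"""

FIELDS = ("title", "venue", "start")


def extract_screenings(html):
    """Find the one interesting section of the page, then pull fields out of it."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", id="my-films")
    if container is None:
        return []
    rows = []
    for node in container.find_all("div", class_="screening"):
        row = {}
        for field in FIELDS:
            cell = node.find("span", class_=field)
            # Missing cells become empty strings rather than raising.
            row[field] = cell.get_text(strip=True) if cell else ""
        rows.append(row)
    return rows


def to_csv(rows):
    """Bundle the rows back up as CSV text that a spreadsheet can import."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


screenings = extract_screenings(SAMPLE_HTML)
csv_text = to_csv(screenings)
print(csv_text)
```

Writing to a `StringIO` keeps the sketch self-contained; a real run would open a file with `newline=""` and hand that to `csv.DictWriter` instead.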
I did a tiny bit of special-case formatting in my script so that the movie "808" renders as a string instead of an integer. I didn't hard-code that one movie; I recognized that there might be other titles that get interpreted as integers, so the formatting applies to any title like that.
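One way this could look, as a sketch only: the post doesn't say how the script forces a title to stay text. The `force_text` helper is hypothetical, and wrapping the value as `="808"` is a common spreadsheet trick that I believe Google Sheets reads as a formula returning text, but that is worth checking against a real import.

```python
import re

# Matches titles made up entirely of digits, with an optional sign.
INTEGER_RE = re.compile(r"[+-]?\d+")


def force_text(value):
    """Return the value unchanged unless a spreadsheet would read it as an integer.

    Integer-looking values are wrapped as ="..." (an assumed trick, see above).
    """
    stripped = value.strip()
    if INTEGER_RE.fullmatch(stripped):
        return '="{}"'.format(stripped)
    return value


for title in ["808", "Example Film", "2001: A Space Odyssey"]:
    print(repr(force_text(title)))
```

The check is generic, so no individual title has to be named in the script.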
Next Up(?) - constraint satisfaction to help me work out which screening of each movie I should see. For example: movie A conflicts with movie B now, and movie A is only showing now, while movie B has a screening with no conflicts next Tuesday; therefore, movie A is the better option now.
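A rough sketch of how that reasoning could be automated, assuming a simple backtracking search; nothing here is written yet and the approach may change. The `schedule` function, the movie names, and the screening times are all hypothetical.

```python
from datetime import datetime


def overlaps(a, b):
    """True if two (start, end) slots share any time."""
    return a[0] < b[1] and b[0] < a[1]


def schedule(screenings, chosen=None):
    """Pick one non-overlapping slot per movie, or return None if impossible.

    screenings maps a title to a list of (start, end) datetime pairs.
    """
    chosen = chosen or {}
    remaining = [title for title in screenings if title not in chosen]
    if not remaining:
        return chosen
    # Handle the movie with the fewest listed screenings first.
    title = min(remaining, key=lambda t: len(screenings[t]))
    for slot in screenings[title]:
        if all(not overlaps(slot, other) for other in chosen.values()):
            result = schedule(screenings, {**chosen, title: slot})
            if result is not None:
                return result
    return None  # No slot for this movie fits; the caller tries another option.


# Made-up times: A shows once; B clashes with A that night but repeats on Tuesday.
SATURDAY_A = (datetime(2015, 5, 16, 19, 0), datetime(2015, 5, 16, 21, 0))
SATURDAY_B = (datetime(2015, 5, 16, 19, 30), datetime(2015, 5, 16, 21, 30))
TUESDAY_B = (datetime(2015, 5, 19, 19, 0), datetime(2015, 5, 19, 21, 0))

SCREENINGS = {
    "Movie B": [SATURDAY_B, TUESDAY_B],
    "Movie A": [SATURDAY_A],
}

plan = schedule(SCREENINGS)
for movie, (start, end) in sorted(plan.items()):
    print(movie, start.strftime("%a %H:%M"))
```

Placing the movie with the fewest screenings first is what makes movie A claim Saturday night, which pushes movie B to its Tuesday screening.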
Labels: Beautiful Soup, Python, scraping, SIFF