Goodbye Netflix
Wow. I just checked, and I've had Netflix since 08/10/2001. Over thirteen years. Longer than my marriage. Two houses ago. I'm down to the cheapest one-at-a-time plan, and I still only get around to it every three or four months.
I think it's time to say goodbye.
But here's how they get you to stay:
Based on your 1698 ratings, this is the list of movies and TV shows you've seen.
Yeah… thirteen and a half years of data that I don't want to lose! And that's my main account - I have two other profiles too. I searched the 'net for a solution, and came up with a lot. None worked. GreaseMonkey ones. PHP ones. None worked.
This was the closest: https://gist.github.com/tloredo/8483682
But I don't have a Mac, so I needed to manually capture that info. Ninety pages of ratings. So I used DownThemAll!. I opened the download manager manually, and for the URL I used http://dvd.netflix.com/MoviesYouveSeen?pageNum=[1:90] (I had determined the 90 manually, with some trial and error). This saved all the pages to files named MoviesYouveSeen.htm and then MoviesYouveSeen_NNN.htm.
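The [1:90] part is a DownThemAll! batch range; for any other downloader the same list of page URLs can be generated by hand. A minimal sketch, assuming the 90 pages found above (the name page_urls is made up for illustration and is not part of the original script):

```python
BASE_URL = "http://dvd.netflix.com/MoviesYouveSeen"


def page_urls(last_page, base_url=BASE_URL):
    """Return one URL per ratings page, numbered 1..last_page inclusive."""
    return ["%s?pageNum=%d" % (base_url, n) for n in range(1, last_page + 1)]


if __name__ == "__main__":
    # Print the list so it can be pasted into a download manager.
    for url in page_urls(90):
        print(url)
```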
I modified the script to read these HTML files instead of launching Safari. After that, the ratings were off: every movie in a file would get the rating of the first one in that file. So I tweaked that. For some reason, some entries don't show a rating in the HTML, even though they were supposedly rated. Some are "No Interest," but for others I just don't know what happened. So I have it output 0.0 if it couldn't figure it out - a 99% solution.
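The "0.0 when in doubt" rule is easy to try in isolation. This is only a sketch built around the rating regex from my changes below; it assumes the cell text contains a phrase like "You rated this movie: 4.0", and it uses search() where the script uses match(), so leading whitespace doesn't matter. The function name extract_rating is hypothetical.

```python
import re

# Same pattern as in the modified script.
RATING_REGEX = re.compile(r"You rated this movie: (\d)\.(\d)")


def extract_rating(cell_text):
    """Return the rating as a string such as '4.0', or '0.0' if none is found."""
    found = RATING_REGEX.search(cell_text)
    if found:
        return "%s.%s" % (found.group(1), found.group(2))
    # "No Interest" and unexplained blanks both end up here.
    return "0.0"
```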
Here are my changes from the gist (17 Jan 2014) version (depending on screen width, you might have to scroll way down):
    --- .py (old)
    +++ .py (new)
    +#!/bin/env python
    +# Original @ https://gist.github.com/tloredo/8483682
     """
     Scrape a user's Netflix movie ratings by automating a Safari browsing
     session (with the user already logged in). The ratings are written
    …
     
     from jinja2 import Template
     from lxml import html
    +import re
    +
    +fname_regex = re.compile(r'(\w+?)_?(\d+)?\.(\w+)')
    +rating_regex = re.compile(r'You rated this movie: (\d)\.(\d)')
     
     
     # AppleScript functions asrun and asquote (presently unused) are from:
    …
         All values are strings.
         """
         # Load the page, grab the HTML, and parse it to a tree.
    -    script = ASTemplate.render(URL=url, DTIME=dtime)
    -    reply = asrun(script)
    +    reply = ''
    +    try:
    +        with open(url) as infile:
    +            for str_ in infile:
    +                reply += str_
    +    except IOError:
    +        return [], None
    +
    +
         tree = html.fromstring(reply)
         rows = tree.xpath('//table[@class="listHeader"]//tr')
     
    …
             # changing from page to page. For info on XPath for such cases, see:
             # http://stackoverflow.com/questions/8808921/selecting-a-css-class-with-xpath
             # rating = data[3].xpath('//span[@class="stbrMaskFg sbmfrt sbmf-50"]')[0].text_content()
    -        rating = data[3].xpath('//span[contains(concat(" ", normalize-space(@class), " "), " stbrMaskFg ")]')[0].text_content()
    -        rating = rating.split(':')[1].strip() # keep only the number
    +        rating_cut = rating_regex.match(data[3].text_content())
    +        rating = '0.0'
    +        if rating_cut:
    +            rating = "%s.%s"%(rating_cut.group(1), rating_cut.group(2))
    +
             info.append((title, year, genre, rating))
     
         # Next URL to load:
    -    next_elem = tree.xpath('//li[@class="navItem paginationLink paginationLink-next"]/a')
    -    if next_elem:
    -        next_url = next_elem[0].get('href')
    -    else: # empty list
    -        next_url = None
    +    fname_cut = fname_regex.match(url)
    +    if fname_cut:
    +        if None == fname_cut.group(2):
    +            num = 0
    +        else:
    +            num = fname_cut.group(2)
    +        next_url = "%s_%03.f.%s"%(fname_cut.group(1),int(num)+1,fname_cut.group(3))
    +    else:
    +        print "Regex failed."
    +        next_url = None
    +
     
         return info, next_url
     
     
     # Use this initial URL for DVD accounts:
    -url = 'http://dvd.netflix.com/MoviesYouveSeen'
    +url = 'MoviesYouveSeen.htm'
     # Use this initial URL for streaming accounts:
     # url = 'http://movies.netflix.com/MoviesYouveSeen'
This renders a lot of the script useless, but there's no benefit in making the diff larger, so I didn't trim anything else.
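The file-name stepping in the diff (MoviesYouveSeen.htm, then MoviesYouveSeen_001.htm, and so on) can be pulled out and tried on its own. A sketch only: it reuses the fname_regex pattern from the diff, but the function name next_filename is hypothetical, and it formats with %03d where the script uses %03.f (same result for whole numbers).

```python
import re

# Same pattern as fname_regex in the diff: stem, optional _NNN, extension.
FNAME_REGEX = re.compile(r"(\w+?)_?(\d+)?\.(\w+)")


def next_filename(fname):
    """Return the name of the page file that follows fname, or None."""
    found = FNAME_REGEX.match(fname)
    if not found:
        return None
    stem, num, ext = found.groups()
    # The first file has no number, so it counts as page 0.
    page = int(num) + 1 if num is not None else 1
    return "%s_%03d.%s" % (stem, page, ext)
```

The scraper then stops on the first name that does not exist on disk, which is what the IOError branch in the diff is for.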
Here's when I ran it across my "TV Queue" account - yeah they're not all TV, sometimes I accidentally rated things with the wrong profile:
    $ ./ScrapeNetflixRatings.py
    Scraping MoviesYouveSeen.htm
    1: Garmin Streetpilot 2610/2650 GPS (2003) [Special Interest] - 1.0
    2: Six Feet Under (2001) [Television] - 0.0
    Scraping MoviesYouveSeen_001.htm
    3: The Thief of Bagdad (1924) [Classics] - 4.0
    4: The Tick (2001) [Television] - 4.0
    5: Michael Palin: Pole to Pole (1992) [Documentary] - 0.0
    6: Kung Fu: Season 3 (1974) [Television] - 0.0
    7: Danger Mouse (1981) [Children & Family] - 3.0
    8: Farscape (1999) [Television] - 3.0
    9: Helvetica (2007) [Documentary] - 3.0
    10: Hogan's Heroes (1965) [Television] - 3.0
    11: The Lion in Winter (2003) [Drama] - 3.0
    12: Monty Python: John Cleese's Best (2005) [Television] - 3.0
    13: Sarah Silverman: Jesus Is Magic (2005) [Comedy] - 3.0
    14: Stephen King's It (1990) [Horror] - 3.0
    15: Superman II (1980) [Action & Adventure] - 3.0
    16: Superman: The Movie (1978) [Classics] - 3.0
    17: Tom Brown's Schooldays (1951) [Drama] - 3.0
    18: An Evening with Kevin Smith 2 (2006) [Comedy] - 0.0
    19: Crimewave (1986) [Comedy] - 2.0
    20: Huff (2004) [Television] - 2.0
    21: Aqua Teen Hunger Force (2000) [Television] - 1.0
    22: The Boondocks (2005) [Television] - 1.0
    Scraping MoviesYouveSeen_002.htm
    23: Ricky Gervais: Out of England (2008) [Comedy] - 5.0
    24: Robot Chicken (2005) [Television] - 5.0
    25: Robot Chicken Star Wars (2007) [Comedy] - 5.0
    26: Rome (2005) [Television] - 5.0
    27: Scrubs (2001) [Television] - 5.0
    28: Stewie Griffin: The Untold Story (2005) [Television] - 5.0
    29: Spaced: The Complete Series (1999) [Television] - 0.0
    30: Alice (2009) [Sci-Fi & Fantasy] - 0.0
    31: Best of the Chris Rock Show: Vol. 1 (1999) [Television] - 4.0
    32: The Critic: The Complete Series (1994) [Television] - 4.0
    33: Dilbert (1999) [Television] - 4.0
    34: An Evening with Kevin Smith (2002) [Comedy] - 4.0
    35: John Adams (2008) [Drama] - 4.0
    36: King of the Hill (1997) [Television] - 4.0
    37: The Lone Gunmen: The Complete Series (2001) [Television] - 4.0
    38: Neverwhere (1996) [Sci-Fi & Fantasy] - 4.0
    39: Robin Hood (2006) [Television] - 4.0
    40: The Sand Pebbles (1966) [Classics] - 4.0
    41: The Sarah Silverman Program (2007) [Television] - 4.0
    42: The Silence of the Lambs (1991) [Thrillers] - 4.0
    Scraping MoviesYouveSeen_003.htm
    43: Alias (2001) [Television] - 5.0
    44: Alien (1979) [Sci-Fi & Fantasy] - 5.0
    45: Band of Brothers (2001) [Drama] - 5.0
    46: Bleak House (2005) [Drama] - 5.0
    47: Brisco County, Jr.: Complete Series (1993) [Television] - 5.0
    48: Code Monkeys (2007) [Television] - 5.0
    49: Coupling (2000) [Television] - 5.0
    50: Dead Like Me (2003) [Television] - 5.0
    51: Deadwood (2004) [Television] - 5.0
    52: Family Guy (1999) [Television] - 5.0
    53: Family Guy: Blue Harvest (2007) [Television] - 5.0
    54: Firefly (2002) [Television] - 5.0
    55: Futurama (1999) [Television] - 5.0
    56: Futurama the Movie: Bender's Big Score (2007) [Television] - 5.0
    57: The Great Escape (1963) [Classics] - 5.0
    58: Greg the Bunny (2002) [Television] - 5.0
    59: How I Met Your Mother (2005) [Television] - 5.0
    60: MI-5 (2002) [Television] - 5.0
    61: My Name Is Earl (2005) [Television] - 5.0
    62: Police Squad!: The Complete Series (1982) [Television] - 5.0
    Scraping MoviesYouveSeen_004.htm
Thanks a ton to the original author, and the full version is attached here for posterity.
IP Address in Python (Windows)
From StackOverflow, my changes:
- Py3 compat (no big deal)
- Added DHCP support
- Use CurrentControlSet (saner IMHO)
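The script itself isn't reproduced here, so what follows is not that code, just a rough sketch of the idea behind the list: read each interface key under CurrentControlSet and prefer the DHCP-assigned address when DHCP is enabled. The registry value names (EnableDHCP, DhcpIPAddress, IPAddress) are the standard Tcpip ones as far as I know; the function names are hypothetical.

```python
def pick_addresses(values):
    """Choose the IPv4 addresses from one interface's registry values.

    `values` maps registry value names to their data for a single interface.
    """
    if values.get("EnableDHCP"):
        addr = values.get("DhcpIPAddress")
        return [addr] if addr and addr != "0.0.0.0" else []
    # Static addresses are a multi-string value (a list of str in Python).
    return [a for a in values.get("IPAddress", []) if a and a != "0.0.0.0"]


def read_interfaces():
    """Return {interface GUID: [addresses]} from the registry (Windows only)."""
    import winreg  # Python 3 name; the module is called _winreg on Python 2

    path = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"
    found = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as root:
        for i in range(winreg.QueryInfoKey(root)[0]):  # number of subkeys
            guid = winreg.EnumKey(root, i)
            with winreg.OpenKey(root, guid) as key:
                values = {}
                for j in range(winreg.QueryInfoKey(key)[1]):  # number of values
                    name, data, _type = winreg.EnumValue(key, j)
                    values[name] = data
            found[guid] = pick_addresses(values)
    return found
```

Keeping the decision in pick_addresses means it can be checked with plain dictionaries on any OS; only read_interfaces needs Windows.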
Python deepcopy broken
Well, that was annoying… spent a long time last Friday and today to find out that Python 2.7's copy.deepcopy doesn't play well with xml.dom.minidom. See this bug report.
The workaround is to use doc.cloneNode(True) instead.
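A small sketch of the workaround, assuming a throwaway XML string (the element and attribute names are invented for the example). It clones the whole document and then edits only the clone:

```python
from xml.dom import minidom

doc = minidom.parseString("<config><item name='a'/></config>")

# Deep clone: True asks for all descendants to be copied as well.
clone = doc.cloneNode(True)

# Edit the clone only; the original document should stay as it was.
clone.documentElement.setAttribute("edited", "yes")

print(doc.documentElement.toxml())
print(clone.documentElement.toxml())
```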
Email your new IP address with TomatoUSB
So my router is now TomatoUSB and I wanted an alert when the IP changed. Sure, I could probably put something local on the router, but where's the fun in that?
So I put together a quick Python script to drop me an email if the IP ever changes. Yes, TomatoUSB supports various Dynamic DNS services, but it doesn't seem to natively support "email me."
So on the DDNS setup page, I chose the "Custom URL" service, and I put in "http://192.168.90.99/IPCHECKS?new_ip=@IP" as the URL (the internal address of an Apache server running WSGI).
I have a custom config file /etc/httpd/conf.d/wsgi_IP as follows:

    WSGIScriptAlias /IPCHECKS /var/www/wsgi/IP.wsgi
    <Directory "/var/www/wsgi/">
        WSGIApplicationGroup %{GLOBAL}
        Order deny,allow
        Deny from all
        Allow from 192 127 ::1
    </Directory>
HOPEFULLY that means none of you can change what I think my IP address is.
Here's the actual Python script (/var/www/wsgi/IP.wsgi):
    from __future__ import print_function
    from cgi import parse_qs, escape
    import socket
    import smtplib

    # This is RevRagnarok's ugly IP checker.
    # Tomato (firmware) will post to us with a "new_ip" parameter
    # At this point, I want to see manually that the IPs change, not have it autoupdate
    # Note: I had to enable HTTP sending email in SELinux:
    # setsebool -P httpd_can_sendmail 1

    def application(environ, start_response):
        parameters = parse_qs(environ.get('QUERY_STRING', ''))
        if 'new_ip' in parameters:
            newip = escape(parameters['new_ip'][0])
        else:
            newip = 'Unknown!'
        start_response('200 OK', [('Content-Type', 'text/html')])
        # Look up DNS values
        oldip = socket.gethostbyname('revragnarok.com') # Yes, IPv4 only
        # Compare
        changed = ''
        if newip != oldip:
            changed = 'IP changed from {0} to {1}.'.format(oldip, newip)
        if changed:
            e_from = '[email protected]'
            e_to = ['[email protected]']
            # A blank line separates the Subject header from the body
            e_msg = """Subject: IP Address change detected

    {0}""".format(changed)
            # I considered a try/catch block here, but then what would I do?
            smtpObj = smtplib.SMTP('localhost')
            smtpObj.sendmail(e_from, e_to, e_msg)
        else:
            changed = 'IP is {0} (unchanged).'.format(newip)
        return [changed]
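The script above is Python 2 (cgi.parse_qs is gone from Python 3, and WSGI there wants bytes back). For anyone on Python 3, here is a hedged sketch of the same flow, not a drop-in replacement: the addresses are example.com placeholders, and the resolve and alert parameters are additions of this sketch so the compare logic can be exercised without DNS or a mail server.

```python
import smtplib
import socket
from html import escape
from urllib.parse import parse_qs

HOSTNAME = "revragnarok.com"


def describe(old_ip, new_ip):
    """Return (changed, message) for the DNS address and the reported one."""
    if new_ip != old_ip:
        return True, "IP changed from {0} to {1}.".format(old_ip, new_ip)
    return False, "IP is {0} (unchanged).".format(new_ip)


def send_alert(text):
    """Mail the message through the local SMTP server."""
    e_from = "sender@example.com"    # placeholder, not a real address
    e_to = ["recipient@example.com"]  # placeholder, not a real address
    msg = "Subject: IP Address change detected\n\n{0}".format(text)
    server = smtplib.SMTP("localhost")
    try:
        server.sendmail(e_from, e_to, msg)
    finally:
        server.quit()


def application(environ, start_response,
                resolve=socket.gethostbyname, alert=send_alert):
    """WSGI entry point; the web server passes only the first two arguments."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    new_ip = escape(params["new_ip"][0]) if "new_ip" in params else "Unknown!"
    changed, message = describe(resolve(HOSTNAME), new_ip)
    if changed:
        alert(message)
    start_response("200 OK", [("Content-Type", "text/html")])
    # WSGI on Python 3 requires the response body as bytes.
    return [message.encode("utf-8")]
```

Calling application() by hand with a fake environ and stub functions for resolve and alert is a quick way to see both branches before pointing the router at it.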
And don't forget, if you use SELinux, fix permissions on the script, and allow the webserver to send email:
    [root@webserver wsgi]# ls -Z IP.wsgi
    -rw-r--r--. root root system_u:object_r:httpd_sys_script_exec_t:s0 IP.wsgi
    [root@webserver wsgi]# setsebool -P httpd_can_sendmail 1