Sprints and marathons: why developer interviews miss the mark


Qualifiers

The New York City marathon is one of the preeminent long-distance races in the world. Not surprisingly, it comes with stringent qualification criteria, based on having completed either another full or half marathon at an eligible event. Imagine an alternate universe instead: runners qualify for the marathon based on their 100-meter sprint times. No doubt some of the usual entrants could still meet this unusual bar. But there would also be many false positives: people who have no problem sprinting blazingly fast over short distances but run out of fuel and drop out of the race after a couple of miles. There would also be many false negatives: remarkable endurance athletes who never make it to the start line because the qualifiers screened for a criterion highly uncorrelated with what they are expected to perform.

That absurd hypothetical is not far from the state of the art in interviewing software engineers at leading technology companies. This blogger is certainly not the first or the most eloquent with a scathing criticism of the leading paradigm for conducting developer interviews. (In fact, he was a happy beneficiary of that broken model early on, one of its false positives.) But faith in measuring candidates based on their performance on contrived problems remains unshakable in many corners of the industry, from garden-variety startups in the Valley to hedge funds in NYC.

Introducing “The Problem”

With minor variations, the setup is the same everywhere. The candidate walks into a room. An interviewer is sitting there, occasionally flanked by a shadow, a silent colleague who is there to observe the proceedings before conducting similar interrogations in the future. They engage in nervous chit-chat and pleasantries for a few minutes, before the interviewer walks over to a white-board and presents the candidate with “The Problem.” Invariably The Problem has two aspects, which novice interviewers often conflate when providing feedback about the candidate:

  • Algorithmic: This is the abstract logic piece, focused on puzzle-solving. It skews heavily towards the theory side, calling for insights into the underlying structure of a problem (“this is just an instance of network flows in disguise”) and benefits from familiarity with algorithms.
  • Coding: Translating the abstract solution into code expressed in a programming language. Oftentimes the choice of that language is dictated by the company (“We are strictly a Rails shop and require that you are already familiar with Ruby”). In more enlightened circumstances, it may be negotiated between interviewer and candidate on the reasonable assumption that a good developer can easily become proficient at a new language or framework on the job.

Sometimes code is written on a white-board, without much attention paid to formatting or the occasional bogus syntax. Other times candidates are sat down in front of an actual computer, provided some semblance of a work environment and asked to produce working code. (There are even websites such as CoderPad designed for remote coding interviews, with fancy syntax highlighting in several popular languages.) This is supposed to be an improvement both for the accuracy of the interview and “user-friendliness” of the process for the candidate. On the one hand, with a real compiler scrutinizing every semicolon and variable reference, the candidate must produce syntactically valid code. No sketching vague notions in pseudo-code. At the same time an IDE makes it much easier to write code the way it is normally done: jumping back and forth, moving lines around, deleting them and starting over, as opposed to strictly top-to-bottom on a white-board. (Of course it is difficult to replicate the actual work space an engineer is accustomed to. It involves everything down to the available screen real estate: some will swear by multiple monitors, others prefer a single giant monitor rotated into portrait mode. Experienced developers can be surprisingly finicky about that, having over-adapted to a very specific setup. Try asking an Emacs user to work in Vim.)

Logistics

Putting aside the logistics of catering to developer preferences, what is the fundamental problem with this approach?

First, the supply of good problems is limited. An ideal instance of “The Problem” provides more than a binary success/failure outcome. It permits a wide range of solutions, from the blindingly obvious and inefficient to the staggeringly elegant/difficult/subtle. A problem with a single “correct” solution requiring a flash of insight is a poor signal, similar to a pass/fail grade. Far more useful are those with complex trade-offs (faster but using more memory, slower but permitting parallelization, etc.), where candidates can make progress incrementally and continue to refine their answers throughout.

Second, there is an arms race between those coming up with questions and websites trying to prep candidates by publishing those questions. At Google we used to have an internal website where employees could submit these problems and others could vote on them. Some of the best ones inevitably got stamped with a big red “banned” after they had been leaked on interview websites such as Glassdoor. That even gives rise to the occasional problem of outright dishonest performances: the candidate who professes never to have encountered your question in the past, struggles through the interview and yet amazingly proceeds to parrot out the exact solution from the website at the last minute.

Sprinting vs running marathons

Focusing on the difficulty of choosing an “optimal interview problem” is still missing the forest for the trees. A fundamental problem is that the entire setup is highly contrived and artificial, with no relationship to the day-to-day grind of software development in commercial settings.

To put it bluntly: most commercial software development is not about advancing the state of the art in algorithms or bringing new insights into complex problems in theoretical computer science, the type of “puzzle solving” celebrated at nano-scale by interview problems. That is not to say interesting problems never come up. But to the extent such innovations happen, they are motivated by work on real-world applications and their solutions are the result of deliberate, principled work in that domain. Few engineers accidentally stumble into a deep theory question while fixing random bugs and then proceed to innovate their way out (or failing that, give up on the challenge) in 45 minutes.

Much larger chunks of developer time are spent implementing solutions to problems that are well-understood and fully “solved” from a theoretical perspective. Solved in quotes, because a robust implementation makes all the difference between a viable, successful product and one that goes nowhere. This is where the craft of engineering shines. That craft calls for a much broader set of skills than abstract puzzle-solving. It means working within pragmatic constraints, choosing to build on top of existing frameworks, however buggy or quirky they may be. (That is the antithesis of writing a solution from scratch out of thin air. So-called greenfield projects where one gets the luxury of a clean slate are the exception rather than the norm in development.) It means debugging: investigating why a piece of code, already written by someone else with an entirely different approach, is not performing as expected. It means plenty of bug fixes and small tweaks. It means making judgment calls on when to rewrite faulty components from scratch and when to continue making scoped, tactical fixes until there is more room in the schedule for an overhaul. Last but not least, it means working as part of a team, with complex dependencies and ever-shifting boundaries of responsibility.

Commercial software engineering is an endurance event over a highly uneven and unpredictable terrain full of obstacles, unfolding on the scale of weeks and months. (It is no accident that one of the earliest and better-known books about software project management was titled “Death March.”) Many fashionable paradigms have come and gone (extreme programming, test-driven development, scrum with its emphasis on sprints), but none has altered the fundamental dynamics. Programming interviews, by contrast, are sprints on a perfectly flat indoor track, no matter how hard they strive to recreate a superficial similarity to actual software engineering.

The mismeasure of a developer

Why does the industry continue to evaluate engineers in this artificial way? One explanation is that the process is a relic of the 1990s, pioneered by companies such as Microsoft before there was a better way to evaluate the output of a developer.

In the days of shrink-wrap software, it was difficult to evaluate individual contributions in isolation from the larger project that person worked on. Imagine a candidate with resume credits on very large-scale applications such as Microsoft Word or IBM OS/2. (Putting aside the question of how one feels about the relative quality of either of those products.) Thousands of people contributed to such a large code base. Did the engineer play a pivotal role, or did they simply tweak a few lines of code at the periphery and fix a few inconsequential bugs? The opaqueness of proprietary development ensures this will be difficult to ascertain. Meanwhile the Dunning-Kruger effect makes even the most sincere candidate self-evaluation suspect. It is not only successes that are difficult to highlight; failures are also easily swept under the rug. If you were the engineer in charge of enthusiastically integrating Clippy into Microsoft Office, or your code was riddled with critical security vulnerabilities that were later patched by other people, those details may be suspiciously absent from your resume.

Proprietary development has not gone away, but the good news is there is a growing amount of open-source software and it is easier to find than ever. It is no longer about downloading tarballs of source code from an obscure FTP site. With a GitHub handle listed on a resume, it is possible to browse and search through an individual developer’s contributions in a very systematic fashion. Those public repositories may still be the proverbial tip of the iceberg relative to the developer’s total output. For many people, open-source contributions remain a small fraction of their work compared to private projects undertaken as part of a commercial product. But they still paint a far more accurate picture of that person’s talent as a developer: someone engaged in the craft of writing code, as opposed to solving logic puzzles. That is a picture that no 45-minute whiteboard interview can do justice to.

CP
