Tabs, spaces and the straitjacket of coding conventions

(An attempt to account for StackOverflow survey results)

In June StackOverflow published one of the more surprising results from their developer survey: “Developers Who Use Spaces Make More Money Than Those Who Use Tabs.” This is a counterintuitive result, considering that the question of spaces versus tabs has always been considered a matter of personal preference, with no impact on the actual quality of code written. So much so that the debate made it into an episode of “Silicon Valley,” a show predicated on repackaging the follies of our technology sector for popular consumption. Yet here is a survey from a highly regarded website with a statistically significant number of respondents, suggesting that this superficial stylistic difference affects career prospects. It would be akin to learning that traders who wore suspenders generated 10% more profits than their colleagues wearing belts. What is going on here? After all, switching from tabs to spaces is trivial and can be automated with a text editor. Should engineers around the world embark on a global search-and-replace across every file to give themselves a raise?

To their credit, the StackOverflow authors realize this claim sounds absurd on the surface and try to find alternative explanations that could account for the observed difference in terms of other variables. Going back to our parallel drawn from finance: if there were a convention that people trading commodities wear belts while those speculating in equities prefer suspenders, that would account for the difference in observed returns. In effect the belt/suspenders question is not a causal factor; it is just a proxy for an underlying “hidden” variable which actually influences the outcome. If we could perform a controlled experiment where commodities traders were all given a wardrobe make-over and switched to suspenders, they would sadly not turn into rainmakers for their employer.

Yet in the case of tabs vs spaces, the observed difference in salary persists after controlling for obvious variables such as choice of programming language and specific area of development such as web/mobile/embedded. Across every category, the difference is in the same direction: those using spaces earn more. More importantly, the StackOverflow data looks at median salary rather than the average. Compared to an average, the median is much less susceptible to being skewed by a handful of “space” fanatics in a particular niche (such as ICO development) with outsized numbers.
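To make the point about medians concrete, here is a minimal sketch in Python. The salary figures are invented purely for illustration (they are not survey data), and the variable names are hypothetical. It shows two extreme earners dragging the average far upward while barely moving the median.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars; NOT taken from the survey.
typical = [90, 95, 100, 105, 110]

# Add two extreme earners from some lucrative niche (also made up).
with_outliers = typical + [2000, 3000]

mean_before, median_before = mean(typical), median(typical)
mean_after, median_after = mean(with_outliers), median(with_outliers)

# The mean jumps by several hundred, the median only moves one step up.
print(f"mean:   {mean_before} -> {mean_after:.1f}")
print(f"median: {median_before} -> {median_after}")
```

A gap between medians therefore suggests the typical respondent in each camp differs, not just a few outliers.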

Here is an alternative explanation: for a large number of developers, the choice of tabs vs spaces is not a reflection of personal preference but dictated by the coding conventions of their project. The observed tabs/spaces difference would then follow naturally from intrinsic differences in compensation between organizations that adopt strict formatting guidelines (specifically, mandating spaces) and those that are relatively liberal about formatting.

Coding style guides are common in large organizations to maintain consistent standards across their code base. These standards can cover everything from very high-level, substantial topics such as which language features are permitted to trivial ones such as the placement of braces and line-breaks. For example, here is the Google C++ guide, which opines on everything from use of namespaces to the evils of virtual inheritance and operator overloading, both verboten at Google. (In fact by disallowing most advanced C++ features, this style of coding effectively turns C++ development at Google into a glorified C-with-classes circa 1990s.) The goal of consistency is to make it as easy as possible for engineer Alice to dive into a new code base and review/improve code written by engineer Bob. If Bob were allowed to use arcane language features or esoteric design patterns in his work, it would be a lot more difficult for Alice to come up to speed and contribute. A more cynical interpretation is that conventions make engineers interchangeable, helping the organization at the expense of individual employees and overall productivity.

Indentation falls into that second category of “cosmetic” choices that have no impact on the semantics of code: how far each nested block of code is offset from the enclosing block, and whether that layout effect is achieved by using a single tab character or some fixed number of spaces. (Granted there are languages such as Python where indentation does change the meaning of the code. But even in that case, it remains immaterial whether that information is conveyed using spaces or tabs.)
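The Python remark can be checked directly. The sketch below uses a made-up function `f`: it parses the same function twice, once indented with four spaces and once with tabs, and compares the resulting syntax trees. It also shows the mechanical conversion using `str.expandtabs`, with a tab width of 4 as an assumption.

```python
import ast

# The same toy function, indented two different ways.
spaces_src = "def f(x):\n    if x:\n        return 1\n    return 0\n"
tabs_src = "def f(x):\n\tif x:\n\t\treturn 1\n\treturn 0\n"

# ast.dump omits line/column attributes by default, so this compares
# only the structure of the parsed programs.
same_tree = ast.dump(ast.parse(spaces_src)) == ast.dump(ast.parse(tabs_src))

# Mechanical conversion: expand each tab to the next multiple of 4 columns.
converted = tabs_src.expandtabs(4)

print(same_tree)                # both versions parse to the same tree
print(converted == spaces_src)  # the conversion reproduces the spaces version
```

Note that `expandtabs` is a blunt instrument: it would also rewrite tab characters inside string literals, so a real conversion is better left to an editor or formatter that understands the language.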

Now if Alice is employed by a large company that mandates spaces while Bob is working for a scrappy start-up with a laissez-faire approach to formatting, they may end up using different indentation styles even if Alice started out with no preference either way. The difference in compensation could simply reflect the underlying difference in the type of organizations that adopt coding conventions, assuming such conventions are more likely to favor spaces over tabs, as in the case of C++/Java/Python at Google or C# at MSFT. In that case the higher earnings for the spaces camp could be an artifact of large, established employers relying on high cash compensation while small start-ups lean heavily on equity such as stock options to attract candidates.
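A toy model shows how this hypothesis would produce the survey pattern. Everything in the sketch below is invented for illustration: the head counts, the salaries, and the `median_salary` helper are assumptions, not survey data. Salary depends only on employer type, yet the pooled medians still show a “spaces premium” because large employers are mostly in the spaces camp.

```python
from statistics import median

# (employer_type, indentation, cash salary in $k). All values are made up:
# salary is determined solely by employer type, never by indentation.
records = (
    [("large", "spaces", 120)] * 8
    + [("large", "tabs", 120)] * 2
    + [("startup", "spaces", 90)] * 3
    + [("startup", "tabs", 90)] * 7
)

def median_salary(rows, indent, employer=None):
    """Median salary for one indentation camp, optionally within one employer type."""
    return median(
        salary
        for emp, ind, salary in rows
        if ind == indent and (employer is None or emp == employer)
    )

# Pooled across employers, spaces users appear to earn more...
pooled_gap = median_salary(records, "spaces") - median_salary(records, "tabs")

# ...but within each employer type the gap vanishes.
gap_large = median_salary(records, "spaces", "large") - median_salary(records, "tabs", "large")
gap_startup = median_salary(records, "spaces", "startup") - median_salary(records, "tabs", "startup")

print(pooled_gap, gap_large, gap_startup)
```

If the survey data could be split by employer type in this way and the gap disappeared within each group, that would support the coding-convention explanation. If the gap persisted, it would not.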

There is an ambiguity here in the phrasing of the survey question: whether it is asking for preference or status quo. The question “spaces or tabs” can be interpreted two ways:

  • Are you currently using spaces or tabs?
  • If you had your druthers, would you prefer to use spaces or tabs?

The first question is influenced by choice of employer, and the apparent correlation to salary could then be explained away as an artifact of a handful of large, successful tech companies having settled on coding conventions with spaces. The second is strictly a matter of individual preference; it would indeed be a very surprising result if it turned out that developers who had an innate preference for using spaces, absent any organizational mandate, somehow proved to be better compensated. (Of course it is also possible that company-mandated coding conventions are over time embraced and internalized by the individuals subject to those rules. Coders who start out without any preference may later come to decide that there is “one true way” of indentation, namely the one they have been using all along.)

