Software auto-updates: revisiting the trade-offs

Auto-updating software sounds like a great idea on paper: code gets better and repairs its own defects magically, without the user having to lift a finger. In reality it has a decidedly mixed track record. In case you missed the latest example, a buggy Windows 10 update caused machines to get stuck in a reboot loop. This is far from the first time that Redmond has shipped faulty software updates. But it raises the stakes for automatic-update features, since MSFT has drawn a line in the sand with mandatory updates in Windows 10.

That makes it a good time to review the arguments for and against auto-updates:

  • Customers get a better product sooner, with fewer defects and enhanced functionality. Often users lack incentives to go out of their way to apply updates. Significant improvements could be hidden under the hood, without the benefit of shiny objects to lure users. Mitigations for security vulnerabilities are the prime example of invisible yet crucial improvements. In the absence of an automatic mechanism for applying updates, few people would go out of their way to install something with a non-descriptive name along the lines of “KB123456 Fix for CVE-2015-1234.” (But this may be changing, now that mainstream media routinely covers actively exploited vulnerabilities in Adobe Reader, Flash and IE. It’s as if journalists were enlisted into a coordinated public awareness campaign for security updates.)
  • Makes life easier for the vendor, with long-term benefits for customers. Having all users on the latest version of the product greatly reduces development costs, compared to actively supporting multiple versions for customers who have decided not to update. All new feature development happens against the latest version. Security fixes only have to be developed once, not multiple times for slightly different code-bases, each with its own quirks. Quality assurance is also helped by having only one version to check against, reducing the probability of buggy updates.
  • In some cases the positive externalities extend beyond the software publisher. Keeping all users on the latest and greatest version of a platform can boost the entire ecosystem. When there are few versions of an application in use, other people building on top of that application also have an easier time. Remember the never-ending saga of Internet Explorer on XP? For years versions before IE9 were the bane of web developers: no support for modern HTML5, idiosyncratic security problems such as content-sniffing, and random departures from web standards implemented faithfully by every other browser. One site went so far as to institute a surcharge for users on IE7, to compensate for the extra work required to support them. But IE versions did not start out that way: when released in 2001, IE6 was arguably a perfectly satisfactory piece of code. With no leverage to migrate those users to a newer version, MSFT created a massive legacy problem not only for itself but for anyone trying to design a modern website who had to contend with the quirks and limitations of 10+ year old technology.

Downsides to auto-updating break down into several categories:

  • Collateral damage. This is probably the most common complaint about updates gone wrong. There seems to be a paucity of evidence on what percentage of Windows updates need to be recalled due to bugs (and MSFT may be understandably reluctant to release that figure), but public instances of updates gone awry are ubiquitous.
  • Downtime. Often updates require restarting the application, if not rebooting the machine altogether. This represents downtime and some loss of productivity, although the impact varies greatly and can be managed with judicious scheduling. Individual machines are rarely used 24/7 and updates scheduled at off-hours can be transparent. On the other hand rebooting the lone server supporting a global organization incurs a heavy cost; there may be no good time for that. (It also points to a design flaw in the IT infrastructure with one machine constituting a single-point-of-failure without redundancy.)
  • Revenue model. If all updates are given away for free, as auto-updating typically requires, a significant monetization opportunity is lost. This is a problem for the vendor rather than customers, specific to business models relying on selling discrete software bundles, as opposed to a subscription service along the lines of Creative Cloud. But economics matter. This inconvenient fact lies at the heart of the Android security-update debacle: with no upside from delivering updates to a phone that has already been paid for, neither OEMs nor carriers have the slightest interest in shipping security fixes. Usually there is some line drawn between incremental improvements and significant changes that merit an independent purchase. For example, MSFT always shipped service packs free of charge while requiring new licenses for OS releases, until Windows 10 broke that pattern by offering a free upgrade for existing Windows 7/8 users.
  • Vendor-controlled backdoor. Imagine you have an application running on your machine that calls out to a server in the cloud controlled by a third party, receives arbitrary instructions and starts doing exactly what was prescribed in those instructions. One might rightly call that a backdoor or remote-access Trojan (RAT). Auto-update capabilities are effectively no different, only legitimized by virtue of that third party being some “reputable” software publisher. But security dependencies are not transformed away by magical promises of good behavior: if the vendor experiences an intrusion into their systems, the auto-update channel can become a vector for targeting anyone running that application. It’s tempting to say that this dependency (or leap of faith, depending on your perspective) already exists when installing an application written by the vendor. But there is an important difference between a single act of trusting the integrity of one application as a point-in-time decision, versus ongoing faith that the vendor will be vigilant 24/7 about safeguarding their update channel.
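
The gap between a point-in-time trust decision and ongoing trust in the update channel can be illustrated with a small check. What follows is a minimal sketch, not how any particular vendor's updater works: it assumes a hypothetical client that refuses to apply a downloaded package unless its SHA-256 digest matches a value the user obtained through a separate channel. The function names and the idea of a "pinned" digest are assumptions made for illustration. Real update systems generally rely on asymmetric signatures rather than pinned hashes; a digest comparison stands in here because it needs only the Python standard library.

```python
import hashlib
import hmac


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as lowercase hex."""
    return hashlib.sha256(payload).hexdigest()


def is_trusted_update(payload: bytes, pinned_digest: str) -> bool:
    """Accept the update only if it matches the digest pinned in advance.

    Hypothetical helper: `pinned_digest` is assumed to come from a channel
    other than the update server itself, so that a compromised server
    cannot swap both the package and the digest at once.
    """
    # compare_digest gives a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(payload), pinned_digest.lower())


if __name__ == "__main__":
    package = b"example update payload"
    pinned = sha256_hex(package)  # stands in for a digest published out of band
    print(is_trusted_update(package, pinned))
    print(is_trusted_update(b"payload altered in transit", pinned))
```

Note what the check does not cover: it only moves the trust question to whoever publishes the digest. If that party is the same vendor whose systems were breached, the ongoing dependency described above remains.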

So what is a reasonable stance? There are at least two situations where disabling auto-updates makes sense. (This assumes the vendor actually provides controls for doing that. Many companies, including Google in the early days, were a little too enthusiastic about forcing software on users.)

  1. Managed IT environments, or the so-called “enterprise” scenario. These organizations have in-house IT departments capable of performing additional quality assurance on the update before rolling it out to thousands of users. More importantly, that QA can use a realistic configuration that mirrors their own deployment, as opposed to generic or stand-alone testing. For example, an update may function just fine on its own but have a bad interaction with antivirus from vendor X or VPN client Y. Such combinations cannot be exhaustively checked by the original publisher.
  2. Data-centers and server environments. Regardless of the amount of redundancy present in a system, having servers update on their own without admin oversight is a recipe for unexpected downtime.

In these situations the risks of auto-updating outweigh the benefits. By contrast that calculus gets inverted in the case of consumers, with the possible exception of power users. Most home users have neither the means nor the inclination to verify updates in an isolated environment, short of actually applying the update to one of their machines. The good news is that most end-user devices are not mission-critical either, in the sense that downtime does not inconvenience a large number of other people relying on the same machine. In these situations little is gained by delaying updates. It may buy a little extra insurance against the possibility that the vendor discovers some new defect (based on the experience of other early adopters) and recalls the update. But for critical security updates, that insurance comes at the cost of living with a known security exposure, just as the release of a patch starts the clock for reverse-engineering the vulnerability to develop an exploit.
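
The stance argued for here can be condensed into a small decision function. This is only a sketch of the reasoning in this post, not a recommendation for any specific product: the `Environment` and `Action` names are hypothetical, and a real deployment would weigh more factors than a single category.

```python
from enum import Enum


class Environment(Enum):
    CONSUMER = "consumer"      # home users and other non-critical devices
    MANAGED_IT = "managed_it"  # enterprise with an in-house IT department
    SERVER = "server"          # data-center and server machines


class Action(Enum):
    APPLY_NOW = "apply_now"            # let auto-update do its job
    HOLD_FOR_QA = "hold_for_qa"        # test against the real deployment first
    HOLD_FOR_ADMIN = "hold_for_admin"  # wait for an admin-chosen window


def update_action(env: Environment) -> Action:
    """Map an environment to the update handling argued for in the text."""
    if env is Environment.MANAGED_IT:
        return Action.HOLD_FOR_QA
    if env is Environment.SERVER:
        return Action.HOLD_FOR_ADMIN
    # Consumers gain little by waiting and stay exposed in the meantime.
    return Action.APPLY_NOW


if __name__ == "__main__":
    for env in Environment:
        print(env.value, "->", update_action(env).value)
```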