“I promise with my hand on a Bible that your data is not being archived and sold, […] We don’t know what any particular person is watching,” he said. “We only know what a random, anonymous sampling of our user base is watching.”
So says the CEO of TiVo, according to a recent article in the San Francisco Chronicle. The data in question is whether subscribers are skipping commercials. This is a classic case of having to place blind faith in hardware, or at least in the marketing proclamations of the vendor. The TiVo device sitting in the consumer’s living room certainly has visibility into what is being watched and how often the commercial skip feature is used to avoid going postal over that lame beer commercial again. What is not clear is whether this information is shipped off the box to headquarters for data mining purposes and, if it is, to what extent it is sanitized to strip identifying information about the original user.
The problem is that only TiVo engineers can know for sure, and even they may not have it right. One person’s “anonymized data-set” is another’s treasure trove of personal data waiting to be correlated against just the right database to reveal the identity behind each record. For everyone else TiVo is a black box. The only sources of information are:
- Vendor claims, to the extent they are complete and accurate
- Third-party claims, such as those from privacy advocates, assuming they have better sources of information
- Information gathered by reverse engineering the device. This is costly and returns on investment can be low. Often vendors intentionally obfuscate their protocols in order to protect their intellectual property. (Conspiracy theorists would argue obfuscation only serves to hide a nefarious purpose.)
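To make the earlier point about "anonymized" data concrete, here is a minimal sketch of how correlation against a second database can reveal identities. Everything in it is an assumption made up for illustration: the records, the field names (`zip`, `birth_year`, `skipped_ads`) and the `reidentify` helper are hypothetical and say nothing about what TiVo actually collects.

```python
# Hypothetical "anonymized" viewing log: no names, but it still carries
# quasi-identifiers (zip code and birth year). All data here is invented.
viewing_log = [
    {"zip": "94110", "birth_year": 1975, "skipped_ads": 42},
    {"zip": "94110", "birth_year": 1982, "skipped_ads": 3},
    {"zip": "98052", "birth_year": 1975, "skipped_ads": 17},
]

# Hypothetical second database that does contain names.
public_directory = [
    {"name": "Alice", "zip": "94110", "birth_year": 1975},
    {"name": "Bob", "zip": "98052", "birth_year": 1975},
    {"name": "Carol", "zip": "98052", "birth_year": 1975},
]


def reidentify(anonymous, directory, keys=("zip", "birth_year")):
    """Map the index of each anonymous record to a name, but only when
    exactly one person in the directory shares its quasi-identifiers."""
    index = {}
    for person in directory:
        key = tuple(person[k] for k in keys)
        index.setdefault(key, []).append(person["name"])

    matches = {}
    for i, record in enumerate(anonymous):
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match means the record is re-identified
            matches[i] = names[0]
    return matches


print(reidentify(viewing_log, public_directory))
```

In this toy data only the first record is pinned to a single person; the second has no counterpart in the directory and the third is shared by two people. The more attributes a record carries, the more likely a unique match becomes.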
TiVo is neither unique nor particularly significant. The question of whether a device owned by the user is acting against their interests comes up all the time. A deceptive shortcut is to claim that open source software is better because anybody can verify it is working as intended. MythTV instead of TiVo? True, but only in the trivial sense: it holds if you went over every line of code and built it from scratch yourself. (Otherwise you are at the mercy of the authors, download sites, etc.) That approach does not scale, and better trust mechanisms are called for. Marketplace reputation of an established company in principle serves as a check: too many egregious data collection practices translate into lost revenue. But such dynamics can only operate when there is transparency and competition: when users know exactly how two different PVR vendors use their data and factor this into their purchasing decision. We are far from that level of awareness.