A recent NYT exposé on ClearView only scratches the surface of the problems with outsourcing critical law-enforcement functions to private companies. To recap: ClearView AI is possibly the first startup to have commercialized face-recognition-as-a-service (FRaaS?) and is riding high on a recent string of successes with police departments in the US. The usage model could not be any easier: upload an image of a person of interest, and ClearView locates other pictures of the same person from its massive database of images scraped from public sources such as social media. Imagine going from a grainy surveillance image taken from a security camera to the LinkedIn profile of the suspect. It is worth pointing out that the services hosting the original images, including Facebook, were none too happy about the unauthorized scraping. Nor was there any consent from users to participate in this AI experiment; as with all things social-media, privacy is just an afterthought.
Aside from the blatant disregard for privacy, what could go wrong here?
The NYT article already hints at one troubling dimension of the problem. While investigating ClearView, the NYT journalist asked various members of police departments with authorized access to the system to search for him. This experiment initially turned up several hits as expected, demonstrating the coverage of the system. But halfway through the experiment, something strange happened: suddenly the author “disappeared” from the system, with no information returned on subsequent searches, even when using the same image that had been successfully matched before. No satisfactory explanation for this came forward. At first it was chalked up to a deliberate “security feature” where the system detects and blocks unusual patterns of queries (presumably the same image being searched repeatedly?). Later the founder claimed it was a bug, and it was eventually resolved. (Reading between the lines suggests a more conspiratorial interpretation: ClearView gets wind of a journalist writing an exposé about the company and decides to remove some evidence that demonstrates the uncanny coverage of its database.)
Going with Hanlon’s razor and attributing this case of the “disappearing” person to an ordinary bug, the episode highlights two troubling issues:
- ClearView learns which individuals are being searched
- ClearView controls the results returned
Why is this problematic? Let’s start with the visibility issue, which is practically unavoidable: to run a search, the service has to see the image being searched. This means that a private company effectively knows who is under investigation by law enforcement and in which jurisdiction. Imagine if every police department CCed Facebook every time they sent an email to announce that they are opening an investigation into citizen John Smith. That is a massive amount of trust placed in a private entity that is neither accountable to public oversight nor constrained in what it can do with that information.
Granted there are other situations when private companies are necessarily privy to ongoing investigations. Telcos have been servicing wiretaps and pen-registers for decades and more recently ISPs have been tapped as a treasure trove of information on the web-browsing history of their subscribers. But as the NYT article makes clear, ClearView is no Facebook or AT&T. Large companies like Facebook, Google and Microsoft receive thousands of subpoenas every year for customer information, and have developed procedures over time for compartmentalizing the existence of these requests. (For the most sensitive category of requests such as National Security Letters and FISA warrants, there are even more restrictive procedures.) Are there comparable internal controls at ClearView? Does every employee have access to this information stream? What happens when one of those employees or one of their friends becomes the subject of an investigation?
For that matter, what prevents ClearView from capitalizing on its visibility into law-enforcement requests and trying to monetize both sides of the equation? What prevents the company from offering an “advance warning” service— for a fee of course— to alert individuals whenever they are being investigated?
Even if one posits that ClearView will act in an aboveboard manner and refrain from abusing its visibility into ongoing investigations for commercial gain, there is the question of operational security. Real-time knowledge of law enforcement actions is too tempting a target for criminals and nation states alike to pass up. What happens when ClearView is breached by the Russian mob or an APT group working on behalf of China? One can imagine face-recognition systems also being applied to counter-intelligence scenarios to track foreign agents operating on US soil. If you are the nation sponsoring those agents, you want to know when their names come under scrutiny. More importantly, you care whether it is the Poughkeepsie police department or the FBI asking the questions.
Being able to modify search results has equally troubling implications. It is a small leap from alerting someone that they are under investigation to withholding results or, better yet, deliberately returning bogus information to throw off an investigation or frame an innocent person. The statistical nature of face recognition and the incompleteness of a database cobbled together from public sources make it much easier to hide such deception. According to the Times, ClearView returns a match only about 75% of the time. (The article did not cite a figure for the false-positive rate, where the system returns results which are later proven to be incorrect.) Results withheld on purpose to protect designated individuals can easily blend in with legitimate failures to identify a face. Similarly ClearView could offer “immunity from face recognition” under the guise of Right To Be Forgotten requests, offering to delete all information about a person from its database, again for a fee presumably.
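To put rough numbers on how easily a withheld result hides among ordinary misses, here is a back-of-the-envelope sketch. It takes the roughly 75% match rate reported by the Times and adds two simplifying assumptions that are not from the article: every search is an independent trial (for example, different photos of the same person), and the 75% rate applies uniformly to everyone. The function names are made up for illustration.

```python
def prob_all_miss(match_rate: float, n_searches: int) -> float:
    """Chance that n independent, honest searches all come back empty.

    Assumes each search succeeds with probability `match_rate`,
    independently of the others (an idealization).
    """
    return (1.0 - match_rate) ** n_searches


def searches_needed(match_rate: float, suspicion_threshold: float) -> int:
    """Smallest n such that n straight misses would happen by chance
    less often than `suspicion_threshold`."""
    n = 1
    while prob_all_miss(match_rate, n) >= suspicion_threshold:
        n += 1
    return n


if __name__ == "__main__":
    rate = 0.75  # match rate reported by the Times
    print(prob_all_miss(rate, 1))       # a single miss: 0.25
    print(searches_needed(rate, 0.01))  # misses in a row before chance drops below 1%
```

Under these assumptions a single empty result proves nothing, since one in four honest searches fails anyway. An investigator would need four consecutive misses on independent images before chance alone becomes an unlikely explanation (below 1%), and few investigators have that many independent photos of a suspect or any reason to keep trying.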
As before, even if ClearView avoids such dubious business models and remains dedicated to maintaining the integrity of its database, attackers who breach ClearView infrastructure cannot be expected to have similar qualms. A few tweaks to metadata in the database could be enough to skew results. Not to mention that a successful breach is not even necessary to poison the database: Facebook and LinkedIn are full of fake accounts with bogus information. Criminals almost certainly have been building such fake online personae by mashing together bits of “true” information from different individuals.
This is a situation where ClearView spouting bromides about the importance of privacy and security will not cut it. Private enterprises afforded this much visibility into active police investigations and with this much influence over the outcome of those investigations need oversight. At a minimum companies like ClearView must be prevented from exploiting their privileged role for anything other than the stated purpose— aiding US law enforcement agencies. They need periodic independent audits to verify that sufficient security controls exist to prevent unauthorized parties from tapping into the sensitive information they are sitting on or subverting the integrity of results returned.