Steven M. Bellovin, Matt Blaze, Sandy Clark, Susan Landau, Lawful Hacking: Using Existing Vulnerabilities for Wiretapping on the Internet

Comment by: Anne McKenna

PLSC 2013

Published version available here:

Workshop draft abstract:

For years, legal wiretapping was straightforward: the officer doing the intercept connected a tape recorder or the like to a single pair of wires. By the 1990s, though, the changing structure of telecommunications—there was no longer just “Ma Bell” to talk to—and new technologies such as ISDN and cellular telephony made life more complicated. Simple technologies would no longer suffice. In response, Congress passed the Communications Assistance for Law Enforcement Act (CALEA),5 which mandated a standardized lawful intercept interface on all local phone switches. Technology has continued to progress, and in the face of new forms of communication—Skype, voice chat during multiplayer online games, many forms of instant messaging, etc.—law enforcement is again experiencing problems. The FBI has called this “Going Dark”:6 their loss of access to suspects’ communications. According to news reports, they want changes to the wiretap laws to require a CALEA-like interface in Internet software.7


CALEA, though, has its own issues: it is complex software specifically intended to create a security hole—eavesdropping capability—in the already-complex environment of a phone switch. Warnings of danger have indeed come to pass, most famously in the so-called “Athens Affair”, in which someone hacked into a Vodafone Greece switch and used the built-in lawful intercept mechanism to listen to the cell phone calls of high Greek officials, up to and including the Prime Minister.8 In an earlier work,9 we showed why extending CALEA to the Internet would create very serious problems, including, very specifically, many new security problems.

We proposed an alternative: legalized hacking, relying on the very large store of unintentional, naturally occurring vulnerabilities in software to obtain access to communications. Relying on vulnerabilities and hacking, though, poses a large set of legal and policy questions. Among these are:

  • Will it create disincentives to patching?
  • Will there be a negative effect on innovation? (Lessons from the so-called “Crypto Wars” of the 1990s are instructive here.)
  • Will law enforcement’s participation in vulnerability purchases skew the market?
  • Should law enforcement even be participating in a market where many of the sellers and other buyers are themselves criminals?
  • What happens if these tools are captured and repurposed by miscreants?
  • How does the Fourth Amendment affect use of these tools? In particular,  since they can grant full access to a computer and not just to communications, should there be statutory restrictions similar to those in the Wiretap Act?10
  • Is the probability of success from such an approach too low for it to be useful?

There are also logistical and organizational concerns. Local and even state law enforcement agencies are unlikely to have the technical sophistication to develop exploits and the legally acceptable tools to use them. This in turn implies a greater role for the FBI and its labs. Is this intrusion of Federal authorities into local policing acceptable?  Will this turn the FBI more into an intelligence agency?


1 Steven M. Bellovin is a professor of computer science at Columbia University.

2 Matt Blaze is an associate professor of computer science at the University of Pennsylvania.

3 Sandy Clark is a Ph.D. student in computer science at the University of Pennsylvania.

4 Susan Landau is a Guggenheim Fellow.

5 Pub. L. No. 103-414, 108 Stat. 4279, codified at 47 USC 1001-1010.

6 Valerie Caproni, General Counsel of the FBI, Statement Before the House Judiciary Committee, Subcommittee on Crime, Terrorism, and Homeland Security, February 17, 2011, available at

7 Declan McCullagh, “‘Dark’ motive: FBI seeks signs of carrier roadblocks to surveillance”, CNET News, Nov. 5, 2012, available at

8 Vassilis Prevelakis and Diomidis Spinellis, The Athens Affair, IEEE Spectrum, July 2007.

9 Steven M. Bellovin, Matt Blaze, Sandy Clark, and Susan Landau, “Going Bright: Wiretapping without Weakening Communications Infrastructure”, IEEE Security & Privacy, Jan/Feb 2013.

10 In particular, see the conditions that must be satisfied in 18 USC 2518(1)(c) and the enumeration of offenses in 18 USC 2516.

Steven M. Bellovin, Renée M. Hutchins, Tony Jebara, Sebastian Zimmeck, When Enough is Enough: Location Tracking, Mosaic Theory and Machine Learning

Comment by: Orin Kerr

PLSC 2013

Published version available here:

Workshop draft abstract:

Since 1967, the Supreme Court has tied our right to be free of unwanted government scrutiny to the concept of reasonable expectations of privacy.5 Reasonable expectations include, among other things, an assessment of the intrusiveness of government action. When making such assessments historically, the Court has considered police conduct with clear temporal, geographic, or substantive limits. However, in an era where new technologies permit the storage and compilation of vast amounts of personal data, things are becoming more complicated. A school of thought known as “Mosaic Theory” has stepped into the void, sounding the alarm that our old tools for assessing the intrusiveness of government conduct potentially undervalue our privacy rights.

Mosaic theorists advocate a cumulative approach to the evaluation of data collection.

Under the theory, searches are “analyzed as a collective sequence of steps rather than as individual steps.”6 The approach is based on the recognition that comprehensive aggregation of even seemingly innocuous data reveals greater insight than consideration of each piece of information in isolation. Over time, discrete units of surveillance data can be processed to create a mosaic of habits, relationships, and much more. Consequently, a Fourth Amendment analysis that focuses only on the government’s collection of discrete units of trivial data fails to appreciate the true harm of long-term surveillance—the composite.

In the context of location tracking, the Court has previously suggested that the Fourth Amendment may (at some theoretical threshold) be concerned with the accumulated information revealed by surveillance.7 Similarly, in the Court’s recent decision in United States v. Jones, a majority of concurring justices indicated willingness to explore such an approach. However, in the main, the Court has rejected any notion that technological enhancement matters to the constitutional treatment of location tracking.8 Rather, the Court has found that such surveillance in public spaces, which does not require physical trespass, is equivalent to a human tail and thus not regulated by the Fourth Amendment. In this way, the Court has avoided quantitative analysis of the amendment’s protections.

The Court’s reticence is built on the enticingly direct assertion that objectivity under the mosaic theory is impossible. This is true in large part because there has been no rationale yet offered to objectively distinguish relatively short-term monitoring from its counterpart of greater duration. As Justice Scalia, writing for the majority in United States v. Jones, recently observed: “it remains unexplained why a 4-week investigation is ‘surely’ too long.”9 This article answers that question for the first time by combining the lessons of machine learning with mosaic theory and applying the pairing to the Fourth Amendment.

Machine learning is the branch of computer science concerning systems that can draw inferences from collections of data, generally by means of mathematical algorithms. In a recent competition called “The Nokia Mobile Data Challenge,”10 researchers evaluated machine learning’s applicability to GPS and mobile phone data. From a user’s location history alone, the researchers were able to estimate the user’s gender, marital status, occupation and age.11

Algorithms developed for the competition were also able to predict a user’s likely future position by observing past location history. Indeed, a user’s future location could even be inferred with a relative degree of accuracy using the location data of friends and social contacts.12

Machine learning of the sort on display during the Nokia Challenge seeks to harness with artificial intelligence the data deluge of today’s information society by efficiently organizing data, finding statistical regularities and other patterns in it, and making predictions therefrom. It deduces information—including information that has no obvious linkage to the input data—that may otherwise have remained private due to the natural limitations of manual and human-driven investigation. Analysts have also begun to “train” machine learning programs using one dataset to find similar characteristics in new datasets. When applied to the digital “bread crumbs” of data generated by people, machine learning algorithms can make targeted personal predictions. The greater the number of data points evaluated the greater the accuracy of the algorithm’s results.
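As a hedged illustration of the kind of inference described above—not the Nokia Challenge dataset or the researchers' actual methods—a toy classifier can predict a personal attribute from a single location-history feature. All names, features, and data here are hypothetical:

```python
# Sketch: inferring a personal attribute from location-history features.
# The synthetic data, feature names, and labels are illustrative assumptions,
# not the Nokia Mobile Data Challenge data or algorithms.
import random

random.seed(0)

def make_user(label):
    # Two toy features derived from a location history:
    # fraction of evenings away from the "home" cluster, and
    # mean daily radius of travel in km.
    if label == "commuter":
        return [random.uniform(0.0, 0.2), random.uniform(20, 60)], label
    else:  # "homebody"
        return [random.uniform(0.0, 0.2), random.uniform(0, 10)], label

# A small labeled training set of (features, label) pairs.
train = [make_user(random.choice(["commuter", "homebody"])) for _ in range(200)]

def predict(features):
    # 1-nearest-neighbour on the travel-radius feature alone:
    # return the label of the training example with the closest radius.
    best = min(train, key=lambda ex: abs(ex[0][1] - features[1]))
    return best[1]

print(predict([0.1, 45]))  # a large daily travel radius classifies as "commuter"
```

Even this crude sketch shows the article's point in miniature: a label never present in the raw coordinates (here, a lifestyle category) falls out of the location data once enough labeled examples accumulate, and accuracy grows with the number of data points available for comparison.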

As this article explains, technology giveth and technology taketh away. The objective understanding of data compilation that is revealed by machine learning provides important Fourth Amendment insights. We should begin to consider these insights more closely.

In four parts, this article advances the conclusion that the duration of investigations is relevant to their substantive Fourth Amendment treatment because duration affects the accuracy of the generated composite. Though it was previously difficult to explain why an investigation of four weeks was substantively different from an investigation of four hours, we now can. As machine learning algorithms reveal, composites (and predictions) of startling accuracy can be generated with remarkably few data points. Furthermore, in some situations accuracy can increase dramatically above certain thresholds. For example, a 2012 study found the ability to deduce ethnicity improved slowly through five weeks of phone data monitoring, jumped sharply to a new plateau at that point, and then increased sharply again after twenty-eight weeks. More remarkably, the accuracy of identification of a target’s significant other improved dramatically after five days’ worth of data inputs.14 Experiments like these support the notion of a threshold, a point at which it makes sense to draw a line.

The results of machine learning algorithms can be combined with quantitative privacy definitions. For example, when viewed through the lens of k-anonymity, we now have an objective basis for distinguishing between law enforcement activities of differing duration. While reasonable minds may dispute the appropriate value of k or may differ regarding the most suitable minimum accuracy threshold, this article makes the case that the collection of data points allowing composites or predictions that exceed selected thresholds should be deemed unreasonable searches in the absence of a warrant.15 Moreover, any new rules should take into account not only the data being collected but also the foreseeable improvement in the machine learning technology that will ultimately be brought to bear on it; this includes using future algorithms on older data.
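The k-anonymity test mentioned above can be made concrete with a short sketch. The quasi-identifier fields and the value of k below are illustrative assumptions, not thresholds proposed by the article:

```python
# Sketch: checking whether a set of location-derived records is k-anonymous
# with respect to chosen quasi-identifier fields. Field names and the sample
# records are hypothetical.
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """True iff every combination of quasi-identifier values is shared by at
    least k records, so no individual sits in a group smaller than k."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return all(count >= k for count in groups.values())

records = [
    {"home_cell": "A", "work_cell": "X", "age_band": "30-39"},
    {"home_cell": "A", "work_cell": "X", "age_band": "30-39"},
    {"home_cell": "B", "work_cell": "Y", "age_band": "40-49"},
]

# The third record's (home_cell, work_cell) pair is unique, so the release
# fails 2-anonymity.
print(is_k_anonymous(records, ["home_cell", "work_cell"], 2))  # False
```

On this view, the legal line-drawing question becomes a parameter choice: once surveillance data supports composites that single out groups smaller than the chosen k (or exceed a chosen prediction-accuracy threshold), the collection would require a warrant.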

In 2001, the Supreme Court asked “what limits there are upon the power of technology to shrink the realm of guaranteed privacy.”16 In this piece, we explore what lessons there are in the power of technology to protect the realm of guaranteed privacy. The time has come for the Fourth Amendment to embrace what technology already tells us—a four-week investigation is surely too long because the amount of data collected during such an investigation creates a highly intrusive view of a person that, without a warrant, fails to comport with our constitutional limits on government.


1  Professor, Columbia University, Department of Computer Science.

2  Associate Professor, University of Maryland Carey School of Law.

3  Associate Professor, Columbia University, Department of Computer Science.

4 Ph.D. candidate, Columbia University, Department of Computer Science.

5  Katz v. United States, 389 U.S. 347, 361 (1967) (Harlan, J., concurring).

6  Orin Kerr, The Mosaic Theory of the Fourth Amendment, 111 Mich. L. Rev. 311, 312 (2012).

7 United States v. Knotts, 460 U.S. 276, 284 (1983).

8  Compare Knotts, 460 U.S. at 276 (rejecting the contention that an electronic beeper should be treated differently than a human tail) and Smith v. Maryland, 442 U.S. 735, 744 (1979) (approving the warrantless use of a pen register in part because the justices were “not inclined to hold that a different constitutional result is required because the telephone company has decided to automate.”) with Kyllo v. United States, 533 U.S. 27, 33 (2001) (recognizing that advances in technology affect the degree of privacy secured by the Fourth Amendment).

9  United States v. Jones, 132 S.Ct. 945 (2012); see also Kerr, 111 Mich. L. Rev. at 329-330.

10  See

11  Sanja Brdar, Dubravko Culibrk, and Vladimir Crnojevic, Demographic Attributes Prediction on the Real-World Mobile Data, Nokia Mobile Data Challenge Workshop 2012.

12  Manlio de Domenico, Antonio Lima, and Mirco Musolesi, Interdependence and Predictability of Human Mobility and Social Interactions, Nokia Mobile Data Challenge Workshop 2012.

14  See, e.g., Yaniv Altshuler, Nadav Aharony, Michael Fire, Yuval Elovici, Alex Pentland, Incremental Learning with Accuracy Prediction of Social and Individual Properties from Mobile-Phone Data, WS3P, IEEE Social Computing (2012), especially Figures 9 and 10.

15 Admittedly, there are differing views on sources of authority beyond the Constitution that might justify location tracking. See, e.g., Stephanie K. Pell and Christopher Soghoian, Can You See Me Now? Toward Reasonable Standards for Law Enforcement Access to Location Data That Congress Could Enact, 27 Berkeley Tech. L.J. 117 (2012).

16  Kyllo, 533 U.S. at 34.

Maritza Johnson, Tara Whalen & Steven M. Bellovin, The Failure of Online Social Network Privacy Settings II – Policy Implications

Comment by: Aaron Burstein

PLSC 2011

Workshop draft abstract:

The failure of today’s privacy controls has a number of legal and policy implications.  One concerns the Fourth Amendment.  Arguably, people have a reasonable expectation of privacy in data they have marked “private” on Facebook; conversely, such an expectation is not reasonable if they have made it available to Facebook’s 500,000,000 users.  Our results, though, show that people often cannot carry out their intentions, and that they are unaware of this fact.  Given this, we suggest that a broader view of a reasonable expectation of privacy is necessary.

There are also implications for privacy regulations.  In jurisdictions that regulate collection of data (e.g., Canada and the EU), the existence of access controls could be viewed as a consent mechanism: a user who has marked an item as publicly accessible has voluntarily waived privacy rights.  We assert that such a waiver is not a knowing one, in that people cannot carry out their intentions.

Michelle Madejski, Maritza Johnson & Steven M. Bellovin, A Study of Privacy Setting Errors in Online Social Networks

Comment by: Aaron Burstein

PLSC 2011

Workshop draft abstract:

Increasingly, people are sharing sensitive personal information via online social networks (OSN). While such networks do permit users to control what they share with whom, access control policies are notoriously difficult to configure correctly; this raises the question of whether users’ privacy settings match their intentions. We present the results of an empirical evaluation that measures privacy attitudes and sharing intentions and compares these against the actual privacy settings on Facebook. Our results indicate a serious mismatch: every one of the 65 participants in our study had at least one sharing violation. In other words, OSN users are sharing more information than they wish to. Furthermore, a majority of users cannot or will not fix such errors. We conclude that the current approach to privacy settings is fundamentally flawed and cannot be fixed; a fundamentally different approach is needed. We present recommendations to ameliorate the current problems, as well as providing suggestions for future research.