Monthly Archives: May 2013

Anjali S. Dalal, Administrative Constitutionalism and the Development of the Surveillance State

Comment by: Michael Traynor

PLSC 2013

Published version available here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2236502

Workshop draft abstract:

Administrative constitutionalism is a theory that promotes and protects laws that reflect the deliberative process. It is a flavor of popular constitutionalism that values the multi-year conversations among the various branches of government, levels of government, and the public, reflecting the evolution and slow entrenchment of a set of norms. For administrative constitutionalists, it is the dialogic process that lends legitimacy to the norm that ultimately evolves and becomes entrenched.

However, one of the dangers of administrative constitutionalism in practice is that of entrenchment before deliberation. Agencies, in their role as norm entrepreneurs, can develop and, over time, entrench norms before those norms have the opportunity to emerge from the deliberative process.  This situation threatens to create legitimacy based on historical practice and path dependency, not deliberation and consensus building.

This article provides an account of administrative constitutionalism at its best and its worst, by tracing the history of the creation and evolution of the Attorney General Guidelines, the governing document for the FBI.  In particular, this article looks at the defining features that led to the early success and later failure of administrative constitutionalism in practice and attempts to articulate the administrative architecture needed to ensure that the reality of administrative constitutionalism reflects the deliberative and democratic promise of the theory.

The article begins with a brief summary of Eskridge and Ferejohn’s theory of administrative constitutionalism. Part I provides an account of domestic surveillance law that demonstrates the power and promise of administrative constitutionalism. In this Part, I trace the growth of the FBI’s domestic surveillance practices from its early years until the development of the FBI’s first governing document, the Attorney General Guidelines. This document reflected the tenor of the time and a high point in the FBI’s protection of First Amendment rights in the face of competing national security concerns.

Part II provides an account of the subsequent iterations of the Attorney General Guidelines that illustrates the dangers of agency norm entrepreneurship and entrenchment gone unchecked. In this Part, I detail the historical evolution of the Guidelines and the attendant shift in the balance between free speech and national security, in favor of national security. This shift occurs without input from either the other branches of government or the public, but is encouraged by the norm entrenchment that follows, reflecting a failure of administrative constitutionalism in practice. This failure is marked by the return of the three dominant features of the Hoover FBI — the unchecked expansion of the FBI’s mission, the pursuit of that mission through illegal or potentially illegal means, and the creation of an intelligence gathering process cloaked in secrecy. Part III suggests a few structural changes to help architect against future failures of administrative constitutionalism. In particular, Part III explores the appropriate role of Congressional oversight, judicial intervention, and agency accountability in ensuring the success of administrative constitutionalism.

Robert H. Sloan and Richard Warner, Beyond Notice and Choice: Streetlights, Norms, and Online Consent

Comment by: Robert Gellman

PLSC 2013

Published version available here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2239099

Workshop draft abstract:

Informational privacy is the ability to determine for yourself when and how others may collect and use your information.  We assume there is good reason to ensure adequate informational privacy.  Adequate informational privacy requires a sufficiently broad ability to give or withhold free and informed consent to proposed uses; otherwise, you cannot determine for yourself how others use your information.

Notice and Choice (sometimes also called “notice and consent”) is the current paradigm for consent online. The Notice is a presentation of terms, typically in a privacy policy or terms of use agreement.  The Choice is an action signifying acceptance of the terms, typically clicking on an “I agree” button, or simply using the website.  Recent reports by the Federal Trade Commission explicitly endorse the Notice and Choice approach (and provide guidelines for its implementation). When the Notice contains information about data collection and use, the argument for Notice and Choice rests on two claims. First: a fully adequate implementation of the paradigm would ensure that website visitors can give (or withhold) free and informed consent to data collection and use practices.  Second: the combined effect of all the individual decisions is an acceptable overall tradeoff between privacy and the benefits of collecting and using consumers’ data.  There are (we contend) decisive critiques of both claims.  So why do policy makers and privacy advocates continue to endorse Notice and Choice?

An unsympathetic but not entirely inapt analogy is the old joke about the drunk searching for his keys underneath the streetlight:

A policeman sees a drunken man searching for something under a streetlight and asks the drunk what he lost. He says he lost his keys and they both look under the streetlight together.  After a few minutes the policeman asks if he is sure he lost them here, and the drunk replies, no, that he lost them in the park. “So, why are you looking under the streetlight?” asks the policeman, and the drunk replies, “This is where the light is.”

Policy makers and privacy advocates look under the streetlight of Notice and Choice even though it is clear that the consent is not there.  Why don’t they search more broadly?  Most likely, they see no need to do so.  We find the critique of Notice and Choice conclusive, but our assessment is far from widely shared—and understandably so.  Criticisms of Notice and Choice are scattered over several articles and books.  No one has unified them and answered the obvious counterarguments.  We do so in Section I.  Making the critique plain, however, is not enough to ensure that policy makers turn from the “streetlight” to the “park.” The critiques are entirely negative; they do not offer any alternative to Notice and Choice. They do not direct us to a “park” in which to search for consent.

Drawing on Helen Nissenbaum’s work, we offer an alternative: informational norms. Informational norms are social norms that constrain the collection, use, and distribution of personal information. Such norms explain, for example, why your pharmacist may inquire about the drugs you are taking, but not about whether you are happy in your marriage. When appropriate informational norms govern online data collection and use, they ensure both that visitors give free and informed consent to those practices and that the result is an acceptable overall tradeoff between protecting privacy and the benefits of processing information. A fundamental difficulty is the lack of such norms. Rapid advances in information processing technology have fueled new business models, and this rapid development has outpaced the slow evolution of norms. Notice and Choice cannot be pressed into service to remedy this lack. It is necessary to develop new norms, and in later sections of the paper we discuss how to do so.

Latanya Sweeney, Racial Discrimination in Online Ad Delivery

Comment by: Margaret Hu

PLSC 2013

Published version available here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2208240

Workshop draft abstract:

Investigating the appearance of online advertisements that imply the existence of an arrest record, this writing chronicles field experiments that measure racial discrimination in ads served by Google AdSense.  A specific company, instantcheckmate.com, sells aggregated public information about individuals in the United States and sponsors ads to appear with Google search results for searches of some exact “firstname lastname” queries. A Google search for a person’s name, such as “Trevon Jones”, may yield a personalized ad that may be neutral, such as “Looking for Trevon Jones? Comprehensive Background Report and More…”, or may be suggestive of an arrest record (Suggestive ad), such as “Trevon Jones, Arrested?…” or “Trevon Jones: Truth. Arrests and much more. … “

Field experiments documented in this writing show racial discrimination in ad delivery based on searches of 2200 personal names across two websites. First names documented by others as being assigned primarily to black babies, such as Tyrone, Darnell, Ebony and Latisha, generated ads suggestive of an arrest 75 percent and 96 percent of the time, while names having a first name documented by others as being assigned at birth primarily to whites, such as Geoffrey, Brett, Kristen and Anne, generated more neutral copy: the word “arrest” appeared 0 to 9 percent of the time. A few names did not follow these patterns: Brad, a name predominantly given to white babies, generated a Suggestive ad 62 percent to 65 percent of the time. All ads returned results for actual individuals, and Suggestive ads appeared regardless of whether the subjects have an arrest record in the company’s database. Notwithstanding these findings, the company maintains that Google received the same ad copy for groups of last names (not first names), raising questions as to whether Google’s algorithm exposes racial bias in society.
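
To make the measurement concrete, the following minimal sketch (with invented counts, not the study’s data or code) shows how one might tabulate Suggestive versus neutral ad deliveries for each name group and test whether the observed difference could plausibly be due to chance:

```python
# Illustrative sketch only: invented counts, not the study's data.
# Tally how often a Suggestive (arrest-related) ad versus a neutral ad was
# served for searches on names associated with each group, then ask whether
# the difference in rates could plausibly arise by chance.
from scipy.stats import chi2_contingency

# Rows: name group; columns: [suggestive ads served, neutral ads served].
observed = [
    [1100, 340],   # first names documented as given primarily to black babies
    [60, 1500],    # first names documented as given primarily to white babies
]

chi2, p_value, dof, expected = chi2_contingency(observed)

rate_black = observed[0][0] / sum(observed[0])
rate_white = observed[1][0] / sum(observed[1])
print(f"Suggestive-ad rate, black-identifying names: {rate_black:.0%}")
print(f"Suggestive-ad rate, white-identifying names: {rate_white:.0%}")
print(f"Chi-squared test p-value: {p_value:.2g}")
```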

Neil M. Richards, Data Privacy and the Right to be Forgotten after Sorrell

Comment by: Lauren Gelman

PLSC 2013

Workshop draft abstract:

This paper takes on the argument that the First Amendment also bars the government from regulating some or all of the commercial trade in personal information. This argument had some success in the federal appellate courts in the 1990s and 2000s, and the Supreme Court seems to have recently embraced it. In the 2011 case of Sorrell v. IMS, the Court held that the First Amendment prohibited a state from regulating the use of doctors’ prescribing data for marketing purposes. These cases present the issue of whether the sale of commercial data is “free speech” or not, and the consequences of this decision for consumers and citizens. But what is really at stake in Sorrell and in the cases that will inevitably follow it are two very different conceptions of our Information Society. From the perspective of the First Amendment critics (and possibly a majority of the Roberts Court), flows of commercial information are truthful speech protected by the First Amendment. This argument has received a boost from the Court’s decision in Sorrell and from the poorly named and poorly articulated “Right to Be Forgotten” that is on the rise in European data protection circles. By contrast, privacy scholars and activists have often failed to explain why privacy is worth protecting in these cases, particularly in the face of the constitutional arguments on the other side.

Building on my earlier work on this important question, I argue that the First Amendment critique of data privacy law is largely unpersuasive.  While the First Amendment critique of the privacy torts and some broad versions of the Right to Be Forgotten do threaten important First Amendment interests, the broader First Amendment critique of data privacy protections for consumers does not.  Unlike news articles, blog posts, or even gossip, which are expressive speech by human beings, the commercial trade in personal data uses information as a commodity traded from one computer to another.  The data trade is much more commercial than expressive, and the Supreme Court has long held for good reason that the sale of information is commercial activity that receives very little First Amendment protection.  We must keep faith with this tradition in our law rather than abandoning it.  Moreover, I will show how a more modest form of the Right to be Forgotten can be protected consistent with the First Amendment, and even with Sorrell.  But I argue that we should rethink our use of the general term “privacy” to deal with the commercial trade in personal information.  The use of “privacy” connotes tort-centered notions protecting against extreme emotional harm – a model that poorly tracks the modern trade in personal information, which gives opponents of consumer data protection unnecessary ammunition.  A better way to understand the consumer issues raised by the trade in personal information is the European concept of “data protection,” or perhaps just “data privacy.”  Putting the old tort-focused conception of anti-disclosure away allows us to better understand the problem, and can suggest better solutions.  The regulation of data privacy should focus on managing the flows of commercial personal information and allowing greater consumer input into decisions made on the basis of data about them.

But however we as a society choose to regulate data flows, we should be able to choose. We should not be sidetracked by misleading First Amendment arguments, because the costs of not regulating the trade in commercial data are significant. As we enter the Information Age, where the trade in information is a multi-billion dollar industry, government should be able to regulate the huge flows of personal information, as well as the uses to which this information can be put. At the dawn of the Industrial Age, business interests persuaded the Supreme Court in the Lochner case that the freedom of contract should immunize them from regulation. I explain how and why we should reject the calls of First Amendment critics for a kind of digital Lochner for personal information, and I show how we can have consumer protection law in the Information Age without sacrificing meaningful free speech.

(This paper is an adaptation for PLSC of portions of Chapters 5 and 11 of my book, Intellectual Privacy: Rethinking Civil Liberties in the Information Age (forthcoming, Oxford University Press 2014)).

Seda Gürses, “Privacy is don’t ask, confidentiality is don’t tell” An empirical study of privacy definitions, assumptions and methods in computer science research and Robert Sprague and Nicole Barberis, An Ontology of Privacy Law Derived from Probabilistic Topic Modeling Applied to Scholarly Works Using Latent Dirichlet Allocation (joint workshop)

Comment by: Helen Nissenbaum

PLSC 2013

Workshop draft abstract:

Since the end of the 1960s, computer scientists have engaged in research on privacy and information systems. Over the years, this research has led to a whole palette of “privacy solutions”. These vary from design principles and privacy tools to the application of privacy-enhancing techniques. These solutions originate from diverse sub-fields of computer science, e.g., security engineering, databases, software engineering, HCI, and artificial intelligence. From a bird’s eye view, all of these researchers are studying privacy. However, a closer look reveals that each community of researchers relies on different, sometimes even conflicting, definitions of privacy, and on a variety of social and technical assumptions. At best, they are referring to different facets of privacy and, at worst, they fail to take into account the diversity of existing definitions and to integrate knowledge on the phenomenon generated by other communities (Gürses and Diaz, 2013). Researchers do have a tradition of assessing the (implicit) definitions and assumptions that underlie the studies in their respective communities (Goldberg, 2002; Patil et al., 2006). However, a systematic evaluation of privacy research practice across the different computer science communities is so far absent. This paper contributes to closing this gap through an empirical study of privacy research in computer science. The focus of the paper is on the different notions of privacy that the 30 interviewed privacy researchers employ, as well as on the dominant worldviews that inform their practice. Through a qualitative analysis of their responses using grounded theory, we consider how the researchers’ framing of privacy affects what counts as “worthwhile problems” and “acceptable scientific evidence” in their studies (Orlikowski and Baroudi, 1991). We further analyze how these conceptions of the problem prestructure the potential solutions to privacy in their fields (Van Der Ploeg, 2005).


We expect the results to be of interest beyond the confines of computer science. Previous studies on how privacy is conceived and addressed in practice have brought new perspectives to “privacy on the books” (Bamberger and Mulligan, 2010): users’ changing articulations of privacy in networked publics (danah boyd, 2007), the evolution of privacy as practiced by private organizations (Bamberger and Mulligan, 2010), the conceptualization of privacy in legal practice (Solove, 2006), or the framing of privacy in media coverage in different cultural contexts (Petrison and Wang, 1995). However, few studies have turned their gaze on the researchers themselves with the objective of providing a critical reflection of the field (Smith et al., 2011). The few studies that exist in computer science focus on artifacts produced by the researchers, e.g., publications, or provide an analysis of the state of the art written by insiders. While these are valuable contributions, we expect the comparative and empirical nature of our study to provide deep and holistic insight into privacy research in computer science.


Kenneth A. Bamberger and Deirdre K. Mulligan. Privacy on the Books and on the Ground. Stanford Law Review, 63:247–316, 2010.

danah boyd. Why Youth (Heart) Social Network Sites: The Role of Networked Publics in Teenage Social Life. The John D. and Catherine T. MacArthur Foundation Series on Digital Media and Learning, pages 119–142, 2007. URL http://www.mitpressjournals.org/doi/abs/10.1162/dmal.9780262524834.119.

Ian Goldberg. Privacy-enhancing technologies for the Internet, II: Five years later. In Proc. of PET 2002, LNCS 2482, pages 1–12. Springer, 2002.

Seda Gürses and Claudia Diaz. An activist and a consumer meet at a social network … IEEE Security and Privacy (submitted), 2013.

Wanda J. Orlikowski and Jack J. Baroudi. Studying Information Technology in Organizations: Research Approaches and Assumptions. Information Systems Research, 2(1):1–28, 1991.

Sameer Patil, Natalia Romero, and John Karat. Privacy and HCI: Methodologies for studying privacy issues. In CHI ’06 Extended Abstracts on Human Factors in Computing Systems, CHI EA ’06, pages 1719–1722, New York, NY, USA, 2006. ACM. ISBN 1-59593-298-4. doi: 10.1145/1125451.1125771. URL http://doi.acm.org/10.1145/1125451.1125771.

Lisa A. Petrison and Paul Wang. Exploring the dimensions of consumer privacy: an analysis of coverage in British and American media. Journal of Direct Marketing, 9(4):19–37, 1995. ISSN 1522-7138. doi: 10.1002/dir.4000090404. URL http://dx.doi.org/10.1002/dir.4000090404.

H. Jeff Smith, Tamara Dinev, and Heng Xu. Information Privacy Research: An Interdisciplinary Review. MIS Quarterly, 35(4):989–1016, December 2011. ISSN 0276-7783. URL http://dl.acm.org/citation.cfm?id=2208940.2208950.

Daniel J. Solove. A Taxonomy of Privacy. University of Pennsylvania Law Review, 154(3), January 2006.

Irma Van Der Ploeg. Keys To Privacy. Translations of “the privacy problem” in Information Technologies, pages 15–36. Maastricht: Shaker, 2005.

Alan Rubel and Ryan Biava, A Framework for Comparing Privacy States

Comment by: Judith DeCew

PLSC 2013

Workshop draft abstract:

This paper develops a framework for analyzing and comparing privacy and privacy protections across (inter alia) time, place, and polity and for examining factors that affect privacy and privacy protection. This framework provides a way to describe precisely aspects of privacy and context and a flexible vocabulary and notation for such descriptions and comparisons. Moreover, it links philosophical and conceptual work on privacy to social science and policy work and accommodates different conceptions of the nature and value of privacy. The paper begins with an outline of the framework. It then refines the view by describing a hypothetical application. Finally, it applies the framework to a real-world privacy issue—campaign finance disclosure laws in the U.S. and in France. The paper concludes with an argument that the framework offers important advantages to privacy scholarship and for privacy policy makers.

Buzz Scherr, Genetic Privacy and Police Practices

Comment by: Paul Frisch

PLSC 2013

Workshop draft abstract:

Genetic privacy and police practices have come to the fore in the criminal justice system. Case law and stories in the media document that police are surreptitiously harvesting the out-of-body DNA of putative suspects. Some sources even indicate that surreptitious data banking may also be in its infancy. Surreptitious harvesting of out-of-body DNA by the police is currently unregulated by the Fourth Amendment. The few courts that have addressed the issue find that the police are free to harvest DNA abandoned by a putative suspect in a public place. Little in the nascent surreptitious harvesting case law suggests that surreptitious data banking would be regulated under current judicial conceptions of the Fourth Amendment either.

The surreptitious harvesting courts have misapplied the Katz reasonable-expectation-of-privacy test recently reaffirmed in U.S. v. Jones by the Supreme Court. They have taken a mistakenly narrow, property-based approach to their analyses. Given the potential for future abuse of the freedom to collect anyone’s out-of-body DNA without even a hunch, this article proposes that the police do not need a search warrant or probable cause to seize an abandoned item in or on which cells and DNA exist. But they do need a search warrant supported by probable cause to enter the cell and harvest the DNA.

An interdisciplinary perspective on the physical, informational and dignitary dimensions of genetic privacy suggests that an expectation of privacy exists in the kaleidoscope of identity that is in out-of-body DNA. Using linguistic theory on the use of metaphors, the article also examines the use of DNA metaphors in popular culture as a reference point to explain a number of features of core identity in contrast to the superficiality of fingerprint metaphors.  Popular culture’s frequent uses of DNA as a reference point reverberate in a way that suggests that society does recognize as reasonable an expectation of privacy in DNA.

Marc Blitz, The Law and Political Theory of “Privacy Substitutes”

Comment by: Ian Kerr

PLSC 2013

Workshop draft abstract:

The article explores the question of when government officials should in some cases be permitted to take measures that lessen individuals’ informational privacy – on the condition that they in some sense compensate for it “in kind” – either by (i) recreating this privacy in a different form or (ii) providing individuals with some other kind of legal protection which assures, for example, that the information disclosed by the government will not be used to impose other kinds of harm.

My aim in the article is to make three points.  First, I explore the ways in which the concept of a privacy substitute already plays a role in at least two areas of Fourth Amendment law:

  1. The case law on “special needs” and administrative searches, which discusses when “constitutionally adequate substitute[s]” for a warrant (to use the language of New York v. Burger (1987)) or statutory privacy protections (such as those in the DNA Act) may compensate for the absence of a warrant or other privacy safeguards, and
  2. cases holding that certain technologies which allow individuals to gather information from a private environment (such as a closed container) might be deemed “non-searches” if the technologies have built-in limitations assuring that they do not gather information beyond the presence of contraband material or other information in which there is no “reasonable expectation of privacy” under the Fourth Amendment.

In each of these cases, I argue, courts have relied on certain assumptions – some of them problematic – about when certain kinds of statutory, administrative, or technological privacy protections may be substituted for more familiar constitutional privacy protections such as warrant requirements.

Second, I argue that, while such cases have sometimes set the bar too low for government searches, “privacy substitutes” of this sort can and should play a role in Fourth Amendment jurisprudence, and also perhaps in First Amendment law on anonymous speech and other constitutional privacy protections.  In fact, I will argue, there are situations where technological developments may make such “privacy substitutions” not merely helpful in saving certain government measures from invalidation, but essential for replacing certain kinds of privacy safeguards that would otherwise fall victim to technological changes (such as advances in location tracking and video surveillance technology which undermine the features of the public environment individuals could previously rely upon to find privacy in public settings).

Third, focusing on the example of protections for anonymous speech in First Amendment law, I explore under what circumstances government should, in some cases, be permitted to replace privacy protections not with new kinds of privacy protection, but rather with other legal measures that serve the same end — for example, measures that provide the liberty, or sanctuary from retaliation, that privacy is sometimes relied upon for.

Lauren E. Willis, Why Not Privacy by Default?

Comment by: Michael Geist

PLSC 2013

Workshop draft abstract:

We live in a Track-Me world.   Firms collect reams of personal data about all of us, for marketing, pricing, and other purposes.  Most people do not like this.  Policymakers have proposed that people be given choices about whether, by whom, and for what purposes their personal information is collected and used.  Firms claim that consumers already can opt out of the Track-Me default, but that choice turns out to be illusory.  Consumers who attempt to exercise this choice find their efforts stymied by the limited range of options firms actually give them and technology that bypasses consumer attempts at self-determination.  Even if firms were to provide consumers with the option to opt out of tracking completely and to respect that choice, opting out would likely remain so cumbersome as to be impossible for the average consumer.

In response, some have suggested switching the default rule, such that firms (or some firms) would not be permitted to collect (or collect in some manners) and/or use (or use for some purposes) personal data (or some types of personal data) unless consumers opt out of a “Do-Not-Track” default.  Faced with this penalty default, firms ostensibly would be forced to clearly explain to consumers how to opt out of the default and to justify to consumers why they should opt into a Track-Me position.  Consumers could then, the reasoning goes, freely exercise informed choice in selecting whether to be tracked.

Industry vigorously opposes a Do-Not-Track default, arguing that Track-Me is the better position for most consumers and that the positive externalities created by tracking justify keeping that as the default, if not unwaivable, position.  Some privacy advocates oppose both Track-Me and Do-Not-Track defaults on the grounds that the negative externalities created by tracking justify refusing to allow any consumers to consent to tracking at all.

Here I caution against the use of a Do-Not-Track default on different grounds.  Lessons from the experience of consumer-protective defaults in other realms counsel that a Do-Not-Track default is likely to be slippery.  The very same transaction barriers and decisionmaking biases that can lead consumers to stick with defaults in some situations can be manipulated by firms to induce consumers to opt out of a Do-Not-Track default.  Rather than forcing firms to clearly inform consumers of their options and allowing consumers to exercise informed choice, a Do-Not-Track default will provide firms with opportunities to confuse many consumers into opting out.  Once a consumer opts out of a default position, courts, commentators, and the consumer herself are more likely to blame the consumer for any adverse consequences that might befall her.  The few sophisticated consumers who are able to effectively control whether they are tracked will benefit, but at the expense of the majority who will lack effective self-determination in this realm.  A Do-Not-Track default might be a necessary policy way station en route to a scheme of privacy-protective mandates for political reasons, but it also might defuse the political will to implement such a scheme without meaningfully changing the lack of choice inherent in today’s Track-Me world.

I use “track” to mean all forms of personal data collection and use beyond those that are reasonably expected for the immediate transaction at hand.  So, for example, a consumer who provides her address to her bank expects it to be used for the purposes of mailing her information about her accounts, but does not expect it to be used to decide whether or at what price to offer her auto insurance.

Steven M. Bellovin, Renée M. Hutchins, Tony Jebara, Sebastian Zimmeck, When Enough is Enough: Location Tracking, Mosaic Theory and Machine Learning

Comment by: Orin Kerr

PLSC 2013

Published version available here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2320019

Workshop draft abstract:

Since 1967, the Supreme Court has tied our right to be free of unwanted government scrutiny to the concept of reasonable expectations of privacy.5 Reasonable expectations include, among other things, an assessment of the intrusiveness of government action. When making such assessments historically, the Court has considered police conduct with clear temporal, geographic, or substantive limits. However, in an era where new technologies permit the storage and compilation of vast amounts of personal data, things are becoming more complicated. A school of thought known as “Mosaic Theory” has stepped into the void, ringing the alarm that our old tools for assessing the intrusiveness of government conduct potentially undervalue our privacy rights.

Mosaic theorists advocate a cumulative approach to the evaluation of data collection. Under the theory, searches are “analyzed as a collective sequence of steps rather than as individual steps.”6 The approach is based on the recognition that comprehensive aggregation of even seemingly innocuous data reveals greater insight than consideration of each piece of information in isolation. Over time, discrete units of surveillance data can be processed to create a mosaic of habits, relationships, and much more. Consequently, a Fourth Amendment analysis that focuses only on the government’s collection of discrete units of trivial data fails to appreciate the true harm of long-term surveillance—the composite.

In the context of location tracking, the Court has previously suggested that the Fourth Amendment may (at some theoretical threshold) be concerned with the accumulated information revealed by surveillance.7 Similarly, in the Court’s recent decision in United States v. Jones, a majority of concurring justices indicated willingness to explore such an approach. However, in the main, the Court has rejected any notion that technological enhancement matters to the constitutional treatment of location tracking.8 Rather, the Court has found that such surveillance in public spaces, which does not require physical trespass, is equivalent to a human tail and thus not regulated by the Fourth Amendment. In this way, the Court has avoided quantitative analysis of the amendment’s protections.

The Court’s reticence is built on the enticingly direct assertion that objectivity under the mosaic theory is impossible. This is true in large part because no rationale has yet been offered to objectively distinguish relatively short-term monitoring from its counterpart of greater duration. As Justice Scalia, writing for the majority in United States v. Jones, recently observed: “it remains unexplained why a 4-week investigation is ‘surely’ too long.”9 This article answers that question for the first time by combining the lessons of machine learning with mosaic theory and applying the pairing to the Fourth Amendment.

Machine learning is the branch of computer science concerning systems that can draw inferences from collections of data, generally by means of mathematical algorithms. In a recent competition called “The Nokia Mobile Data Challenge,”10 researchers evaluated machine learning’s applicability to GPS and mobile phone data. From a user’s location history alone, the researchers were able to estimate the user’s gender, marital status, occupation and age.11

Algorithms developed for the competition were also able to predict a user’s likely future position by observing past location history. Indeed, a user’s future location could even be inferred with a relative degree of accuracy using the location data of friends and social contacts.12

Machine learning of the sort on display during the Nokia Challenge seeks to harness with artificial intelligence the data deluge of today’s information society by efficiently organizing data, finding statistical regularities and other patterns in it, and making predictions therefrom. It deduces information—including information that has no obvious linkage to the input data—that may otherwise have remained private due to the natural limitations of manual and human-driven investigation. Analysts have also begun to “train” machine learning programs using one dataset to find similar characteristics in new datasets. When applied to the digital “bread crumbs” of data generated by people, machine learning algorithms can make targeted personal predictions. The greater the number of data points evaluated, the greater the accuracy of the algorithm’s results.
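
As a rough illustration of the kind of inference described above, the following sketch (synthetic data and assumed features, not the challenge entrants’ methods) trains a standard classifier on location-derived features to predict a demographic attribute:

```python
# Sketch with synthetic data, not the challenge entrants' models: train a
# standard classifier on features derived from a location history to predict
# a demographic attribute, as in the inferences described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical per-user features computed from location traces, e.g.
# [share of time at home, share of time at workplaces,
#  radius of gyration in km, distinct places visited per week].
X = rng.random((500, 4))
# Hypothetical binary attribute to be inferred (e.g., a gender flag);
# random here, so the sketch shows only the mechanics, not real predictive power.
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

On real mobility features, models of this general sort are what allowed the challenge researchers to estimate attributes such as gender, marital status, occupation, and age from location history alone.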

As this article explains, technology giveth and technology taketh away. The objective understanding of data compilation that is revealed by machine learning provides important Fourth Amendment insights. We should begin to consider these insights more closely.

In four parts, this article advances the conclusion that the duration of investigations is relevant to their substantive Fourth Amendment treatment because duration affects the accuracy of the generated composite. Though it was previously difficult to explain why an investigation of four weeks was substantively different from an investigation of four hours, we now can. As machine learning algorithms reveal, composites (and predictions) of startling accuracy can be generated with remarkably few data points. Furthermore, in some situations accuracy can increase dramatically above certain thresholds. For example, a 2012 study found the ability to deduce ethnicity improved slowly through five weeks of phone data monitoring, jumped sharply to a new plateau at that point, and then increased sharply again after twenty-eight weeks. More remarkably, the accuracy of identification of a target’s significant other improved dramatically after five days’ worth of data inputs.14 Experiments like these support the notion of a threshold, a point at which it makes sense to draw a line.

The results of machine learning algorithms can be combined with quantitative privacy definitions. For example, when viewed through the lens of k-anonymity, we now have an objective basis for distinguishing between law enforcement activities of differing duration. While reasonable minds may dispute the appropriate value of k or may differ regarding the most suitable minimum accuracy threshold, this article makes the case that the collection of data points allowing composites or predictions that exceed selected thresholds should be deemed unreasonable searches in the absence of a warrant.15 Moreover, any new rules should take into account not only the data being collected but also the foreseeable improvement in the machine learning technology that will ultimately be brought to bear on it; this includes using future algorithms on older data.
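
The following sketch illustrates how such a threshold rule might be expressed; the accuracy curve and the 0.80 cutoff are assumptions chosen for illustration, not values drawn from the article or the studies it cites:

```python
# Illustrative line-drawing rule; the accuracy curve and threshold below are
# assumptions for demonstration, not figures from the article or cited studies.
ACCURACY_THRESHOLD = 0.80  # hypothetical accuracy beyond which the composite
                           # is deemed too revealing to collect without a warrant

def composite_accuracy(days_of_data: int) -> float:
    """Stand-in accuracy curve: rises quickly with more data, then saturates.
    A real rule would use the measured accuracy of the relevant algorithm."""
    return 1.0 - 0.6 * (0.75 ** days_of_data)

def warrant_required(days_of_data: int) -> bool:
    """Apply the threshold rule to a proposed duration of location tracking."""
    return composite_accuracy(days_of_data) >= ACCURACY_THRESHOLD

for days in (1, 3, 5, 7, 14, 28):
    print(f"{days:>2} days: accuracy ~{composite_accuracy(days):.2f}, "
          f"warrant required: {warrant_required(days)}")
```

Under a rule of this form, reasonable minds could still dispute where to set the threshold, but the dispute would be about a measurable quantity rather than an unexplained intuition about duration.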

In 2001, the Supreme Court asked “what limits there are upon the power of technology to shrink the realm of guaranteed privacy.”16 In this piece, we explore what lessons there are in the power of technology to protect the realm of guaranteed privacy. The time has come for the Fourth Amendment to embrace what technology already tells us—a four-week investigation is surely too long because the amount of data collected during such an investigation creates a highly intrusive view of a person that, without a warrant, fails to comport with our constitutional limits on government.


1  Professor, Columbia University, Department of Computer Science.

2  Associate Professor, University of Maryland Carey School of Law.

3  Associate Professor, Columbia University, Department of Computer Science.

4 Ph.D. candidate, Columbia University, Department of Computer Science.

5  Katz v. United States, 389 U.S. 347, 361 (1967) (Harlan, J., concurring).

6  Orin Kerr, The Mosaic Theory of the Fourth Amendment, 111 Mich. L. Rev. 311, 312 (2012).

7 United States v. Knotts, 460 U.S. 276, 284 (1983).

8  Compare Knotts, 460 U.S. at 276 (rejecting the contention that an electronic beeper should be treated differently than a human tail) and Smith v. Maryland, 442 U.S. 735, 744 (1979) (approving the warrantless use of a pen register in part because the justices were “not inclined to hold that a different constitutional result is required because the telephone company has decided to automate.”) with Kyllo v. United States, 533 U.S. 27, 33 (2001) (recognizing that advances in technology affect the degree of privacy secured by the Fourth Amendment).

9  United States v. Jones, 132 S.Ct. 945 (2012); see also Kerr, 111 Mich. L. Rev. at 329-330.

10  See http://research.nokia.com/page/12340.

11  Demographic Attributes Prediction on the Real-World Mobile Data, Sanja Brdar, Dubravko Culibrk, and Vladimir Crnojevic, Nokia Mobile Data Challenge Workshop 2012.

12  Interdependence and Predictability of Human Mobility and Social Interactions, Manlio de Domenico, Antonio Lima, and Mirco Musolesi, Nokia Mobile Data Challenge Workshop 2012.

14  See, e.g., Yaniv Altshuler, Nadav Aharony, Michael Fire, Yuval Elovici, Alex Pentland, Incremental Learning with Accuracy Prediction of Social and Individual Properties from Mobile-Phone Data, WS3P, IEEE Social Computing (2012), especially Figures 9 and 10.

15 Admittedly, there are differing views on sources of authority beyond the Constitution that might justify location tracking. See, e.g., Stephanie K. Pell and Christopher Soghoian, Can You See Me Now? Toward Reasonable Standards for Law Enforcement Access to Location Data That Congress Could Enact, 27 Berkeley Tech. L.J. 117 (2012).

16  Kyllo, 533 U.S. at 34.