A. Michael Froomkin, Privacy Impact Notices

Comment by: Stuart Shapiro

PLSC 2013

Workshop draft abstract:

The systematic collection of personal data is a big and urgent problem, and the pace of that collection is accelerating as the cost of collection plummets.  Worse, the continued development of data processing technology means that this data can be used and cross-indexed increasingly effectively and cheaply.  Add in the fact that there is more and more historical data (and self-reported data) to which the sensor data can be linked, and we will soon find ourselves in the equivalent of a digital goldfish bowl.

It is time – or even past time – to do something.  In this paper I suggest we borrow home-grown solutions from US environmental law.   By combining the best features of a number of existing environmental laws and regulations, and — not least — by learning from some of their mistakes, we can craft rules about data collection that would go some significant distance towards stemming the tide of privacy-destroying technologies being, and about to be, deployed.

I propose that we should require Privacy Impact Notices (PINs) before allowing large public or private projects which risk having a substantial impact on personal information privacy or on privacy in public. [“Privacy Impact Statements” would make for better parallelism with Environmental Impact Statements but the plural form of the acronym would be unfortunate.] The PINs requirement would be modeled on existing environmental laws, notably the National Environmental Policy Act of 1969 (NEPA), the law that called into being  the Environmental Impact Statement (EIS).  A PINs rule would be combined with other reporting requirements modeled on the Toxics Release Inventory (TRI). It would also take advantage of progress in ecosystem modeling, particularly the insight that complex systems like ecologies, whether of living things or the data about them, are dynamic systems that must be re-sampled over time in order to understand how they are changing and whether mitigation measures or legal protections are working.

The overarching goals of this regulatory scheme are familiar ones from environmental law and policy-making: to inform the public of decisions being considered (or made) that affect it, to solicit public feedback as plans are designed, and to encourage decision-makers to consider privacy — and public opinion — from an early stage in their design and approval processes.  That was NEPA’s goal, however imperfectly achieved. In addition, however, because the relevant technologies change quickly, and because the accumulation of personal information by those gathering data can have unexpected synergistic effects as we learn new ways of linking previously disparate data sets, we now know from the environmental law and policy experience that it is also important to invest effort in on-going, or at least annual, reporting requirements in order to allow the periodic re-appraisal of the legitimacy and net social utility of the regulated activity (here, data collection programs).

There is an important threshold issue. Privacy regulation today differs from contemporary environmental regulation in one particularly important way: there are relatively few laws and rules on the books that protect data privacy (or privacy in public).  Thus, privacy law today more closely resembles anti-pollution law before the Clean Air Act or the Clean Water Act. NEPA’s rules are triggered by state action: a government project, or a request to issue a permit.  In order to give the PINs system traction outside of direct governmental data collection, additional regulation reaching private conduct will be required.  That could be direct regulation of large private-sector data gathering or, as a first step, it could be something less effective but easier to legislate, such as a rule reaching all government contractors and suppliers.  Legislation could be federal, but it might also be effective at the state level.

The proposals in this paper intersect with active and on-going debates over the value of notice policies.  They build on, but in at least one critical way diverge from, the work of Dennis D. Hirsch, who in 2006 had the important insight — even truer today — that many privacy problems resemble pollution problems and that therefore privacy-protective regulation could profitably be based on the latest learning from environmental law.

Michael Froomkin, Lessons Learned Too Well

Comment by: Anne McKenna

PLSC 2012

Published version available here:

Workshop draft abstract:

A decade ago the Internet was already subject to a significant degree of national legal regulation.  This first generation of internet law was somewhat patchy and often reactive.  Some legal problems were solved by simple categorization, whether by court decisions, administrative regulation, or statute.  Other problems required new approaches: the creation of new categories or new institutions.  And in some cases, governments in the US and elsewhere brought out the big guns of direct legislation, sometimes with stiff penalties.

The past decade has seen the crest of the first wave of regulation and the gathering of a second, stronger, wave based on a better understanding of the Internet and of law’s ability to shape and control it.  Aspects of this second wave are encouraging: Internet regulation is increasingly based on a sound understanding of the technology, minimizing pointless rules or unintended consequences. But other aspects are very troubling: where a decade ago it was still reasonable to see the Internet technologies as empowering and anti-totalitarian, now regulators in both democratic and totalitarian states have learned to structure rules that previous techniques cannot easily evade, leading to previously impossible levels of regulatory control.

On balance, that trend seems likely to continue.  One result that seems likely to follow from current trends in centralization and smarter and more global regulation is legal restriction, and perhaps the prohibition, of online anonymity.  As a practical matter, the rise of identification technologies combined with commercial and regulatory incentives has made it difficult for any but sophisticated users to remain effectively anonymous.  First wave internet regulation could not force the identification of every user and packet, but second wave regulation is more international, more adept, and benefits from technological change driven by synergistic commercial and regulatory objectives.  Law which harnesses technology to its ends achieves far more than law regulating outside technology or against it.

The consequences of an anonymity ban are likely to be negative. This paper attempts to explain how we came to this pass, and what should be done to avoid making the problem worse.

Part One of this article discusses the first wave of Internet regulation, before the year 2000, focusing on US law.  This parochial focus is excusable because even at the start of the 21st Century a disproportionate number of Internet users were in the US.  And, with only a very few exceptions (the greatest of which involve aspects of privacy law emanating from the EU’s Privacy Directive), the US either led or at least typified most of the First Wave regulatory developments.

The second wave of regulation has been much more global, so in Part Two, which concerns the most recent decade, the paper’s focus expands geographically, but narrows to specifically anonymity-related developments.  Part A describes private incentives and initiatives that resulted in the deployment of a variety of technologies and private services, each of which is unfriendly to anonymous communication.  Part B looks at three types of government regulation relevant to anonymity: the general phenomenon of chokepoint regulation, and the more specific phenomena of online identification requirements and data retention (which can be understood as a special form of identification).

Part Three examines competing trends that may shape the future of anonymity regulation.  It takes a pessimistic view of the likelihood that, given the rapid pace of technical and regulatory changes, the fate of online anonymity in the next decade will be determined by law rather than by the deployment of new technologies or, most likely, pragmatic political choices.  It therefore offers normative and pragmatic arguments for why anonymity is worth preserving and concludes with questions that proponents of further limits on anonymous online speech should be expected to answer.

Goaded by factors ranging from traditional public order concerns to fear of terrorism and hacking to public disclosures by WikiLeaks and others, both democratic and repressive governments are increasingly motivated to attempt to identify the owners of every packet online, and to create legal requirements that will assist in that effort.  Yet whether a user can remain anonymous or must instead use tools that identify him is fundamental to communicative freedom online.  One who can reliably identify speakers and listeners can often tell what they are up to even if he is not able to eavesdrop on the content of their communications; getting the content makes the intrusion and the potential chilling effects that much greater.  Content industries with copyrights to protect, firms with targeted ads to market, and governments with law enforcement and intelligence interests to protect all now appreciate the value of identification, and the additional value of traffic analysis, not to mention the value of access to content on demand, or even the threat of it.

Online anonymity is closely related to a number of other issues that contribute to communicative freedom, and thus enhance civil liberties, such as the free use of cryptography, and the use of tools designed to circumvent online censorship and filtering.  One might reasonably ask why, then, this essay concentrates on anonymity, and on its inverse, identification technologies. The reason is that anonymity is special, arguably more essential to online freedom than any other tool except perhaps cryptography (and one of the important functions of cryptography is to enable or enhance anonymity as well as communications privacy).  Without the ability to be anonymous, the use of any other tool, even encrypted communications, can be traced back to the source.  Gentler governments may use traffic analysis to piece together networks of suspected dissidents, even if the government cannot acquire the content of their communications.  Less-gentle governments will use less-gentle means to pressure those whose communications they acquire and identify. Whether or not the ability to be anonymous is sufficient to permit circumvention of state-sponsored communications control, it is necessary to ensure that those who practice circumvention in the most difficult circumstances have some confidence that they may survive it.

Colin J. Bennett: In Defense of Privacy: The Concept and the Regime

Comment by: Michael Froomkin

PLSC 2011

Workshop draft abstract:

For many years those scholars interested in the nature and effects of “surveillance” have been generally critical of “privacy” as a concept, as a way to frame the political and social issues, and as a regime of governance. “Privacy” and all that it entails is considered too narrow, too based on liberal assumptions about subjectivity, too implicated in rights-based theory and discourse, insufficiently sensitive to the discriminatory aspects of surveillance, culturally relative, overly embroiled in spatial metaphors about “invasion” and “intrusion,” and ultimately practically ineffective.

On closer examination, however, I suggest that the critiques of privacy are quite diverse, and often based on some faulty assumptions about the contemporary framing of the privacy issue, and about the implementation of privacy protection policy.  Some critiques are pitched at a conceptual level; others focus on practice.  There is a good deal of overstatement, and a certain extent to which “straw men” are constructed for later demolition.

The aim of this paper is to disentangle the various critiques and to subject each to a critical analysis. Despite the fact that nobody can supply a precise and commonly accepted definition, privacy maintains an enormous popular appeal, in the English-speaking world and beyond.  It attaches to a huge array of policy questions, to a sprawling policy community, to a transnational advocacy network, to an academic literature and to a host of polemical and journalistic commentary.  Furthermore, its meaning has gradually changed to embrace a more collective understanding of the broader set of social problems. The broader critique from surveillance scholars tends to be insensitive to these conceptual developments, as well as to what members of the policy community actually do.

Paul Ohm, The Probability of Privacy

Comment by: Michael Froomkin

PLSC 2009

Workshop draft abstract:

Data collectors and aggregators defend themselves against claims that they are invading privacy by invoking a verb of relatively recent vintage—“to anonymize.” By anonymizing the data—by removing or replacing all of the names or other personal identifiers—they argue that they are negating the risk of any privacy harm. Thus, Google anonymizes data in its search query database after nine months; proxy email and web browsing services promise Internet anonymity; and network researchers trade sensitive data only after anonymizing them first.

Recently, two splashy news stories revealed that anonymization is not all it is cracked up to be. First, America Online released twenty million search queries from 650,000 users. Next, Netflix released a database containing 100 million movie ratings from nearly 500,000 users. In both cases, the personal identifiers in the databases were anonymized, and in both cases, researchers were able to “deanonymize” or “reidentify” at least some of the people in the database.

Even before these results, computer scientists had begun to theorize deanonymization. According to this research, none of which has yet been rigorously imported into legal scholarship, the utility and anonymity of data are linked. The only way to anonymize a database perfectly is to strip all of the information from it; any database which is useful is also imperfectly anonymous; the more useful a database, the easier it is to reidentify the personal information in the database.
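
The link between utility and anonymity described above can be illustrated with a toy sketch. This is not drawn from the Article or from the research it surveys: the records, the column choices, and the `unique_fraction` helper are all hypothetical, and "unique on the retained columns" is used here only as a rough stand-in for reidentification risk. The sketch measures what share of records become one of a kind as more columns are kept in a released table.

```python
from collections import Counter

# Hypothetical toy records with names already removed:
# (zip_code, birth_year, sex, favorite_genre)
RECORDS = [
    ("33101", 1970, "F", "drama"),
    ("33101", 1970, "F", "comedy"),
    ("33101", 1970, "M", "drama"),
    ("33101", 1982, "M", "drama"),
    ("33146", 1982, "M", "comedy"),
    ("33146", 1982, "F", "comedy"),
]


def unique_fraction(records, kept_columns):
    """Return the fraction of records that no other record matches
    on the columns that are kept in the released table."""
    projected = [tuple(r[i] for i in kept_columns) for r in records]
    counts = Counter(projected)
    singletons = sum(1 for p in projected if counts[p] == 1)
    return singletons / len(records)


if __name__ == "__main__":
    # Release progressively more columns and watch uniqueness grow.
    for kept in ([], [0], [0, 1], [0, 1, 2], [0, 1, 2, 3]):
        print(kept, round(unique_fraction(RECORDS, kept), 2))
```

In this toy table, releasing no columns leaves every record indistinguishable from the rest (and the table useless), while releasing all four makes every record unique. Real reidentification work relies on outside data to attach names to such unique records, which this sketch does not model.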

This Article takes a comprehensive look at both claims of anonymization and theories of reidentification, weaving them into law and policy. It compares online and data privacy with anonymization standards and practices in health policy, where these issues have been grappled with for decades.

The Article concludes that claims of anonymization should be viewed with great suspicion. Data is never “anonymized,” and it is better to speak of “the probability of privacy” of different practices. Finally, the Article surveys research into how to reduce the risk of reidentification, and it incorporates this research into a set of prescriptions for various data privacy laws.