Andrea M. Matwyshyn, Talking Data
Comment by: Andrew Selbst
Workshop draft abstract:
In the wake of Sorrell v. IMS Health, open questions remain regarding the limitations that the First Amendment imposes on privacy regulation. A conceptual classification problem, also visible in other bodies of law, has crept into the intersection of privacy and the First Amendment: confusion over when (or whether) a data aggregator’s code (and its attached information) is a type of expressive, socially embedded act of communication or a type of free-standing, communicative yet regulable “product.” This article argues that although the statute at issue in Sorrell failed First Amendment scrutiny, carefully crafted privacy regulation that restricts the onward transfer of databases of consumer information – even transfers of anonymized data – can pass First Amendment scrutiny. By blending doctrinal First Amendment principles with the fundamental tenets of human subjects research protection imposed by Congress and the Department of Health and Human Services, this article explains the doctrinal limits that the First Amendment places on future consumer privacy laws and offers an example of a possible First Amendment-sensitive approach to protecting consumer privacy in commercial databases.
William McGeveran, Privacy and Playlists
Comment by: Felix Wu
Workshop draft abstract:
Social media is not a passing fad. In response to enthusiastic user demand, companies from Amazon to the Washington Post have built “sharing” functionality into their operations, especially online. A boomlet in platforms for socially shared entertainment further underscores the trend – increasingly, we are reading, listening to music, and watching movies among our friends. For example, the popular new music streaming service Spotify, now highly integrated with Facebook, encourages users to notify their online friends of their listening choices and to post playlists for others to use.
This sudden dramatic shift challenges traditional privacy law. Many existing rules assume a data collector who redistributes personally identifiable information to third-party recipients unknown to the data subject, for use in profiling. Spotify (or Facebook or the Washington Post Social Reader) sends information to a user’s friends, not strangers, and does so as a means of creating word of mouth, not of profiling. As I have argued previously, genuine recommendations from one’s friends are immensely valuable, but illegitimate ones can both invade privacy and undermine overall information quality.
This paper considers the appropriate model for regulating privacy in socially shared reading, listening, and viewing. As a case study, it examines recent legislation passed by the House of Representatives and pending in the Senate to amend the Video Privacy Protection Act. Proponents of the legislation argue that it merely modernizes the statute for the social media age. Opponents believe it vitiates one of the only federal laws to directly protect intellectual privacy with an opt-in consent rule.
I conclude that both camps are wrong. The VPPA could and should be updated, but the current bill does not go about it in the right way. More broadly, many of the issues raised in this debate over video apply equally to books, music, web browsing, video gaming, and other pursuits. The paper will make recommendations for appropriate means to address privacy in the social media age.
Felix Wu, Privacy and Utility in Data Sets
Comment by: Jane Yakowitz
Published version available here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2031808
Workshop draft abstract:
Privacy and utility are inherently in tension with one another. Information is useful exactly when it allows someone to have knowledge that he would not otherwise have, and to make inferences that he would not otherwise be able to make. The goal of information privacy is precisely to prevent others from acquiring particular information or from being able to make particular inferences. Moreover, as others have demonstrated recently, we cannot divide the world into “personal” information to be withheld, and “non-personal” information to be disclosed. There is potential social value to be gained from disclosing even “personal” information. And the revelation of even “non-personal” information might provide the final link in a chain of inferences that leads to information we would like to withhold.
Thus, the disclosure of data involves an inherent tradeoff between privacy and utility. More disclosure is both more useful and less private; less disclosure is both less useful and more private. This does not mean, however, that the disclosure of any one piece of information is no different from the disclosure of any other. Some disclosures may be relatively more privacy-invading and less socially useful, or vice versa. The question is how to identify the privacy and utility characteristics of data, so as to maximize the utility of the data disclosed and minimize the privacy loss.
Thus far, at least two different academic communities have studied the question of analyzing privacy and utility. In the legal community, this question has come to the fore with recent work on the re-identification of individuals in supposedly anonymized data sets, as well as with questions raised by the behavioral advertising industry’s collection and analysis of consumer data. In the computer science community, this question has been studied in the context of formal models of privacy, particularly that of “differential privacy.” This paper seeks to bridge the two communities, to help policy makers understand the implications of the results obtained by formal modeling, and to suggest to computer scientists additional formal approaches that might capture more of the features of the policy questions currently being debated. We can and should bring to bear both the qualitative analysis of the law and the quantitative analysis of computer science to this increasingly salient question of privacy-utility tradeoffs.
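The "differential privacy" model mentioned above makes the privacy–utility tradeoff quantitative: a single parameter (conventionally epsilon) dials between stronger privacy and more accurate answers. As a minimal sketch of the idea (not drawn from the paper itself), the standard Laplace mechanism releases a count after adding noise whose scale is inversely proportional to epsilon; the function and variable names below are illustrative assumptions.

```python
import math
import random

def dp_count(true_count, epsilon, rng):
    """Release a count under epsilon-differential privacy via the
    Laplace mechanism. A counting query has sensitivity 1 (adding or
    removing one person changes the count by at most 1), so Laplace
    noise with scale 1/epsilon suffices."""
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    # Inverse-CDF sample from Laplace(0, 1/epsilon)
    noise = -(1.0 / epsilon) * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
    return true_count + noise

# Smaller epsilon -> stronger privacy -> noisier (less useful) answers.
rng = random.Random(0)
strict = [dp_count(100, 0.1, rng) for _ in range(1000)]   # high privacy
loose = [dp_count(100, 10.0, rng) for _ in range(1000)]   # low privacy

def spread(xs):
    return sum(abs(x - 100) for x in xs) / len(xs)

assert spread(strict) > spread(loose)  # tighter privacy costs accuracy
```

The epsilon parameter is exactly the kind of formal knob the paper suggests policy makers should understand: it puts a number on how much one individual's presence in a data set can change what an observer learns.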