Close to a year ago, two scientists reported a serious violation of privacy policies among prominent social networks. They studied Facebook, MySpace, Twitter, LiveJournal, LinkedIn and seven other social networks. Contrary to the privacy policies that each had adopted, these sites were leaking personal information to third-party servers that specialize in aggregating internet data for commercial purposes.* The researchers' report is unusually clear and succinct. Its conclusions are damning. Yet, at the time, news organizations paid no attention.
The report is On the Leakage of Personally Identifiable Information Via Online Social Networks (pdf), Aug 2009, by Balachander Krishnamurthy and Craig E. Wills. The credentials of the authors are good. One works at AT&T Labs in the Research Dept. and the other is from Worcester Polytechnic Institute in Mass.
Looking back now, I can find only one instance in which the report was picked up and disseminated: Social network privacy study finds identity link to cookies (Aug 2009). Nine months have gone by and, I guess it's fortunate for us, a major news outlet has finally taken notice: Facebook, MySpace Confront Privacy Loophole by Emily Steel and Jessica E. Vascellaro, Wall Street Journal, May 21, 2010. It's fortunate, too, that other news sources have begun to play catch-up. My favorite of these: The billionaire Facebook founder making a fortune from your secrets (though you probably don't know he's doing it).
The original report is worth reading. Do spend ten minutes of your time on it. It gives the social networking sites the benefit of the doubt in guessing that poor coding practice, rather than devil-may-care greed, was the reason personal data became exposed to third parties. It's really not difficult to mask that data, so it's disheartening, but I guess not surprising, that the researchers gave their findings to the social networking sites last August and none responded at the time. Only now, contacted by the authors of the WSJ article, have most of them claimed to have fixed the problems or to be in the process of fixing them.
It also comes as no surprise that, as you've no doubt noticed, the press is now reporting that Facebook is expected soon to announce an abrupt about-face in its privacy policies.
*Notice that the report says:
Although we focus on OSNs [online social networks, like Facebook] in this study, it should be obvious that the manner of leakage could affect users who have accounts and PII [personally identifiable information] on other sites. Sites related to ecommerce, travel, and news services, maintain information about registered users. Some of these sites do use transient session-specific identifiers, which are less prone to identifying an individual compared with persistent identifiers of OSNs. Yet, the sites may embed pieces of PII such as email addresses and location within cookies or Request-URIs. We have carried out a preliminary examination of several popular commercial sites for which we have readily available access. These include books, newspaper, travel, micropayment, and e-commerce sites. We identified a news site that leaks user email addresses to at least three separate third-party aggregators. A travel site embeds a user’s first name and default airport in its cookies, which is therefore leaked to any third-party server hiding within the domain name of the travel site. By and large we did not observe leakage of user’s login identifier via the Referer header, the Cookie, or the Request-URI. It should be noted that even if the user’s identifier had leaked, the associated profile information about the user will not be available to the aggregator without the corresponding password. Our preliminary examination should not be taken as the final answer on this issue. A thorough understanding of the scope of the problem along with steps for preventing leakage in general remains a primary concern. Any protection technique must effectively ensure de-identification between a user’s identity prior to any external communication on any site that requires logging in—OSN or otherwise.
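For readers who want a feel for what this kind of leakage looks like on the wire, here is a minimal sketch in Python. It is not the researchers' method. It simply checks the three places the excerpt names (the Request-URI, the Referer header, and the Cookie header) for values you already know to be personal, plus anything shaped like an email address. The function name `find_leaks`, the sample hosts, and the sample values are all hypothetical and chosen only for illustration.

```python
import re
from urllib.parse import unquote

# Rough pattern for email-like strings; an assumption for illustration,
# not a full address validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_leaks(request_uri, headers, known_pii):
    """Return sorted (channel, value) pairs for PII visible in a request.

    request_uri -- path and query string of a request sent to a third party
    headers     -- dict of request headers (only Referer and Cookie are read)
    known_pii   -- strings the user considers identifying (ID, name, ...)
    """
    channels = {
        "Request-URI": request_uri,
        "Referer": headers.get("Referer", ""),
        "Cookie": headers.get("Cookie", ""),
    }
    leaks = set()
    for channel, text in channels.items():
        decoded = unquote(text)  # undo %40-style escaping before matching
        lowered = decoded.lower()
        for value in known_pii:
            if value.lower() in lowered:
                leaks.add((channel, value))
        for address in EMAIL_RE.findall(decoded):
            leaks.add((channel, address))
    return sorted(leaks)


if __name__ == "__main__":
    # Hypothetical request to a third-party tracker, for illustration only.
    example = find_leaks(
        "/pixel.gif?site=news&email=j.doe%40example.com",
        {
            "Referer": "http://osn.example/profile.php?id=1234567",
            "Cookie": "fname=Jane; airport=BOS",
        },
        ["1234567", "Jane"],
    )
    for channel, value in example:
        print(channel, "exposes", value)
```

A plain substring match like this will produce false positives and will miss encoded or hashed identifiers, so treat it as a way to see the problem rather than a way to measure it.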