Saturday, February 11, 2006

when Google is not your friend

I've been watching blog-traffic on Google's threat to our privacy. Like Nick, I want to enjoy the vast potential for connecting with others that the internet provides.
{Today's Google logo celebrates the opening of the Winter Olympics}

As an example, I signed up for LibraryThing at the beginning and have participated in some of the many opportunities for opening new dialogs that it provides. Other LT users have contacted me about my LT persona and my catalog of books. I've contributed to the bountiful discussions of LT's Google Group. I've observed the interconnectedness of my library with others'. I've written reviews, comments, and notes on my catalog entries and read those of others. These activities don't just bring me a bit of diversion; this kind of making contact enriches my life.

There's much more of this kind of thing that I could do but don't. I stand aloof partly because there's only so much time and partly because, as the Miranda declaration says, what I say can and will be used against me. I used to think: what could possibly happen? I haven't done anything wrong, and anyway my real name and personal details don't appear anywhere in my exchanges via the net.

I still largely think that's true: I'm innocent and thus not of any interest to the law enforcement community and I've protected my identity to protect me against people who would do me harm. So what's to prevent me from plunging in and fully exploiting all the interaction that's available to me?

However .... I'm also painfully aware that innocent behavior can be made to appear to be criminal (or at least compromised; or perhaps associated with the possible criminal behavior of others). And also that my personal details can be pieced together without my control. To take an obvious case, the state makes available information about my ownership of real property. In fact there are lots of governmental and corporate databases that have lots of details about me. They're mostly (but not all) restricted from making the information available (e.g., my bank can't share information about me; the exceptions are such things as the real estate example).

But these restrictions are not air-tight (the Patriot Act, actions of the NFC, and other reports of government snooping show that this is so). And many databases are not adequately protected against intruders (as the theft of credit card info shows -- see addendum at bottom for a couple of other examples).

And there are things we don't know about the future. Ten years ago many, many of us treated the Web as a safe haven -- and then it was. How many of us thought then about what might develop later, as has now occurred? And if we recognized the threat, did we fully understand that so much of our Web experience would persist, data storage costs becoming so cheap that the trivial conversations on Usenet groups in the '90s would be preserved and fully searchable in '06?

At least in China Web users know where they stand: step out of line and you'll end up in jail. I can't imagine totalitarian control to such an extent in the US, but there are lots of things that will come about that I cannot now imagine. Something like that could occur here, may be occurring now.

All this is preamble to an article on one piece of the privacy-risk puzzle: What vulnerability results from our use of Google and the other search engines? What use can be made of the databases containing search histories that Google and the other search engines maintain?

The article is by Danny Sullivan on SearchEngineWatch, and it's a good one. Here are some extracts:
Which Search Engines Log IP Addresses & Cookies -- And Why Care?


Last week I wrote how John Battelle followed up with Google to find out if they can link search data to IP addresses or cookies. Google said yes. I wrote that wasn't surprising. I covered back in 2003 how this is standard information any web server is likely to log, including servers at the major search engines. I also wrote last week that if Google is doing this, it was fair to assume all the major search engines are.

Rather than assume, did an actual survey of this. Verbatim: Search firms surveyed on privacy has the rundown of AOL, Google, MSN and Yahoo (Ask Jeeves unfortunately was not included). Yes, they all log this information. AOL says they don't in one instance, but I'll debunk that later.

First, let's go back to the bigger question of why suddenly people are asking about IP addresses and cookies.

Every time you go to a web site, you leave behind an IP address. This is like your internet telephone number, and it's possible (especially with the help of your ISP) to trace activity back to you. That 2003 article of mine, Search Privacy At Google & Other Search Engines, explains this in more detail.

Often, a web site will also assign you a cookie. This is simply a way for your browser to communicate to the web site that you've been there before (not you personally -- such as your name and address -- but you as in a particular web browser software like Internet Explorer or Firefox).

Cookies are better than IP addresses for tracking purposes, because your IP address will often change from internet surfing session to session. Your cookie stays the same, as long as you use the same browser on the same computer and don't delete it.
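[To make the difference concrete, here is a minimal Python sketch. The log entries, the field layout, and the cookie values are all invented for illustration; this is not any search engine's actual log format. It groups the same made-up log two ways: by IP address, which changes between sessions and can be reused by someone else, and by cookie, which stays with one browser.]

```python
from collections import defaultdict

# Hypothetical log entries: (ip_address, cookie_id, query).
# Every value here is made up for illustration.
LOG = [
    ("203.0.113.5", "cookieA", "book reviews"),
    ("203.0.113.77", "cookieA", "email valentine's day cards"),  # same browser, new IP
    ("198.51.100.9", "cookieB", "winter olympics schedule"),
    ("203.0.113.5", "cookieB", "library hours"),  # different browser, reused IP
]

def group_queries(log, key_index):
    """Group queries by the chosen field (0 = IP address, 1 = cookie)."""
    groups = defaultdict(list)
    for entry in log:
        groups[entry[key_index]].append(entry[2])
    return dict(groups)

by_ip = group_queries(LOG, 0)
by_cookie = group_queries(LOG, 1)

print(by_ip)      # one browser's searches are split up; another's are mixed in
print(by_cookie)  # each browser's searches stay together
```

[In this toy log, grouping by IP address splits the first browser's two searches across two addresses and mixes in a search from the second browser, while grouping by cookie keeps each browser's searches together.]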

John's reader wanted to know if search queries at Google could be linked to an IP address or a cookie. Huh? What? Why care?

OK, let's say the government of BigBrother wants to know how many people are looking for something illegal, such as Widagra. Let's say Widagra is a drug legal in some countries but which BigBrother deems evil. If you are even remotely interested in this drug, BigBrother considers you a bad, bad person.

BigBrother wants to know all the people who might be looking for this drug via search engines, assuming that will lead them to the evildoers. So it tells the search engines to hand over a list of all IP addresses that are shown to have done a search for Widagra. [It mines these IP addresses and then goes back to the ISP for lists of cookies which record a unique browser identifier. Cookies and IP addresses together lead them to the computer you used.]

Why's that useful? Back to BigBrother, say they scan the list of those searching for "widagra" and decide they'd like to profile individuals on that list further. They could ask to see all the searches done from a particular IP address. [And they] see that the cookied browser of "e43UBsS4fNZzmDgj" looked for "widagra," so they order up a list of all terms that browser did. They get back:

  • widagra
  • movement to overthrow BigBrother web site
  • widagra freedom campaign
  • how can we stop evil widagra users
  • i love president bigbrother
  • email valentine's day cards
    ...and so on

Some of those searches might help BigBrother decide this particular person is an evildoer. But then again, maybe not. Maybe they were researching the evils of widagra. Maybe the browser software was in a library, where different people used it.
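[The two-step lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log structure and the second cookie value are hypothetical, and the function names are mine, not anything a search engine is known to use. Only the cookie "e43UBsS4fNZzmDgj" and its queries come from the article.]

```python
# Hypothetical search log: (cookie_id, query) pairs.
SEARCH_LOG = [
    ("e43UBsS4fNZzmDgj", "widagra"),
    ("e43UBsS4fNZzmDgj", "widagra freedom campaign"),
    ("e43UBsS4fNZzmDgj", "email valentine's day cards"),
    ("hypotheticalOther1", "winter olympics results"),  # invented second browser
]

def cookies_that_searched(log, term):
    """Step 1: find every cookie with at least one query containing the term."""
    return sorted({cookie for cookie, query in log if term in query.lower()})

def history_for(log, cookie_id):
    """Step 2: pull every query recorded for one cookie, in log order."""
    return [query for cookie, query in log if cookie == cookie_id]

suspects = cookies_that_searched(SEARCH_LOG, "widagra")
profile = {cookie: history_for(SEARCH_LOG, cookie) for cookie in suspects}

for cookie, queries in profile.items():
    print(cookie, queries)
```

[Note that the result is a history for a browser, not for a person, which is exactly the caveat the article raises about library computers.]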

[Danny Sullivan's article doesn't mention it, but where authorities want to know what happened on public computers in a library, they descend on the library and carry off the computers. Under the Patriot Act they can do this in secrecy; no one in the library can talk about it. With a court-ordered subpoena they can, in the regular course of law enforcement work, do the same. This UPI account describes a recent incident in a public library where this occurred, extracts below.]

How long does each of the search engines keep [data on searches]?

  • AOL: Personal search histories expire after 30 days, and backups are not kept. How long log data (IP, cookied info) is maintained is not covered.
  • Google: No particular period for anything is given, which I read as nothing being destroyed.
  • MSN: Data is deleted, but no specifics are provided
  • Yahoo: No particular period for anything is given, which I read as nothing being destroyed.

MSN's deleting some, but I suspect log data is backed up and kept somewhere with no destruction policy in place. Same too, for AOL.

[Have] any of the companies handed over search data? Responses:

  • AOL: No comment
  • Google: No comment (Gmail requests have been received)
  • MSN: It has never had any criminal or civil requests for search history data
  • Yahoo: No comment
See further:

FAQ: When Google is not your friend from


Here are extracts from the UPI account of computers seized from a public library:
UPI Intelligence Watch By JOHN C.K. DALY UPI International Correspondent WASHINGTON, Feb. 2 (UPI)


Librarians in Newton, Mass., fended off FBI demands for 30 of their computers, because the agents did not produce a search warrant. The FBI attempt to acquire the computers followed an unspecified Jan. 18 e-mail threat to Brandeis University, which led to the evacuation of more than a dozen buildings on the campus.

The FBI subsequently determined that the threat came from a Newton public library computer and attempted to seize 30 library computers without a warrant. Newton library director Kathy Glick-Weil told the FBI agents they could not take the computers unless they had a warrant first, and Newton mayor David Cohen supported Ms. Glick-Weil.

Following the rebuff, the FBI obtained a judicial warrant [and] with their warrant, ... took possession of the three computers.



The blog, TVCAlert, gives a couple of examples of how things can go wrong. In the first, our medical records could be made available to people who wish to use them against our interests (the obvious case: being denied a job because of medical history). In the second, government databases can be compromised through bribery of officials.
Gambling with Your Medical History

(6 Feb) A Consumer Reports investigation of electronic medical records raises concerns about keeping the information out of public view.

"Advocates of Electronic Health Records say the system will have the tightest possible security. But recent large-scale thefts of credit card and banking information have shown that all databases, even those with state-of-the-art security protections, can be compromised. Electronic medical records systems now in operation have already sprung some serious security leaks.... The full report on electronic medical records appears in the March 2006 issue of Consumer Reports which goes on sale February 7, 2006 wherever magazines are sold."

SEE: The new threat to your medical privacy Consumer Reports, March 2006 (Available by paid subscription. "A national system of electronic medical records could easily save your life. And it could also jeopardize the security of your personal health information.")

PI Indicted for Obtaining Private Records

(6 Feb) Private investigator Anthony Pellicano, whose clients include celebrities and high-profile lawyers, "pleaded not guilty Monday to racketeering charges alleging he paid police officers and others to get into confidential records and provide him with information."
