The SJ Mercury News (and lots of other people) are reporting that the US Justice Department is trying to get Google to disclose massive amounts of search index data. What’s unique and troubling about this is that Justice aren’t claiming that Google have done anything wrong or that the Google information directly relates to any crime: they just want to use Google’s index as a way to save themselves the hassle of indexing the web for themselves.
This is sheer laziness (and nigh on lunacy). If Justice wanted to crawl the whole web and index it, most people would have a cow. We would scream about unwarranted government oversight into free speech and expression. We would complain about the wasted government resources from an administration that appears to be a little bit obsessed with sex. Likewise, we should complain just as loudly when they ask to use Google’s information for free.
There are lots of appropriate ways for the US Federal government to use information from private companies to carry out legitimate government missions. Heck, Renesys is involved in several of these. We believe that the information services that we provide about the Internet are useful as part of various government efforts to understand and secure infrastructure. We work with several US Federal agencies and groups providing services for Internet infrastructure protection, and we’re happy to do so. Those people have hard jobs and our information and tools hopefully make those jobs a little bit easier. That’s something we’re proud of. But we would resist being forced to give up that kind of information for free, even though the information that we offer is global in nature and does not offer much in the way of individually identifiable information.
Now, look quickly at the purpose the Justice Department states for the data it wants from Google: they want one or more big search indexes to try to determine the probability that a child will receive “inappropriate content” as the result of a search. Given that they want to regulate “inappropriate content” with protection of children as the stated goal, this seems like a reasonable research question. Guess what? Amazon’s Alexa search engine has opened up its entire index for commercial use (more about possible other implications of this in some future posting, I’m sure). Moreover, Alexa tags every result page with “porn”/“not porn”. According to Alexa, indexing the whole web costs a few thousand US$. So why doesn’t Justice just pay for it rather than try to get it for free from Google?
It’s even scarier to note that the Justice Department claims that “other, unspecified search engines have agreed to release the information, but not Google.” And you thought Google was the one overreaching and causing concern? What about those “unspecified search engines” who turned over this data?