
Edited by Christian Lowe | Contact

AOL Leak: Toward Searchcrime?


A research site at America Online posted three months of search records for 500,000 people (over 20 million searches) on the Internet recently. The data was discovered over the weekend and news of it has quickly spread across the blogosphere and into the mainstream media. AOL rapidly removed the data from its site, but the cat's already out of the bag - the files were copied, and have been replicated all over the Internet.

Anyone can download the 439 MB file, just like I did last night. People are already poring through the data, finding some very disturbing search patterns among a number of AOL's users. In theory, there is no personally identifiable information in the database, but when people run searches that reveal things about themselves, it often becomes easy to figure out who they are. In many ways, this is a worse privacy loss than the theft of the laptop from the Veterans Administration employee earlier this spring would have been, had the data on that laptop been compromised.
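To make the re-identification risk concrete, here is a minimal sketch of how little work it takes to rebuild one person's search history once every query carries the same anonymized ID. The records, the ID numbers, and the `group_by_user` helper are all invented for illustration; they are not taken from the AOL file and do not describe its exact layout.

```python
from collections import defaultdict

# Invented sample records: (anonymized user ID, query text).
# None of these rows come from the actual AOL data.
RECORDS = [
    (1001, "landscapers in springfield"),
    (1001, "homes sold in maple ridge subdivision"),
    (1001, "jane q. public"),
    (2002, "weather"),
    (2002, "cheap flights"),
]


def group_by_user(records):
    """Collect every query under the anonymized ID that issued it."""
    histories = defaultdict(list)
    for user_id, query in records:
        histories[user_id].append(query)
    return dict(histories)


histories = group_by_user(RECORDS)
for user_id, queries in histories.items():
    print(user_id, queries)
```

No single query in the invented history for user 1001 names anyone with certainty, but a town, a subdivision, and a personal name read together narrow the field quickly. That is the sense in which an "anonymous" ID stops being anonymous.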

This inadvertent disclosure of data makes a public debate necessary on the retention and use of search data by private companies, and on the propriety of its use by government agencies. In January we learned that Google refused a DOJ subpoena to supply the government with exactly this kind of data - a request with which Yahoo!, AOL and MSN complied. These companies are compiling petabytes of search data on their servers, effectively archiving the collective subconscious of hundreds of millions of people.

This information clearly has value from a marketing and business intelligence perspective, which is why the search companies are retaining it. But this data then becomes an overly tempting target for homeland security and counterterrorism officials. Should they be able to access it? Under what conditions? Who should be allowed to see it? And what is the actual value of the search information? We need to answer these questions, and in doing so develop a clear framework to guide how and when such information should be available to government officials, rather than continuing along in the legal and policy vacuum that the United States is in today.

We need a framework that allows narrow access to this search data in cases where a person or group is under investigation for activities related to terrorism, counterintelligence, and/or WMD proliferation. But I would forbid access to this search data for the purpose of conducting wide-ranging analysis of search data - looking for needles in the haystack - because the benefits would not be nearly commensurate with the massive privacy hit. And the search companies need to be more responsible in their utilization of this data, and develop policies and systems for destroying data after a finite period of time (1-2 years), and give users the ability to clear and remove personally-identifiable search histories from company servers.

This assessment is based in part on some cursory analysis of the AOL data last night. In cases where I found "suspicious" searches, I could never be certain about the actual intent of the search. This inability to divine intent from searches will naturally lead to high percentages of false positives. For example, anyone who works in the homeland security field, as I do, is likely to run searches related to terrorist tactics, infrastructure protection, etc. These searches are all false positives, and likely will drown out any "real" terrorist search activity. Efforts to investigate these searches would therefore be expensive, and less productive than traditional means of intelligence and investigation.
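The false-positive problem is easy to see with some back-of-the-envelope arithmetic. The sketch below takes the 500,000 users in the AOL release as the population; every other number (five genuine threats, a filter that catches all of them, a 1% false-alarm rate) is a made-up assumption chosen only to show the shape of the problem, not an estimate of anything real.

```python
def flagged_precision(population, true_threats, hit_rate, false_alarm_rate):
    """Return (users flagged, share of flagged users who are real threats)."""
    true_positives = true_threats * hit_rate
    false_positives = (population - true_threats) * false_alarm_rate
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return flagged, precision


# Hypothetical inputs: only the population figure comes from the post.
flagged, precision = flagged_precision(
    population=500_000,     # users in the AOL release
    true_threats=5,         # assumed, for illustration only
    hit_rate=1.0,           # assume the filter never misses a real threat
    false_alarm_rate=0.01,  # assume 1% of innocent users get flagged
)
print(f"flagged users: {flagged:.0f}")
print(f"share of flagged users who are real threats: {precision:.4%}")
```

Under these assumptions, roughly 5,000 people are flagged and about one in a thousand of them is a real threat, even with a filter that never misses. Each of those leads would have to be run down by an investigator, which is where the expense comes from.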

If the federal government is allowed unfettered access to this data, we run the risk of creating a new Orwellism - Searchcrime - that is an inefficient response to the war on terror.

-- Christian Beckner, cross-posted from Homeland Security Watch.


You state that: "This inability to divine intent from searches will naturally lead to high percentages of false positives. For example, anyone who works in the homeland security field, as I do, is likely to run searches related to terrorist tactics, infrastructure protection, etc."

I agree but would push the point further: in the post-9/11 world, with our troops in multiple fields of combat and terrorist acts a daily reality, ANYONE (not just Homeland Security employees) with even a remote interest in current events would have a legitimate reason to conduct a search on ANY terrorist-related topic. Iran may have the bomb? Well, I wonder how easy it is to put a nuke together? Let me check Google. Al Qaeda nut jobs are streaming video on their own websites? I want to see that, I'll search AOL. Hundreds of millions of internet users who are no threat to anyone can be expected to conduct many such searches. From an investigative point of view - a certain dead end.

Posted by: Darius Teter at August 9, 2006 03:38 AM
