I have written here a couple of times about concerns with Google’s data retention practices, given how readily retained data can be put to use in government surveillance.
Now, there are some interesting details, highlighted by the text I quoted above. “Anonymous” is correctly regarded as an absolute condition. Like pregnancy, anonymity is either there or it’s not. Modifying the word with a comparative like “more” is a curious use of language.
If Google is going to anonymize data rather than destroy it, its challenge is to make sure that a person’s identity and behavior cannot be reconstructed from what remains. As AOL’s fiasco with releasing “anonymized” search data showed, clipping off the obvious identifiers won’t do it. As data mining capabilities advance, anonymizing techniques will have to keep ahead of them.
There are interesting things that can be done to synthesize data, making it statistically relevant while factually incoherent. Hopefully, Google will sic some of its finest famously-smarty-pants engineers on the task of making their anonymous data really, really anonymous.
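To give a rough sense of what “statistically relevant while factually incoherent” could mean, here is a minimal sketch in Python. It is not Google’s method, and nothing here is drawn from any real system: the function name `shuffle_columns`, the field names, and the sample log entries are all made up for illustration. The idea is to shuffle each column of a data set independently, so the per-column totals stay exactly the same while no row describes a real person’s actual behavior anymore.

```python
import random
from collections import Counter


def shuffle_columns(records, seed=None):
    """Return synthetic records built by permuting each column independently.

    Per-column value counts are preserved exactly, but the link between
    values that originally appeared in the same row is broken.
    """
    if not records:
        return []
    rng = random.Random(seed)
    keys = list(records[0].keys())
    columns = {}
    for key in keys:
        values = [record[key] for record in records]
        rng.shuffle(values)  # in-place permutation of this column only
        columns[key] = values
    return [{key: columns[key][i] for key in keys} for i in range(len(records))]


def column_counts(records, key):
    """Count how often each value appears in one column."""
    return Counter(record[key] for record in records)


# Hypothetical, made-up log entries; not real search data.
logs = [
    {"region": "Ohio", "hour": 9, "query": "garden hoses"},
    {"region": "Texas", "hour": 14, "query": "used trucks"},
    {"region": "Ohio", "hour": 22, "query": "garden hoses"},
    {"region": "Maine", "hour": 9, "query": "lobster rolls"},
]

synthetic = shuffle_columns(logs, seed=1)

# The aggregate picture is unchanged, column by column.
for key in logs[0]:
    assert column_counts(synthetic, key) == column_counts(logs, key)
```

A sketch like this also shows why the problem is hard. Shuffling preserves only single-column statistics, it throws away the relationships between columns that make the data useful, and a rare value (one distinctive search query, say) can still point to a person all by itself. Really, really anonymous takes a good deal more than this.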
(Cross-posted from TechLiberationFront)