Kashmir Hill at Forbes reports on the enormous quantities of data Facebook stores on its users, as revealed by records released pursuant to European “right to access” laws. One user unearthed some 880 pages of data, covering years of her account activity, including every machine she’d ever signed in from (and a list of other users who’d signed in from the same device), everyone who’d ever poked her, everyone she’d ever “defriended” and every friend request refused, and a history of messages and chats (including some that users said they’d deleted). Some people might find it troubling how much information Facebook stores. I find it troubling how easy European laws apparently make it to extract all that data, especially when you consider that these “right to access” rules are supposed to safeguard people’s private information. As Hill notes:
One thing that I found a bit concerning about the process is that it only requires a photo of your government i.d., your name, and birthdate to confirm your identity. Given how easy it is to get one’s hands on someone else’s ID (say if you’re dating someone and s/he leaves a wallet about your house), I could imagine some scenarios in which this process could be abused.
Full disclosure here: Hill’s my domestic partner—so, fortunately, she doesn’t have to worry about that sort of thing. But in principle it sounds like it might be even easier than that. Their data request form suggests that you must submit an ID on which your “full name, date of birth, and photo” are legible, while other extraneous identifying information can be blacked out. It’s not clear from either their site or the reporting I’ve read exactly how their verification process works, but it sounds as though anyone with some rudimentary Photoshop skills and a user’s photo and birthdate (Where would you ever get those? Oh… right.) might be able to put together a passable bogus request. I had assumed the process at least required some kind of confirmation response from the e-mail associated with a user’s account (along the lines of a password reset) but at least one of Hill’s sources says he doesn’t recall going through any such step. And in any event, the submission form allows you to provide an alternate e-mail address “where you can be reached” in case you no longer have access to the login e-mail on record with Facebook. So someone who knew that their target was no longer using (say) the college address they’d signed up with, or was going to be away from e-mail for a while, or had an aggressive spam filter that’s likely to block such messages, would still be able to game the process.
Now, Facebook’s a big company with plenty of resources, so it wouldn’t be surprising if their vetting process is actually more secure (or could be made more secure) than these descriptions make it sound. But there are still the myriad other Web sites that store personal user information, and any user’s data is only as secure as the weakest link in the chain. For users who recycle a small number of passwords on many sites, there’s an added risk: One of the categories of data Facebook provides is a hash of the user’s password. If a site is observing good security practice and using a salted hash, that wouldn’t necessarily be of enormous use to an attacker. But if they’re not (and, sadly, many sites don’t observe best practices here), an attacker could conceivably infer a weak password from the hash using a dictionary attack with a few hours of crunching.
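To make the hash worry concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the choice of SHA-256 and PBKDF2, the tiny word list, and the helper names are mine, and nothing here reflects how Facebook or any other site actually stores passwords. It shows a weak password falling to a dictionary attack on an unsalted hash, and a precomputed table of common-password hashes coming up empty against a salted one.

```python
import hashlib
import hmac
import os

# Hypothetical helpers; the algorithms are illustrative assumptions only.

def unsalted_hash(password: str) -> str:
    """Plain, fast, unsalted hash: the bad practice described above."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def salted_hash(password: str, salt: bytes, iterations: int = 100_000) -> str:
    """Salted, deliberately slow hash using PBKDF2 from the standard library."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return derived.hex()

def dictionary_attack(target_hash: str, wordlist):
    """Hash each candidate word and compare it with the leaked hash."""
    for candidate in wordlist:
        if hmac.compare_digest(unsalted_hash(candidate), target_hash):
            return candidate
    return None

# A toy word list; real attacks use lists with millions of entries.
WORDLIST = ["123456", "password", "letmein", "qwerty", "dragon"]

# Case 1: an unsalted hash of a weak password leaks.
leaked_unsalted = unsalted_hash("letmein")
recovered = dictionary_attack(leaked_unsalted, WORDLIST)

# Case 2: a salted hash of the same password leaks. A table of hashes
# computed in advance (without the salt) no longer matches anything.
salt = os.urandom(16)
leaked_salted = salted_hash("letmein", salt)
precomputed = {unsalted_hash(word): word for word in WORDLIST}
table_hit = precomputed.get(leaked_salted)

print("unsalted, recovered:", recovered)
print("salted, precomputed table hit:", table_hit)
```

One caveat on the sketch: a salt mainly defeats precomputed tables and stops one cracked hash from unlocking every account that shares the password. If the salt leaks along with the hash, an attacker can still guess word by word, which is why a slow function like PBKDF2 matters as well.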
There are, to be sure, ways to close some of the weak points in the process. And the goal of a “right to access” regulation, enabling users to understand how they’re monitored by different sites so they can make informed decisions about their Internet use, is a reasonable one. But just as with regulations designed to ensure lawful police access to communications, any broad mandate creating an additional access point to information systems effectively creates a new attack surface, and a new security vulnerability, which in turn adds to the burdens on the company if they’re going to be responsible data stewards. (Presumably Facebook can afford a full-time compliance team to deal, in a timely and secure fashion, with the flood of requests they’ve gotten since this story started circulating; it’s not hard to imagine that it would strain the resources of a smaller start-up.) It’s difficult to adequately gauge the net costs and benefits of such mandates in advance, but in this case it doesn’t seem terribly plausible that there’s a genuine gain to consumers that could justify the added risk or expense, especially since most of the benefit here would be equally achieved by requiring companies to supply detailed general information about the kinds of records they keep.