Topic: Telecom, Internet & Information Policy

Making Sense of Drug Violence in Mexico with Big Data, New Media, and Technology

Yesterday we hosted a very interesting event with Google Ideas about the use of new media and information technology in Mexico’s war on drugs. You can watch the whole thing in the video below.

Unfortunately, one of the biggest casualties from the bloodshed that besets Mexico is freedom of the press. Drug cartels have targeted traditional media outlets such as TV stations and newspapers for their coverage of the violence. Mexico is now the most dangerous country to be a journalist. However, a blackout of information about the extent of violence has been avoided because of activity on Facebook pages, blogs, Twitter accounts, and YouTube channels.

Our event highlighted the work of two Mexican researchers on this topic. Andrés Monroy-Hernández from Microsoft Research presented the findings of his paper “The New War Correspondents: The Rise of Civic Media Curation in Urban Warfare” which shows how Twitter has replaced traditional media in several Mexican cities as the primary source of information about drug violence. Also, we had Javier Osorio, a Ph.D. candidate from Notre Dame University, who has built original software that tracks the patterns of drug violence in Mexico using computerized textual annotation and geospatial analysis.

Our third panelist was Karla Zabludovsky, a reporter from the New York Times’ Mexico City Bureau, who talked about the increasing dangers faced by journalists in Mexico and the challenges that new media represent in covering the war on drugs in that country.

Even though Enrique Peña Nieto, Mexico’s new president, has focused the narrative of his presidency on economic reform, the war on drugs continues to wreak havoc in Mexico. Just in the first two months of the year over 2,000 people have been killed by organized crime. 

At the Cato Institute we keep close track of developments in Mexico, and we have published plenty of material on the issue.

Watch the full event:

And for those who speak the language of Cervantes, here’s a ten-minute interview that Karla Zabludovsky and I did on CNN en Español about the Cato event.

Google Illuminates the Shadowy World of National Security Letters

In a pretty much unprecedented move, Google today announced that it was expanding its regular “Transparency Report” to include some very general information about government demands for user information using National Security Letters, which can be issued by the head of any of 56 FBI field offices without judicial approval or supervision. Recipients of NSLs are typically forbidden from ever revealing even the existence of the request, so these requests are not included in the company’s general tally of government surveillance requests. Instead of disclosing specific numbers of NSL requests, then, Google is publishing a wide range indicating the rough volume of requests they get each year, and how many users are affected. Broad as these ranges are, there are some interesting points to be gleaned here:

NSLs Google has received since 2009

It’s illuminating to compare the minimum number of users affected by NSLs each year to the numbers we find in the government’s official annual reports. In 2011—the last year for which we have a tally—the Justice Department acknowledged issuing 16,511 NSLs seeking information about U.S. persons, with a total of 7,201 Americans’ information thus obtained. That’s actually down from a staggering 14,212 Americans whose information DOJ reported obtaining via NSL the previous year. Remember, this total includes National Security Letters issued not just to all telecommunications providers—including online services like Google, broadband Internet companies, and cell phone carriers—but also “financial institutions,” which are defined broadly to include a vast array of businesses beyond such obvious candidates as banks and credit card companies.

What ought to leap out at you here is the magnitude of Google’s tally relative to that total: They got requests affecting at least 1,000 users in a year when DOJ reports just over 7,000 Americans affected by all NSLs, and it seems impossible that Google could account for anywhere remotely near a seventh of all NSL requests. Google, of course, is not limiting their tally to requests for information about Americans, which may explain part of the gap, but we know that, at least as of a few years ago, the substantial majority of NSLs targeted Americans, and the proportion of the total targeting Americans was increasing year after year. As of 2006, for instance, 57 percent of NSL requests were for information about U.S. persons. So even if we reduce Google’s minimum proportionately, that seems awfully high.
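For anyone who wants to check the back-of-the-envelope math, here is a minimal Python sketch using only the figures quoted above. It rests on two rough assumptions of mine: that the 2006 share of NSL requests concerning U.S. persons still holds, and that a share of requests can stand in for a share of users affected.

```python
# Figures quoted in this post.
google_min_users = 1_000       # lower bound of Google's range of users affected in a year
doj_americans_2011 = 7_201     # Americans whose information DOJ reported obtaining via NSL in 2011
us_person_share_2006 = 0.57    # share of NSL requests about U.S. persons as of 2006

# Google's minimum as a fraction of the official total, with no adjustment.
raw_share = google_min_users / doj_americans_2011

# Assumption: scale Google's minimum by the 2006 U.S.-person share.
adjusted_min_users = google_min_users * us_person_share_2006
adjusted_share = adjusted_min_users / doj_americans_2011

print(f"Unadjusted share of official total: {raw_share:.1%}")
print(f"Adjusted minimum users: {adjusted_min_users:.0f}")
print(f"Adjusted share of official total: {adjusted_share:.1%}")
```

The unadjusted figure comes to roughly 14 percent, about a seventh, and the adjusted one to roughly 8 percent of the official total, which is still a remarkable share for a single company.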

There’s a simple enough explanation for this apparent discrepancy: The numbers DOJ reports each year explicitly exclude NSL requests for “basic subscriber information,” meaning the “name, address, and length of service” associated with an account, and only count more expansive requests that also demand more detailed “electronic communications transactional records” that are “parallel to” the “toll billing records” maintained by traditional phone companies. I’ll get back to what that means in a second. But the obvious inference from comparing these numbers, unless Google gets a completely implausibly disproportionate percentage of total NSLs, is that the overwhelming majority of NSLs are just such “basic subscriber information” requests, and that the total number of Americans affected by all NSLs is thus vastly, vastly larger than the official numbers would suggest.

The rationale for not counting such “basic subscriber information” requests, beyond a desire not to terrify Americans by exposing the true magnitude of government surveillance, is presumably that these are so limited in scope that they don’t pose the same kind of civil liberties concerns as more extensive data requests. But this may not really be the case when you think about how we use the Internet in practice: Many people, after all, go online to engage in anonymous speech. In those cases, the contents of a person’s communications may be public (or at least widely shared), and what’s sensitive and private is the identity of the person tied to a particular account. (The first step in the FBI investigation that ultimately brought down CIA chief David Petraeus, recall, was stripping away the digital anonymity of his biographer and lover, Paula Broadwell, by linking a pseudonymous e-mail address to her primary Google account.) Indeed, that seems to be the primary reason one would issue such a “basic subscriber information” request to an entity like Google: To effectively de-anonymize the otherwise unknown user of a particular account. Insofar as the right to both speak and read or receive information anonymously has long been recognized by the Supreme Court as a component of our basic First Amendment freedoms, even these relatively limited requests may indeed have important implications for our civil liberties. And Google’s numbers, imprecise as they are, very strongly suggest that such requests are issued in far higher numbers than had previously been recognized.

The other interesting tidbit to come from Google today is their expanded FAQ detailing what kinds of information can be obtained under NSLs:

Under the Electronic Communications Privacy Act (ECPA) 18 U.S.C. section 2709, the FBI can seek “the name, address, length of service, and local and long distance toll billing records” of a subscriber to a wire or electronic communications service. The FBI can’t use NSLs to obtain anything else from Google, such as Gmail content, search queries, YouTube videos or user IP addresses.

For a long time, the FBI operated on the assumption that NSLs could be used broadly to obtain any “electronic communications transactional records.” But in a 2008 memorandum, the Office of Legal Counsel rejected that interpretation, holding that NSL authority “reaches only those categories of information parallel to subscriber information and toll billing records for ordinary telephone service.” Just what that means, of course, is fairly opaque—but I think most observers had supposed, as I had, that it encompassed user IP addresses. Since these can be crucial to linking a wide array of online activity to a particular user, their exclusion would somewhat limit the potential of NSLs to undermine Internet anonymity. Whether IPs are covered, however, may well depend on the specific service in question—and it is not at all clear whether other providers will disclose IP addresses in response to NSLs.

Of course, what Google does not specify clearly is just what information does fall into the category of “toll billing records.” In all likelihood, however, it covers the equivalent of the kind of information about who is communicating with whom that might be found on a phone bill—such as a list of all the people with whom you exchange e-mails or Gchat instant messages, though again, given differences in how people use the Internet versus traditional phone service, such lists are likely to be substantially more revealing than any phone bill.

DNA and Doctrine in the Supreme Court

This week, the Supreme Court considered whether collecting DNA from an arrestee was an unreasonable Fourth Amendment search.

Or at least that would have been a good way for the Court to frame the question.

Instead, much of the oral argument in Maryland v. King dealt with the question whether swabbing the cheek of an arrestee to take a DNA sample upsets one’s reasonable expectations of privacy. The “reasonable expectation of privacy” test is doctrine that arose from Justice Harlan’s concurrence in Katz v. United States. The test asks whether a person claiming the Fourth Amendment’s protections had a subjective expectation of privacy and whether it is “one that society is prepared to recognize as ‘reasonable.’”

The government’s case rests on that framing, which is why Deputy Solicitor General Michael Dreeben began his argument by saying that arrestees are “on the gateway into the criminal justice system. They are no longer like free citizens who are wandering around on the streets retaining full impact Fourth Amendment rights. The arrest itself substantially reduces the individual’s expectation of privacy.”

It’s true that an arrestee has his privacy and other liberties invaded in various ways. What problem is it if a bit of DNA is collected at the same time? It’s pretty much like fingerprinting, the argument goes…

The “reasonable expectation” test is almost never faithfully followed by courts. My guess is that the Court will not assess whether King himself actually expected “privacy.” That would encompass everything from believing that none of his mucous membranes would be collected by a government agent, to believing that his genetic material would neither be analyzed nor preserved in a Maryland lab for further analysis somewhere in an uncertain future.

When the Court applies the objective part of the test, there is a chance that a justice will actually examine the difference in experience between fingerprinting and DNA collection, such as by comparing the slim privacy invasion when one person touches another’s hands to the real invasion that occurs when a person puts something in another person’s mouth, but I’ll be surprised if any does. Doing so as part of the Court’s free-form interest balancing could, but probably wouldn’t, overcome the government’s interest in using “the fingerprinting of the 21st Century” to catch crooks.

Rather than using doctrine and making policy judgments, the Court should assess the government’s actions as the Fourth Amendment commands. The law does not invite the Court to examine what people may or may not think about “privacy.” It bars the government from committing unreasonable searches and seizures.

If one examines the case guided by the words of the Fourth Amendment, what happened is far more clear. Taking a bodily specimen from Alonzo King was, in natural language, a seizure. Processing that specimen to create an identity profile was a further examination, bringing otherwise concealed information into law enforcement’s view. And comparing King’s identity profile to cold-case profiles was incontrovertibly looking for something. This is all searching using that seized bodily material.

Now, was the search reasonable?

After King was picked up on a variety of assault charges, his mouth was swabbed and his DNA was taken, processed, and used to investigate whether genetic material matching his was associated with any other cases. It’s the equivalent of taking keys on the person of an arrestee and looking through his house for evidence of other crimes. There was no relationship between King’s alleged wrongdoing and the investigation conducted using his DNA.

Perhaps it is reasonable to conduct a free-form search into the biography of a person who has been arrested–that is, a person whom a law enforcement officer says he has probable cause to arrest–but it is unlikely. The Fourth Amendment’s particularity requirement suggests that it is unreasonable to investigate a person arrested for one crime to see what other, unrelated crimes he may have committed.

Counsel for the State of Maryland rested her argument heavily on the use of information about other crimes in bail decisions. This falls apart under the same logic, unless the Court is going to produce a rule that the Fourth Amendment allows the government carte blanche to search and seize when a bail hearing is pending. And the DNA results came back months after Alonzo King’s arraignment.

Why You Shouldn’t Believe the Cyber-War Hype

Constantine von Hoffman explains it on CIO.com:

Cyber war is not what the Chinese currently appear to be up to. That’s called spying. If you doubt it consider what Rep. Mike Rogers, chair of the House Intelligence Committee, said Sunday on one of those talk shows that no one outside of D.C. watches:

“They use their military and intelligence structure to [steal] intellectual property from American businesses, and European businesses, and Asian businesses, repurpose it and then compete in the international market against the United States.”

If stealing secrets is an act of war then America is currently at war with all of its allies.

That’s some crisp contrarianism, and I like the dig at D.C.’s self-importance.

At around the time I was reading this article yesterday, an email arrived in my inbox touting an upcoming book event on “Cyber Warfare: How Conflicts in Cyberspace Are Challenging America and Changing the World.”

Oh, there’s no shortage of challenges laid before all actors trying to secure computers, networks, and data, but don’t mistake the number of vulnerabilities or threats for the likelihood they will manifest themselves, or the consequence if they do. The “cyberwar” frame is inapt, and looking at cybersecurity through a geopolitical lens is not likely to produce policies that cost-effectively protect our wealth and values.

Secret Spying and the Supreme Court’s Constitutional Catch-22

The memory of the abuses perpetrated by colonial officials wielding “general warrants” inspired the framers of our Constitution’s Fourth Amendment to constrain the government’s power to invade citizens’ privacy. With today’s 5-4 ruling in Clapper v. Amnesty International, the Supreme Court has announced that the modern equivalent of those general warrants—dragnet surveillance “authorizations” under the FISA Amendments Act—will be effectively immune from Fourth Amendment challenge.

The FAA permits the government to secretly vacuum up Americans’ international communications on a massive scale, without any individualized suspicion, and at least some of that surveillance has already been determined by a secret intelligence court to have violated the Constitution. Yet today’s majority has all but guaranteed no court will be able to review the constitutionality of the law as a whole by imposing a perverse Catch-22: Even citizens at the highest risk of being wiretapped may not bring a challenge without proof they’re in the government’s vast database. The only problem is the government is never required to reveal who has been spied on.

In essence, the Court has said that even if the law is unconstitutional, even if it has violated the Fourth Amendment rights of thousands of Americans, there’s no realistic way to get a court to say so.

Precisely when secrecy shields the government from public political accountability, the Clapper ruling announces, the Constitution is powerless to protect us as well.

I’ll have a more detailed analysis of the ruling (and dissent) tomorrow.

Legislative Data and Wikipedia Workshop—March 14th and 15th

In my paper, “Publication Practices for Transparent Government,” I talked about the data practices that will produce more transparent government. The government can and should improve the way it provides information about its deliberations, management, and results.

“But transparency is not an automatic or instant result of following these good practices,” I wrote, “and it is not just the form and formats of data.”

It turns on the capacity of the society to interact with the data and make use of it. American society will take some time to make use of more transparent data once better practices are in place. There are already thriving communities of researchers, journalists, and software developers using unofficial repositories of government data. If they can do good work with incomplete and imperfect data, they will do even better work with rich, complete data issued promptly by authoritative sources.

We’re not just sitting around waiting for that to happen.

Based on the data modeling reported in “Grading the Government’s Data Publication Practices,” and with software we acquired and modified for the purpose, we’ve been marking up the bills introduced in the current Congress with “enhanced” XML that allows computers to automatically gather more of the meaning found in legislation. (Unfamiliar with XML? Several folks have complimented the explanation of it and “Cato XML” in our draft guide.)
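To give a rough sense of what “automatically gather” can mean, here is a minimal Python sketch. The element and attribute names (`entity-ref`, `amount`, `kind`, `value`), the bill number, and the bill text are all invented for illustration; they are not the actual Cato XML vocabulary, which is described in our draft guide.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every tag, attribute, and value here is made up for
# illustration and does not reproduce the real Cato XML schema.
SAMPLE_BILL = """
<bill number="H.R. 0000">
  <section id="s2">
    <text>There is authorized to be appropriated to the
      <entity-ref kind="agency" id="example-agency">Example Agency</entity-ref>
      <amount kind="authorization" value="5000000">$5,000,000</amount>
      for fiscal year 2014.</text>
  </section>
</bill>
"""

def extract_references(xml_text):
    """Return (agencies, amounts) found in the marked-up bill text."""
    root = ET.fromstring(xml_text.strip())
    # Collect every agency reference as an (id, display name) pair.
    agencies = [
        (ref.get("id"), ref.text)
        for ref in root.iter("entity-ref")
        if ref.get("kind") == "agency"
    ]
    # Collect every dollar figure as a (kind, integer value) pair.
    amounts = [
        (amt.get("kind"), int(amt.get("value")))
        for amt in root.iter("amount")
    ]
    return agencies, amounts

agencies, amounts = extract_references(SAMPLE_BILL)
print(agencies)
print(amounts)
```

The point of markup like this is that a program no longer has to guess from the prose which agency a bill names or how much money it authorizes; the tags say so directly.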

No, we are not going to replace the lawyers and lobbyists in Washington, D.C., quite yet, but our work will make a great deal more information about bills available automatically.

And to build society’s capacity “to interact with the data and make use of it,” we’re hoping to work with the best outlet for public information we know, Wikipedia, making data about bills a resource for the many Wikipedia articles on legislation and newly passed laws.

Wikipedia is a unique project, both technically and culturally, so we’re convening a workshop on March 14th and 15th to engage Wikipedians and bring them together with data transparency folks, hopefully to craft a path forward that informs the public better about what happens in Washington, D.C. We’ve enlisted Pete Forsyth of Wiki Strategies to help assemble and moderate the discussion. Pete was a key designer of the Wikimedia Foundation’s U.S. Public Policy Initiative—a pilot program that guided professors and students in making substantive contributions to Wikipedia, and that led to the establishment of the Foundation’s Global Education Program.

The Thursday afternoon session is an open event, a Wikipedia tutorial for the many inexperienced editors among us. It’s followed by a Sunshine Week reception open to all who are interested in transparency.

On Friday, we’ll roll up our sleeves for an all-day session in which we hope Wikipedians and experienced government data folks will compare notes and produce some plans and projects for improving public access to information.

You can view a Cato event page about the workshop here. To sign up, go here, selecting which parts of the event you’d like to attend. (Friday attendance requires a short application.)

Why Have a Machine-Readable Federal Government Organization Chart?

When I write and talk about getting better data about the federal government, its activities, and spending, I mostly have in mind strengthening public oversight by bringing computers to bear on the problem. You don’t have to know much about transparency, organizational management, or computing to understand that having a machine-readable government organization chart is an important start.

There should be a list that computers can process, showing what agencies, bureaus, programs, and projects exist in the federal government and how they are related. Then budgets, bills in Congress, spending programs and actual outlays, regulations, guidance documents, and much more could be automatically tied to the federal organizational units affected and involved.
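As a minimal sketch of the idea, assume each organizational unit has a unique identifier and a pointer to its parent. The identifiers, names, and records below are hypothetical; no such official list exists yet, which is the problem.

```python
from collections import defaultdict

# Hypothetical organization chart: unit id -> name and parent unit id.
ORG_UNITS = {
    "dept-a": {"name": "Example Department", "parent": None},
    "bureau-a1": {"name": "Example Bureau", "parent": "dept-a"},
    "program-a1x": {"name": "Example Program", "parent": "bureau-a1"},
}

# Hypothetical records, each tagged with the unit it affects.
RECORDS = [
    {"kind": "bill", "ref": "Hypothetical bill 1", "unit": "program-a1x"},
    {"kind": "outlay", "ref": "Hypothetical outlay 7", "unit": "bureau-a1"},
    {"kind": "regulation", "ref": "Hypothetical rule 3", "unit": "dept-a"},
]

def ancestors(unit_id):
    """Return unit_id followed by each parent up to the top-level unit."""
    chain = []
    while unit_id is not None:
        chain.append(unit_id)
        unit_id = ORG_UNITS[unit_id]["parent"]
    return chain

def records_by_unit(records):
    """Index records under the unit they name and every unit above it."""
    index = defaultdict(list)
    for record in records:
        for unit_id in ancestors(record["unit"]):
            index[unit_id].append(record["ref"])
    return dict(index)

for unit_id, refs in records_by_unit(RECORDS).items():
    print(ORG_UNITS[unit_id]["name"], "->", refs)
```

With the relationships recorded once, anything tagged to a program automatically shows up under its bureau and department as well.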

But it’s not only public oversight that would benefit from such a list.

Mike Riggs at Reason magazine has found that the Office of Management and Budget’s sequestration report issued last September listed a cut to the National Drug Intelligence Center’s budget even though the NDIC went out of business last June.

The first line item on page 121 of the OMB’s September 2012 report says that under sequestration the National Drug Intelligence Center would lose $2 million of its $20 million budget. While that’s slightly more than 8.2 percent (rounding error or scare tactic?), the bigger problem is that the National Drug Intelligence Center shuttered its doors on June 15, 2012–three months before the OMB issued its report to Congress.

That’s embarrassing for the administration, as it should be. Riggs asks, “Might there be other errors in the OMB’s report?”

Getting organized is not just about public oversight. Another reason to have a machine-readable federal government organization chart is to improve internal management and controls. This kind of mistake should be nearly impossible. People at OMB should be able to download the list of government entities at any time, day or night, and be sure that it is the correct listing that uniquely identifies and distinguishes all the organizational units of the federal government at that moment. We should be able to download it, too.
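Here is a minimal Python sketch of the kind of check such a list would make trivial. The data layout, identifiers, and the second unit are hypothetical, and the report date is a placeholder for “September 2012”; only the NDIC closure date and the $2 million figure come from Riggs’s account above.

```python
from datetime import date

# Hypothetical authoritative list: unit id -> name and closure date (None if open).
ORG_UNITS = {
    "ndic": {
        "name": "National Drug Intelligence Center",
        "closed": date(2012, 6, 15),  # closure date reported by Riggs
    },
    "example-open-unit": {"name": "Example Open Unit", "closed": None},
}

def is_active(unit_id, on_date):
    """A unit counts as active if it has no closure date or closes after on_date."""
    closed = ORG_UNITS[unit_id]["closed"]
    return closed is None or on_date < closed

def stale_line_items(line_items, report_date):
    """Return line items that refer to units no longer active on report_date."""
    return [item for item in line_items if not is_active(item["unit"], report_date)]

# Placeholder day: the exact issue date within September 2012 is an assumption.
report_date = date(2012, 9, 1)
line_items = [
    {"unit": "ndic", "cut_usd": 2_000_000},        # cut listed in the OMB report
    {"unit": "example-open-unit", "cut_usd": 1},   # made-up entry for contrast
]

for item in stale_line_items(line_items, report_date):
    print("Line item names a closed unit:", ORG_UNITS[item["unit"]]["name"])
```

Run against a real, current list, a check like this would flag the NDIC entry before the report ever went to Congress.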

Unfortunately, OMB controller Danny Werfel has been riding the brake on transparency. He and the Obama administration as a whole should be stepping on the gas. In early February, the Sunlight Foundation found that more than $1.5 trillion in federal spending for fiscal year 2011 was misreported on USASpending.gov.