Delivering Government Transparency

May/June 2014 • Policy Report

“The history of liberty has been in no small measure the struggle between diffuse and encompassing interests, on the one hand, and special interests, on the other,” John McGinnis of Northwestern University has written.

Before the printing press was invented, those special interests included both rulers and aristocrats. But the mass dissemination of information “allowed the middle class to discover and organize around their common interests to sustain a democratic system that limited the exactions of the oligarchs.” Today, the Internet is the new printing press. With the launch of Cato’s Deepbills project, the Institute is generating data that will allow the information superhighway to have its salutary effects for liberty.

“Democracy in America is not working: the formalities are strong, but the substance is hollowed out,” says Jim Harper, a senior fellow at the Cato Institute specializing in information policy. While there is quite a bit of public sector information online, it is buried in archaic practices and unusable formats. In effect, the government is beyond the reach of the people. The Internet has remade old industries and created new ones, yet it has barely touched the federal government. The solution to this information problem, Harper argues, is detailed, accessible information about the government’s deliberations, management, and results.

To this end, Cato’s Deepbills project gathers the XML versions of legislation, annotating 99 percent of the bills introduced in the current Congress in order to make key elements of their content easily readable by computer. This data allows automatic discovery of the laws that pending bills amend, the agencies they affect, and the spending they authorize. Not content to merely advocate for transparency, Cato has shown the way by producing data that makes the government more transparent.
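To give a sense of what "easily readable by computer" can mean in practice, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the namespace URI, the `entity-ref` tag, the `entity-type` and `value` attributes, and the sample bill are hypothetical, and the project's actual markup may look quite different. The point is only that once references are tagged inline, a few lines of code can pull out the laws a bill amends, the agencies it touches, and the spending it authorizes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical annotation namespace; not the project's real schema.
ANN_NS = "http://example.org/bill-annotations"

# A made-up bill fragment with inline annotations (illustrative only).
SAMPLE_BILL = """\
<bill xmlns:ann="http://example.org/bill-annotations">
  <section>
    <text>Section 2 of the
      <ann:entity-ref entity-type="act" value="Example Act of 1990">Example Act of 1990</ann:entity-ref>
      is amended to direct the
      <ann:entity-ref entity-type="federal-body" value="Department of Examples">Department of Examples</ann:entity-ref>
      to report annually.</text>
  </section>
  <section>
    <text>There is authorized to be appropriated
      <ann:entity-ref entity-type="auth-approp" value="5000000">$5,000,000</ann:entity-ref>
      for fiscal year 2015.</text>
  </section>
</bill>
"""


def collect_references(bill_xml):
    """Group the annotated values in a bill by their (assumed) entity type."""
    root = ET.fromstring(bill_xml)
    found = defaultdict(list)
    # Walk every annotation element, wherever it sits in the bill text.
    for ref in root.iter(f"{{{ANN_NS}}}entity-ref"):
        found[ref.get("entity-type", "unknown")].append(ref.get("value", ""))
    return dict(found)


refs = collect_references(SAMPLE_BILL)

# Sum the dollar amounts tagged as authorizations of appropriations.
total_authorized = sum(int(v) for v in refs.get("auth-approp", []))

print("Laws amended:", refs.get("act", []))
print("Agencies affected:", refs.get("federal-body", []))
print("Spending authorized:", total_authorized)
```

The same loop, run across every bill in a Congress, is the kind of automatic discovery the annotations are meant to enable.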

The data is already beginning to see use. The New York Times began employing Deepbills data to flesh out information about bills in Congress that it publishes on its website. On Wikipedia, the Institute has initiated an automated system for generating article skeletons that include the pros and cons of legislation in Congress. Bit by bit, a community of editors is starting to grow. The project is currently developing open-source markup software that will add this data to bills automatically. Publishing well-structured data will allow search engines, websites, researchers, reporters, political scientists, and the public to discover, process, and use government information in any way they choose. Cato’s work will set public expectations that government data is available, making government’s improvement in this area a political imperative. “By adding data sets to what’s available about government deliberations, we’re beginning to lift the fog that allows Washington, D.C., to work the way it does or, more accurately, to fail the way it does,” Harper concludes.