January 31, 2012 2:42PM

Helping the House Advance Data Transparency

The House of Representatives is poised to make great strides in transparency, and our work over the last year aims to help. Here’s how this spreadsheet (.xls) fits in.

In December, the House Administration Committee announced a plan to improve the publication of House documents. In January, a new site, docs.house.gov, went live. (It’s attractive looking, but still bare-bones.) On Thursday this week, the Committee is hosting a “Legislative Data and Transparency Conference” to examine what data is out there and what data should be out there. Little information is on the Web yet, but you can sign up to attend at the link just above.

I’ll be speaking on the last panel of the day, which deals with measuring transparency success. Likely, they chose me for this panel because I’ve already been grading the government on its publication practices.

Last September, you see, we graded Congress on how well it publishes data that would assist the public in computer-aided oversight. The summary blog post is called “Needs Improvement.” And then in December, we graded the government on publication of budget, appropriations, and spending data. That’s a joint legislative-executive responsibility, but mostly executive. The message was: “ ‘Needs Improvement’ is Understatement.”

How do you grade Congress and the government on their data publication?

You start out by modeling the data government should publish. We put together a data model for legislative process, for example, and then a data model for budgeting, appropriating, and spending. We got a great deal of help from folks at the Sunlight Foundation, OMB Watch, and others such as the National Priorities Project, as well as data guru Josh Tauberer, whose latest project is PopVox.

Even with all this help, these models won’t be the last word—there is much to learn yet about the data structure that will serve every use the public may want to make of information. But it’s a strong start.

Then we compared the data that’s actually out there to the practices described in my paper, “Publication Practices for Transparent Government,” and out popped the grades! They were pretty bad…

The House of Representatives aims to fix that—for its part, at least.

Now to this spreadsheet: it’s a list of the things that should be identified in congressional documents so that computers can find the most salient information in them. It also indicates the “vocabularies” that already exist for identifying many of them: members of Congress, bills, laws, statutes, committees, agencies, programs, and so on. We’ve talked about how to identify “budget authority” and appropriations (spending) so that computers can capture that information from bills and committee reports. Locations, state and foreign governments, times, meetings: all these things can be put into electronic versions of documents to allow computer-aided public oversight.
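To make the idea concrete, here is a minimal sketch in Python of how a computer might pull identified items out of a marked-up passage. Everything in it is an assumption made for illustration: the element names (`sponsor`, `usc-ref`, `budget-authority`, `agency`), the attribute names, the identifiers, and the sample text are invented, and none of them comes from the spreadsheet or from any format the House actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup. The tag names, attribute names, and identifiers
# are invented for illustration and do not reflect any real House schema.
SAMPLE = """<section>
  <sponsor member-id="M000001">Rep. Example</sponsor> proposes to amend
  <usc-ref title="26" section="1">section 1 of title 26</usc-ref> and provides
  <budget-authority amount="1500000" fiscal-year="2013">$1,500,000</budget-authority>
  to the <agency agency-id="AG-01">Example Agency</agency>.
</section>"""


def extract_entities(xml_text):
    """Return (tag, attributes, display text) for each identified item."""
    root = ET.fromstring(xml_text)
    # Iterating over the root yields its direct child elements in order.
    return [(el.tag, dict(el.attrib), (el.text or "").strip()) for el in root]


entities = extract_entities(SAMPLE)
for tag, attrs, text in entities:
    print(tag, attrs, text)
```

The point of the sketch is only that once an item carries an identifier from a shared vocabulary, a program no longer has to guess what a phrase like “section 1 of title 26” refers to.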

Once documents contain data like this in the proper structures, computers can answer literally thousands of questions about Congress almost instantly.

  • How much new budget authority has each member of Congress proposed? Voted for? Voted against? Allowed to go through on voice vote or unanimous consent? How about this same information by state? By region? Or by seniority?
  • What title of the U.S. Code do members of Congress most often propose to amend? What title do they actually amend the most?
  • What bills affect my state specifically, such as by naming buildings, creating wilderness areas, changing boundaries on parks, or giving land to localities?
  • How often do my member of Congress and senators break with their party?
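As a rough illustration of how questions like these become simple computations once the data is structured, here is a Python sketch. The records, field names (`member`, `state`, `budget_authority`, `vote`, `party_majority`), and dollar figures are all made up for the example; real data would come from properly marked-up bills and vote records, which do not yet exist in this form.

```python
from collections import defaultdict

# Invented sample records. Names, states, and amounts are placeholders,
# not real legislative data.
PROPOSALS = [
    {"member": "Member A", "state": "OH", "budget_authority": 1_000_000},
    {"member": "Member A", "state": "OH", "budget_authority": 250_000},
    {"member": "Member B", "state": "TX", "budget_authority": 500_000},
    {"member": "Member C", "state": "OH", "budget_authority": 2_000_000},
]

VOTES = [
    {"member": "Member A", "vote": "yea", "party_majority": "yea"},
    {"member": "Member A", "vote": "nay", "party_majority": "yea"},
    {"member": "Member A", "vote": "yea", "party_majority": "yea"},
    {"member": "Member A", "vote": "yea", "party_majority": "nay"},
    {"member": "Member B", "vote": "yea", "party_majority": "yea"},
]


def budget_authority_by(records, key):
    """Total proposed budget authority grouped by a field such as member or state."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["budget_authority"]
    return dict(totals)


def party_break_rate(votes, member):
    """Share of a member's votes that differ from the party majority position."""
    cast = [v for v in votes if v["member"] == member]
    if not cast:
        return 0.0
    breaks = sum(1 for v in cast if v["vote"] != v["party_majority"])
    return breaks / len(cast)


print(budget_authority_by(PROPOSALS, "member"))
print(budget_authority_by(PROPOSALS, "state"))
print(party_break_rate(VOTES, "Member A"))
```

Swapping the grouping key from `member` to `state` is all it takes to move from the first question to its by-state variant, which is why consistent identifiers matter so much.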

These are just a few examples. In the hands of varied users, the data will be put to hundreds or thousands of uses. It will go into studies performed by political scientists and it will supercharge news reporting. But more importantly, it will go into services that inform people directly and quickly about how their own representatives in Congress are acting and what they’re saying.

It will give people insight into where the money goes—from the moment new spending is proposed all the way through to when Congress spends it—or declines to spend.

Credit is due to the leadership in the House of Representatives for starting this work. There is a lot to do before they show clear success. But they are way ahead of President Obama, whose Sunlight Before Signing transparency promise lags badly, and who has yet to put together a machine-readable organization chart for the executive branch of the federal government. He can easily do the latter, and coordination with Congress is essential for transparency success. The sooner that happens the better.