Yesterday, I shared my doubts about the prospect of getting budget and organizational data from the White House. Today, I'm happy to report genuine progress on open data from Congress.
The Government Printing Office announced today that it will be making House bills available in XML format and in bulk through FDsys, GPO’s Federal Digital System. House bills now join other material on GPO's bulk data page.
If you're like me, following that link gives you some idea of what's there, but clicking through any further leaves you with no more idea how to use it than you'd have with any other copy of a bill. That's OK, because the kids with the computers do know how to use it. And they can take well-structured, timely data reflecting the proposals in Congress and turn it into various information services, applications, and web sites that make all of us better aware of what's happening.
I believe the public has an Internet-fueled expectation that they should understand what happens in Congress. That the expectation goes unmet is one explanation for rock-bottom esteem for government in opinion polls. Access to good data would help produce better public understanding of what goes on in Washington and also, I believe, more felicitous policy outcomes: not only reduced demand for government, but better administered government in the areas where the public wants it. (If you're a reader of a certain partisan bent, you might appreciate the idea that the era of passing bills to find out what's in them will end.)
Upon the release of my Cato Policy Analysis, "Grading the Government's Data Publication Practices," I characterized President Obama as lagging House Republicans in terms of transparency. Today's development helps solidify Republicans' small lead. The GPO release says the initiative comes "[a]t the direction of the House Appropriations Committee, and in support of the task force on bulk data established by House report 112-511."
The administration has plenty of capacity to retake the lead, of course, and could do so quite easily. I'll call it like I see it, doing my best to reflect consensus among the transparency community as to the quality of data publication, when we return to grading the data produced by various organs of government in another year or so.
Did you think this praise would come without garnish? It's like you don't know me at all.
For now, this data is of limited use because it includes only House bills. The entire oeuvre of congressional bill-writers should be published the same way in the same place so that contrasts and comparisons can be drawn among House and Senate work. In short, why is the Senate not on board?
As far as I've been able to find, the XML is not well documented. What each of the technical codes means is understood by several people in Washington's transparency community, but the idea is to make the data available very broadly, so the documentation should be very strong. The information at xml.house.gov should be updated, tightened up, and made easily available to the people gathering bill data on FDsys.
The XML data structures put in bills are limited in terms of what they convey. There is rudimentary information about who introduced and cosponsored bills, what committees they were referred to, and other procedural information. That's good. But the effects of bills on agencies, existing law, programs, and places are not available in machine-readable code. That would be great.
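For the technically inclined, here is a rough sketch of how that procedural information might be pulled out of a bill's XML. Everything specific in it is an assumption for illustration: the sample fragment is made up, the bill number and names are placeholders, and the element and attribute names (`legis-num`, `sponsor`, `cosponsor`, `committee-name`) should be checked against the actual files on FDsys before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed bill fragment. The element and attribute
# names are assumptions for illustration, not a documented schema.
SAMPLE_BILL = """\
<bill>
  <form>
    <legis-num>H. R. 0000</legis-num>
    <action>
      <action-desc>
        <sponsor name-id="A000001">Ms. Example</sponsor> (for herself and
        <cosponsor name-id="B000002">Mr. Sample</cosponsor>) introduced the
        following bill; which was referred to the
        <committee-name committee-id="HXX00">Committee on Examples</committee-name>
      </action-desc>
    </action>
  </form>
</bill>
"""


def summarize_bill(xml_text):
    """Return the procedural facts the markup exposes for one bill."""
    root = ET.fromstring(xml_text)

    def texts(tag):
        # iter() walks the whole tree, so nesting depth does not matter.
        return [(el.text or "").strip() for el in root.iter(tag)]

    sponsors = texts("sponsor")
    return {
        "number": (root.findtext(".//legis-num") or "").strip(),
        "sponsor": sponsors[0] if sponsors else None,
        "cosponsors": texts("cosponsor"),
        "committees": texts("committee-name"),
    }


summary = summarize_bill(SAMPLE_BILL)
print(summary)
```

Note what the sketch cannot do: nothing in this markup says which agencies, programs, or existing laws the bill would touch, so no amount of parsing will recover that.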
Watch this space. In the coming weeks and months, we'll show how semantically rich data can automatically reveal more about what happens in the legislative process. Technical people will be able to draw insights about legislation and the legislative process that were never available before. They will translate that for us in myriad ways, better equipping the public to oversee the government.
A small but growing collection of companies has formed a coalition that will push the federal government to establish a standard system by which agencies categorize their data. ...
"Our members understand that if the government identified its data elements in consistent ways, there would be vast new opportunities for the tools that they are building," Executive Director Hudson Hollister said.
Early supporters include Microsoft and data analysis and management firms Level One Technologies, Teradata, and BrightScope. I'm on their Board of Advisors. One of their early priorities will be to pass H.R. 2146, the DATA Act.
Cato has worked extensively on government transparency, beginning with our December 2008 policy forum entitled, "Just Give Us the Data! Prospects for Putting Government Information to Revolutionary New Uses."
We have modeled much of the data that the government should be publishing in standardized formats (much more cheaply than CBO has estimated it would cost) and graded the quality of current data publication in the areas of congressional process and budgeting, appropriating, and spending. Expect improvements to come with this new organization joining other efforts.
Follow the coalition's founder and executive director on Twitter @hudsonhollister, and you can Like their Facebook page, as well, to get updates that way.
Last week was an interesting week for transparency, with some good news and some bad news.
On the "good" side of the ledger, the administration rolled out "Data.gov," a growing set of data feeds provided by U.S. government agencies. These will permit the public to do direct oversight of the kind I discussed at our "Just Give Us the Data!" policy forum back in December.
My measure of Data.gov's success will be whether independent users and Web sites use government data to produce new and interesting information and applications. The Sunlight Foundation has a contest underway to promote just that. Get ready for really interesting, cool, direct public oversight of the government.
Also under the White House's new "Open Government Initiative," an Open Government Dialogue "brainstorming session" began last week. The public can submit ideas for making the government more transparent, participatory, and collaborative. This is important stuff, an outgrowth of President Obama's open government directive, issued on his first full day in office.
That directive called for the Office of Management and Budget to require specific actions of agencies "within 120 days," which meant the final product was due last week. And that missed deadline is where we start to slide into the "bad" on the transparency ledger.
Last week, President Obama gave an important speech on national security (which I blogged about here and here). But you couldn't find the speech in the "Speeches" section of the Whitehouse.gov Web site. It's buried elsewhere. That's "basic Web site malpractice," I told NextGov.com. And I cautioned my friends in the transparency community not to forget Government 1.0 for all the whiz-bang Gov 2.0 projects flashing before our eyes. Whitehouse.gov should be a useful, informative resource for average Americans.
The current top proposal on the "brainstorming" site referred to above is to require a 72-hour mandatory public review period on major spending bills. This is reminiscent of President Obama's promise to hold bills five days before signing them. But, as Stephen Dinan reports in the Washington Times, the president signed several more bills last week without holding them the requisite time.
The White House protests that they posted links to bills on the Thomas Web site at the Whitehouse.gov blog. But that does not give the public meaningful review of the bills in their final form, as they have come to the president from Congress. "Posting a link from WhiteHouse.gov to THOMAS of a conference report that is expected to pass doesn’t cut it," says John Wonderlich at Sunlight.
President Obama has signed nine new laws since we last reviewed his record on the "Sunlight Before Signing" promise. Alas, it's been a case study in pulling defeat from the jaws of victory.
Five of the bills were held by the White House more than five days before the president signed them, but they weren't posted! Simply posting them on Whitehouse.gov in final form would have satisfied "Sunlight Before Signing."
President Obama's average drops to .043, and that's crediting him one win for the DTV Delay Act. That bill was posted at Whitehouse.gov in its final form for five days after Congress passed it, though before presentment, which is the logical time to start the five-day clock.
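The scoring rule can be written down as a short sketch. This is the strict version, in which the clock starts at presentment or at posting, whichever comes later, so it would not award the generous credit given to the DTV Delay Act above. The dates and labels are invented for illustration and do not describe any actual bill.

```python
from datetime import date, timedelta

REVIEW_DAYS = 5  # length of the promised public review period


def satisfies_sunlight(presented, posted, signed, review_days=REVIEW_DAYS):
    """True if the final bill was online for the full period before signing.

    The clock starts at presentment or at posting, whichever is later.
    A bill that was never posted (posted is None) cannot qualify.
    """
    if posted is None:
        return False
    clock_start = max(presented, posted)
    return signed - clock_start >= timedelta(days=review_days)


# Hypothetical records: (label, date presented, date posted, date signed).
bills = [
    ("held and posted", date(2009, 3, 2), date(2009, 3, 2), date(2009, 3, 9)),
    ("held but never posted", date(2009, 3, 2), None, date(2009, 3, 9)),
    ("posted but signed early", date(2009, 3, 2), date(2009, 3, 2), date(2009, 3, 4)),
]

wins = sum(satisfies_sunlight(p, posted, s) for _, p, posted, s in bills)
average = wins / len(bills)
print(f"{wins} of {len(bills)} bills qualify, average {average:.3f}")
```

The middle record is the frustrating case described above: the bill sat long enough, but because it was never posted, the hold counts for nothing.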
Here is the latest tally of bills passed by Congress, including the date presented, date signed, whether they've been posted or linked to at Whitehouse.gov, and whether they've been posted for the full five days after presentment. (Corrections welcome - there is no uniform way that the White House is posting bills or links, so I may have missed something.)