CFR.net

your (unofficial) resource for a usable Code of Federal Regulations

or

Shame on the U.S. Government Printing Office

July 8, 2011

The online Code of Federal Regulations at the "official" site, now moved to the Federal Digital System, at first glance appears to be a pretty botched effort, in particular the "bulk data" XML files. Sadly, it appears botched on the second, third, and subsequent glances, as well.

I'm hardly an expert in XML, but I know a ham-handed kludge when I see one. Even the "XML expert" at the Legal Information Institute at Cornell Law School described the FDSys release of the CFR in XML as "a bag of tags."1 This assessment, in my opinion, falls short of the truth, but given the public profile of the LII and its need for an ongoing relationship with the GPO, it is an excusable understatement.

A steaming pile of tags is more appropriate.

Rather than a data-centric XML file set, with accompanying XSLT to style it for presentation, including printing, the good civil servants at FDSys released XML files with all the printing cruft, including tables of contents and page numbers. Even an XML novice would know that the best practice, indeed the intent, of XML/XSLT is to separate, to the extent practical, the information from the presentation.
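To make the point concrete, here is a minimal sketch, in Python, of the cleanup a reuser is forced to do before the data is usable: pulling print-only elements back out of the content. The tag names (TOC, PAGE, SECTION and so on) and the sample document are my own hypothetical stand-ins, not taken from the GPO schema, and the function name is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names chosen for illustration; NOT taken from the GPO schema.
PRINT_ONLY_TAGS = {"TOC", "PAGE"}

# A made-up fragment with print cruft (a table of contents and a page marker)
# mixed in with the actual regulatory text.
SAMPLE = """<PART>
  <TOC><ENTRY>1.1 Scope, page 5</ENTRY></TOC>
  <SECTION>
    <NUM>1.1</NUM>
    <HEAD>Scope</HEAD>
    <P>This part applies to<PAGE N="5"/> all widgets.</P>
  </SECTION>
</PART>"""


def strip_print_cruft(root):
    """Remove print-only elements in place, keeping the text that follows them."""
    for parent in list(root.iter()):
        previous = None  # last sibling that was kept
        for child in list(parent):
            if child.tag in PRINT_ONLY_TAGS:
                # Text after the removed element belongs to the surrounding
                # content, so reattach it before dropping the element.
                tail = child.tail or ""
                if previous is None:
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return root


if __name__ == "__main__":
    cleaned = strip_print_cruft(ET.fromstring(SAMPLE))
    print(ET.tostring(cleaned, encoding="unicode"))
```

In a data-centric file set none of this would be necessary, because the page markers and tables of contents would never have been in the data to begin with.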

The "for the record" explanation for this ham-handed approach to stuffing a print volume into an XML file, as is, appears to have been quietly slipped into the user manual. "The schema being produced for this effort describes the data as it actually occurs from the OFR. Documents are not being cleaned up because they do not match the schema; instead, the schema was selectively relaxed."2

"Selectively relaxed?" The current XML version, as published on FDSys, is about one step better than if they had taken the full text of the CFR and slapped open and close paragraph tags at either end. The GPO could well have produced a semantic pair of XML data file and XSL style sheet that would have rendered a printable output, including tables of contents and recurring top and bottom disclosures, very similar to the $1600 printed volume - if that had been their actual intent.

That does not, however, appear to have been the intent. The product appears intended to obfuscate the information and thwart reusers' and republishers' efforts to use the published XML files.
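As a rough illustration of why a table of contents has no business being stored in the data, the sketch below derives one from the section numbers and headings alone. Python stands in here for the XSL style sheet that would do this job in a proper XML/XSLT pairing, and the element names and sample text are hypothetical, not the GPO's.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic markup: sections carry only a number, a heading,
# and their text. No table of contents is stored anywhere.
SAMPLE = """<PART>
  <SECTION><NUM>1.1</NUM><HEAD>Scope</HEAD><P>This part applies to all widgets.</P></SECTION>
  <SECTION><NUM>1.2</NUM><HEAD>Definitions</HEAD><P>Widget means any gadget.</P></SECTION>
</PART>"""


def build_toc(root):
    """Generate table-of-contents lines from the section data itself."""
    return [
        f"{section.findtext('NUM')} {section.findtext('HEAD')}"
        for section in root.iter("SECTION")
    ]


if __name__ == "__main__":
    for line in build_toc(ET.fromstring(SAMPLE)):
        print(line)
```

Anything that can be generated this way at presentation time is presentation, and belongs in the style sheet rather than in the data file.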

The three likely reasons for this sloppy, ugly, and unusable product are:

  1. The folks at FDSys were expressly told to do it this way, and attempts to educate the client organization about what a stupid approach this was were ignored.
  2. The folks at FDSys just don't know any better. The project was handed to an XML novice, along with a copy of O'Reilly's "Beginning XML," and a deadline to get it done now.
  3. The management at FDSys sees producing a usable, accessible XML version of the Code of Federal Regulations, properly and effectively coded, as a quick way to work themselves out of a job. The resulting electronic documents are their passive-aggressive effort to ensure their job security. Another case of the management of a federal agency observing the letter of its directives, while bluntly thwarting the intent. And at your expense.

Presumably, if the XML files were well-designed, every set of files downloaded for free would be a $1600 print volume unsold.

Edit 7/13/2011

Ms. Frug diplomatically gives GPO an "out," mercifully pointing out that GPO is merely publishing what OFR sends it, and further, that OFR only sends on what it receives from the several agencies. Ms. Frug's kid-glove approach to assessing the sorry state of these electronic documents is politically wise; neither the LII nor its parent, Cornell University, would be well-served by a scorched-earth approach to a review, calling out agency heads and document editors by name and telephone extension for their incompetence. Ms. Frug's "bless their little hearts" tone is appropriate for her post-mortem paper. Yet the files published on FDSys are still an excrement sandwich, and passing the buck back to some elusive, recursive source of "The Error" doesn't fix things.

Ultimately, someone has to fix this steaming pile, if only under the threat of "accountability." This is no place for the Nuremberg Defense. So, let's start here - Jonn V. Lilyea, Chief Editor, "under the direction of" Michael L. White - at least a portion of this defective product has your names on it. Will the next issue of these documents evince the same bungling, or are you going to fix it?

 

In any event, what FDSys has published as an XML version of the CFR is all but useless. The steaming pile of tags over at http://www.gpo.gov/fdsys/bulkdata/CFR is a fine example of your federal tax dollars at work. Better than a poke in the eye, but not by much. <insert four slow claps here />

The LII have done an admirable job attempting to unkludge the source materials. We're going to take it further and publish the improved, if unofficial, version

Here.

Watch this space.

 

1. Frug, Sara S., Ground-up law: Open access, Source quality, and the CFR, at the 2011 CALI Conference for Law School Computing, section 2.4.2.

2. US GPO, Code of Federal Regulations XML Rendition, December 17, 2009, page 5.