Tuesday, May 15th, 2012

Harvard University’s 12 million records now in LibraryThing

Short version. Our “Overcat” search now includes 12.3 million records from Harvard University!

Long version. On April 24 the Harvard Library announced that more than 12 million MARC records from across its 73 libraries would be made available under the library’s Open Metadata policy and a Creative Commons 0 public domain license. The announcement stunned the library world, because Harvard went against the wishes of the shared-cataloging company OCLC, who have long sought to prevent libraries from releasing records in this way. (For background on OCLC’s efforts see past blog posts.)

It took a while to process, but we’ve finally completed adding all 12.3 million MARC records (3.1GB of bibliographic goodness!) to LibraryThing. They’ve gone into OverCat, our giant index of library records from around the world—now numbering more than 51 million records! As a result, when searching OverCat under “Add books,” you’ll now see results “from Harvard OpenMetadata.”

This release (“big data for books,” as David Weinberger calls it) is, to put it mildly, a Very Big Deal. Harvard’s collections are both deep and broad, covering a wide variety of languages, fields, and formats. The addition of these 12 million records to OverCat has significantly improved our capacity for the cataloging of scholarly and rare books, and greatly enhanced our coverage generally.

Kudos to Harvard for making this metadata available, and we hope that other libraries will follow suit.

For more on the metadata release, see Quentin Hardy’s New York Times blog post, the Dataset description, or the Open Metadata FAQ. And happy cataloging!

Come discuss here.

Harvard requests and we’re happy to add: The “Harvard University Open Metadata” records in OverCat contain information from the Harvard Library Bibliographic Dataset, which is provided by the Harvard Library under its Bibliographic Dataset Use Terms and includes data made available by, among others, OCLC Online Computer Library Center, Inc. and the Library of Congress.

Labels: cataloging, open data


  1. Richard says:

    This is stunning, wonderfully so but literally stunning, news. My literal ears are ringing from the blow. (Which was caused, to be precise, by my head hitting my desk on reading this news.)

  2. Robin Wendler says:

    It’s great that you’ve incorporated the records! I’d like to clarify that Harvard worked closely with OCLC throughout this process. Jim Michalko described this a bit in his blog post “Harvard bibliographic data released with prominent nod to OCLC”, http://hangingtogether.org/?p=1647

  3. Connie says:

    This is absolutely terrific! Many thanks!

Leave a Reply to Richard