Sunday, December 7th, 2008

The Elusive Moose and OCLC

Over the next few weeks I’m going to try to approach this issue in a number of different ways. Here’s a first try.

Thought experiment. I walk into the Portland, ME public library and look up The Elusive Moose. Who owns the database record, with the title and subjects and so forth?

Who owns it? Joan Gannij wrote the book, and Clare Beaton illustrated it. Barefoot Books of Cambridge, MA published it.

To qualify for the Library of Congress’s Cataloging in Publication program, Barefoot Books filled out forms and submitted the basic data. Catalogers at the Library of Congress used that and some sample chapters to make the basic record, providing the publisher with the core cataloging information it printed in the book.

Then people at three other companies improved the record–Ingram Library Services, Baker & Taylor (twice), and Yankee Book Peddler. After that, catalogers at two public libraries worked on it–the Vancouver Public Library and the Southfield Public Library. Finally the Anchorage, Alaska School District added the finishing touches. No doubt they know from moose! (LibraryThing, located in Portland, ME, knows about moose too!)

Whose record is it? The authors? Publisher? The companies? The public libraries? The school library? How about me, or nobody? Aren’t libraries supposed to be about free access to information?

The Answer. Right now, it’s unclear. Probably no one owns it. The Library of Congress did the most work, and, by law, their work is free to all. And anyway, the record is composed of facts, which can’t be copyrighted.

Come February, however, it will acquire a new owner, an organization known to few Americans and accountable to fewer, the Online Computer Library Center (OCLC) of Dublin, Ohio. OCLC’s contribution came late: after the Library of Congress had created the base record, LC uploaded it to OCLC so other libraries could have access to it. And that contribution was minimal: warehousing 1k of data and sending it over the (free) internet. For that work OCLC was very well paid, both directly and for the services it offers on top.

OCLC’s new license purports to offer carrots to libraries. But it’s mostly carrots from their own gardens. And it comes at a steep legal price, transforming the legal relationship between librarians and their labor, and making everyone else come begging to Dublin for information about books. OCLC will be asserting a perpetual, retroactive and explicitly viral license over the records–as good as ownership. The OCLC policy will cover many if not most library records in the world, even at the LC and other national libraries, and is designed to spread to derivative works.* All use will be on OCLC terms–which, of course, like any such license, they can change at any time. The terms shut down the Open Library, a giant open-data cataloging project sponsored by the non-profit Internet Archive. And they shut down all commercial use of records–including LibraryThing’s, unless we go through their new owner.

Petitions. If this bothers you as much as it does me, check out the Stop the OCLC Powergrab Petition, put up by Aaron Swartz, Tech Lead at Open Library. Aaron also wrote an excellent blog post about the issue.

If you’re a librarian, check out Elaine Sanchez’s Petition for OCLC to Collaboratively Re-write Policy for Use and Transfer of WorldCat Records.

BTW: Don’t worry too much about LibraryThing. One way or another we’ll get through this. More and more I’m confident either the Policy will change and OCLC will embrace and lead a future of openness and collaboration, or opposition to it will create what OCLC is trying to prevent—a free and open repository of high-quality bibliographic data.

*There are millions of “OCLC-derived” records at the LC. I think I’m going to write my next post trying to figure out what the Policy means for the LC and other federally-funded libraries.

Labels: moose, oclc

Sunday, November 23rd, 2008

LCSH for “Yo mama”

A recent dust-up on AUTOCAT revolved around a librarian tour to Cuba for the “Havana Book Fair.” This “fully escorted” tour involved the opportunity to “get an unprecedented look into issues of freedom of expression directly from Cuban intellectuals, writers, librarians, publishers and curators,” with a rum-and-coke event at a local Committee for the Defense of the Revolution, who, besides keeping files on everyone in the neighborhood, “ensure[s] that detailed electoral information is provided on all candidates, and every vote diligently counted.”*

As you may guess, a number of posters (myself included) criticized the post. Others objected to our criticism, and a small-bore kerfuffle ensued.

It was interjected, with clever use of Library of Congress Subject Headings (LCSH):

“… before this detours into a “Cuba $x Foreign relations $z United States” (and vice-versa) discussion, please remember that Autocat is primarily a discussion group for cataloging, authority work, etc.”

Off-list, I suggested to someone that we could continue the argument entirely in LCSH, suggesting the (invalid) heading:

Cuba, Communist — Propaganda — Aimed at librarians!

Which was met with the (also invalid):

United States — Imperialistic policies — Social aspects


That got me thinking, if LCSH is a language (of sorts), how good is it for that most important role of languages—conveying insults?

The answer is—just great! Although LCSH lacks the term “jerk” or “dumbass” (except “Dumbasses (music group)”), it is still a rich field for insult, innuendo and invective. Consider, for example, hurling the following at an opponent:

Donkeys — Genealogy
Dill weed — Specimen

Sometimes the main headings themselves provide good insults. For example, to accuse someone of verbal diarrhea one need only employ:

Anal language — Case studies**

But it’s useful to take full advantage of the free-floating form subdivisions. To tell someone they had descended to the depths of idiocy, I suggest:

Stupidity — Bathymetric maps

Can anyone come up with the ultimate LCSH put-down?

*The passage goes on to note that “Voting booth and ballot integrity” in this one-party state is “entrusted to primary level students on voting days.” What a neat solution!
**Apparently this heading is only supposed to be used on the Anal people, of Southeast Manipur. Pity.

Labels: humor, LCSH

Thursday, November 20th, 2008

OCLC Policy Re-re-released, now in unfriendly PDF

After OCLC released its new records policy, pulled it back and re-released it, I put together a much-appreciated simple “diff,” using MediaWiki’s history feature. It was easy to do once someone found a cached copy of the original, since both were HTML documents.

Now OCLC has released a third version of the policy, this time in PDF. The new version is harder to manipulate. (Hasn’t anyone at OCLC read Jakob Nielsen?) Adobe PDF and Preview mangle and rearrange the text when it is cut and pasted.

Is there any kind soul out there who wants to whip the text into shape and post it on the wiki page so we can do yet another diff?

Update: After wrangling with the text—sent by various people—it looks like this is going to be very hard to do. The text differs in all sorts of minor formatting ways that throw off the diff. Besides, OCLC will probably just release another version next week—no doubt in JPEG, or carved, like the Behistun Inscription, where only the gods can read it.
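For anyone who wants to try anyway: much of this formatting noise can be stripped out before diffing. What follows is a minimal sketch in Python, not what was actually tried here. The `normalize` and `diff_policies` helpers are hypothetical names, and the sketch assumes the two texts differ only in whitespace, line wrapping, curly quotes, ligatures and words hyphenated across line breaks. Real PDF output may be mangled in other ways.

```python
import difflib
import re
import unicodedata

# Assumed set of characters that PDF copy-and-paste tends to alter.
_PUNCT = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
})

def normalize(text):
    """Return the text as a list of sentences with formatting noise removed."""
    # NFKC folds ligatures ("fi" as one glyph) and non-breaking spaces.
    text = unicodedata.normalize("NFKC", text)
    text = text.translate(_PUNCT)
    # Heuristic: rejoin words hyphenated across a line break ("pur-\npose").
    text = re.sub(r"-\n(?=[a-z])", "", text)
    # Collapse every run of whitespace, including line breaks, to one space.
    text = re.sub(r"\s+", " ", text).strip()
    # Split after sentence-ending punctuation so line wrapping no longer matters.
    return re.split(r"(?<=[.!?]) ", text)

def diff_policies(old, new):
    """Unified diff of two policy texts, compared sentence by sentence."""
    return list(difflib.unified_diff(
        normalize(old), normalize(new),
        fromfile="old", tofile="new", lineterm="", n=0))
```

Calling `diff_policies(old_text, new_text)` on two pasted versions returns only the sentences that actually changed. The hyphen rule is the riskiest part, since it will also join a genuine compound that happens to break at the end of a line.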

Labels: oclc, worldcat

Wednesday, November 19th, 2008

Gluttony, reloaded

On the plane to talk at the Minnesota Library Association conference, I dug into my paper copy of Information Today, and flipped to Steven Cohen’s regular column, “Library Stuff Revisited.”

Steven’s topic this time was ReloadEvery, a Firefox plug-in that allows you to automatically reload a browser page at given intervals. He recounts how he uses ReloadEvery on different services, including keeping multiple company press release pages open all day, refreshing them automatically at fifteen-second intervals. Most remarkable was Steven’s scheme to grab a first-place reservation on Southwest:

“I could have set up the page to reload every second, but I was nervous and didn’t want one tab to freeze on me. So I set up five tabs with the same page and had them each reload every four seconds at different intervals.”

It’s all very clever, but refreshing every second—who said that was okay? As a web developer of a site that gets hurt by more modest refreshaholics, I think Steven and the people who made ReloadEvery need to confront the “All You Can Eat Rule”: Just because it says “All You Can Eat” doesn’t mean you can shovel smoked salmon into your handbag for later, or lie on the drink counter with your mouth under the orange-juice spigot!

Labels: reloadevery

Friday, November 7th, 2008


A propos of Jonathan Rochkind’s suggestion that MARC records contributed to OCLC include their own viral, but freeing, license…

Source: xkcd. Whether it’s it or me, I find xkcd funnier every day.

Labels: Uncategorized

Friday, November 7th, 2008

Open Shelves Classification Update: Looking for Data from Public Libraries

The Build the Open Shelves Classification group continues to work on the top level classification categories. The word is out and people are excited about the direction we’re headed! But we need your help if we want to continue to build this together. We are gathering information in two ways:

1) Group members are searching their public library catalogs by the Dewey numbers associated with our draft list of top level categories and reporting their findings here. This is giving us a good sense of which categories have the biggest representation in public library collections. So far we have good data from large urban public libraries, but we would like more diverse library data. Here is a sample of what we have seen so far:

Please continue to add to the online spreadsheet (you will need a Google account to view or edit) for the next two weeks.

2) As you can see from the chart, the books covered by our current top level categories represent only a fraction of each library’s total collection. We need public librarian volunteers to evaluate which subjects get used the most in their libraries. You can run a report in your ILS (integrated library system) on subject heading circulation statistics. If you don’t know how, ask your systems or technical services librarian. Be anonymous; we do not need to know WHO, just WHAT. Thanks to all who have already begun to do this. You can post your results here.

Once we have this information, we can then evaluate if the working list of top level categories needs to be edited.

In other news, graduate students studying Information and Library Science at the Pratt Institute in New York are working on various cataloging projects involving Open Shelves Classification, such as concept mapping OSC to LibraryThing tags. We look forward to sharing their results here on the blog in a few weeks.

Finally, many libraries around the country are trying to create new, usable, efficient alternatives to Dewey. The Frankfort Public Library District is going Dewey free (with stickers to boot, see image above) using a modified version of BISAC. What do you think? As a collective, I think we can improve upon this (also, let us know what you think about this).

Labels: OSC

Wednesday, November 5th, 2008

OCLC Policy Re-released; Wiki shows changes.

After releasing their new records policy and pulling it back almost immediately, OCLC has released a revised version.

I took the original version and the revised version and put them into the LibraryThing wiki. MediaWiki software has an excellent “compare” function.

By clicking the link below you can see the new policy and the changes that have happened:

Obviously, I am posting both policies as an aid to understanding and commentary. OCLC retains the copyright and if they want me to take down the comparison, I will be only too glad.

Labels: oclc

Sunday, November 2nd, 2008

OCLC Policy Change

Here it is:

No comment, as of now. Frankly, I haven’t even read it. February is a long time away. Long enough to discover we’re okay, make a deal or copy OCLC from scratch using nothing but periwinkle ink and passionate book lovers’ time.

Update: Depressing analysis: Terry’s Worklog. Wow.

Update #2: The non-legal page remains up, but the legalese page was taken down very early this morning:

“We are reconsidering some aspects of the policy. More information will be available in the near future.”

Damn. I wish I had remembered to copy and paste. Does anyone have the original text? (For example, in your browser cache? I browse cache-less, unfortunately.)

Update #3: See Inkdroid pointing out the “viral” nature of the policy. Over a few years the libraries that now get their data from the Library of Congress, bypassing OCLC, will find uninfected records increasingly scarce. They’ll be forced to join OCLC—or do all their own original cataloging.

Update #4: A librarian-blogger managed to take a snapshot before OCLC took it down, here.

Update #4: Does anyone get Publishers Lunch Plus? Apparently it has an article called “WorldCatFight.” I don’t know the terms on forwarding that, but if it’s legal, can someone send me a copy?

It would certainly be good if publishers got into this. In my fantasy, publishers “pull a reverse-OCLC” and require unlicensed distribution of records derived from their data. Publishers want their data out there, not restricted, and since OCLC records often start at publishers, this would shut down OCLC’s data-monopoly plans.

Update #5: The terms kill off the Open Library project completely. Not only do they involve viral terms (terms that OL could never accept), but OCLC libraries are prohibited from participating in anything that “substantially replicates the function, purpose, and/or size of WorldCat, for example for the purpose of providing cataloging services to libraries or other organizations.”

I think that means it kills Talis too.

Update #6: Edward Corrado has an excellent summary of some of the issues.

Update #7: Jonathan Rochkind wrote a good explanation of the difference between an open source viral license, designed to keep things open, and an OCLC viral license, designed to keep them closed. He also suggests a remedy: give OCLC a virus instead, by adding an Open Data license to everything your library catalogs!

Labels: oclc

Tuesday, October 28th, 2008

OCLC deletes personal cataloging?

Something’s going on over at OCLC. And it looks very worrisome.

LibraryThing members who care about library data should gird their loins. Ditto those who support the Open Library project, and other efforts to free library data.

Note: Sorry I can’t give more details yet. I will when I can. So far it’s a mix of messages on AUTOCAT and phone calls I can’t disclose. Also, I’m figuring someone in the library world who has more access to OCLC communication will post about it soon. So far, no posts.

Updates: Will post ‘em here:

Labels: Uncategorized

Friday, October 24th, 2008

New: Recent library reviews widget

Following on the release of LibraryThing for Libraries’ new Reviews Enhancement, I’ve created a widget for libraries to show off their most recent reviews.

These are the three libraries that are live so far.

Recent reviews from High Plains Library District

Recent reviews from Los Gatos Public Library

Recent reviews from Mount Laurel Library

Update: Our Mount Laurel widget is having some trouble with book titles. We’ll fix it soon.

Labels: book reviews, librarything for libraries, widgets