Short version. I’ve just completed a major change in the “substructure” of LibraryThing’s data, the “works system” that links different editions together. The system is better and will allow more betterness down the road. It was the reason we were down most of last night. We regret that, but think the change will prove worth it.
Long version—What are “Works?” LibraryThing’s work system brings users together around the books they’ve read, not the peculiarities of publisher, format or even language. Works are created and tended by members, who “combine” editions together into works. Anyone can do it, but the die-hards created a large and active group—Combiners!—to trade tips, debate philosophy, muster effort—and complain about the system!
Combiners is a remarkable community, and one that has gone without a nod from me for some time. I hope these changes, and the prospect of future improvements built on surer footing, encourage them.
Since the beginning I’ve promoted the idea of the “cocktail party” test.* This test decides whether two books belong to the same work by asking whether their readers would, in casual conversation, own up to having read the same book. So, for example, in such a context it wouldn’t matter if you had read a book in its hardcover or paperback edition, or listened to it on CD. If the cute girl with the backless dress mentions she’s fond of The Unbearable Lightness of Being, the edition is immaterial (but see this link). I also suspect that title differences occasioned by marketing considerations, e.g., Harry Potter and the Philosopher’s Stone (UK) vs. Harry Potter and the Sorcerer’s Stone (US), wouldn’t matter. Nor should language itself matter; few would turn a cold shoulder to a Finnish Tolkien fan merely because he read Tolkien in Finnish.**
What’s Changed? The core concept used to be that a work consisted of a discrete set of title-author pairs. We chose title-author to emphasize the loose, verbal nature of the cocktail party test, and because ISBNs are much less perfect than many believe.*** These title-author pairs we called “editions.”
Unfortunately, there are a small number of works that can’t be identified based on title and author alone. This happens particularly in science fiction and graphic novels. (Apparently the Fantasy Hall of Fame currently entombs two distinct works: same title, same authors, but different contents and publisher. Someone should be punished for that.) My bête noire is Cliffs Notes filed in with the works they “interpret.” No appletini for you, “Great Expectations”!
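To make the limitation concrete, here is a minimal sketch of the old model. It is an illustration only, not LibraryThing’s actual code: the names `Work` and `find_work` are hypothetical, and the sample data is made up. A work is just a set of title-author pairs, so two different books catalogued under the same title and author cannot be told apart.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Work:
    """A work under the old model: a set of (title, author) pairs."""
    editions: set[tuple[str, str]] = field(default_factory=set)


def find_work(works: list[Work], title: str, author: str) -> Work | None:
    """Return the first work containing this title-author pair, if any."""
    for work in works:
        if (title, author) in work.editions:
            return work
    return None


# Hypothetical sample data: two spellings combined into one work.
great_expectations = Work({
    ("Great Expectations", "Charles Dickens"),
    ("Great expectations", "Charles Dickens"),
})
works = [great_expectations]

# The novel and a study guide catalogued under the same title and author
# produce the same key, so both lookups land on the same work.
novel = find_work(works, "Great Expectations", "Charles Dickens")
study_guide = find_work(works, "Great Expectations", "Charles Dickens")
same_work = novel is study_guide
```

Because the key holds nothing but title and author, no amount of member effort could pull the study guide out of the novel’s work under this model.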
The system still automatically assigns new editions based on author and title. But I’ve added ISBNs to the mix, so members can see each edition’s ISBN and combine or separate editions accordingly.
- Title-author-ISBN bundles are now distinguished by the smallest details, so you can separate “Hard Times” from “Hard times” from “Hard times” with a period at the end. This has vastly increased the number of editions in the system. (There are now more than 1,200 editions of The Hobbit!) This was mostly a technical decision.
- The original system produced a few “hash collisions,” utterly different books thrown in together unhappily. This has been a long-running defect—and complaint. The new system will allow their separation, although existing ones will need to be separated.
- The Combination and Debris (renamed “Editions”) pages should be faster. Some will start—and stall!—on a message about updating edition information. Once the editions have been calculated, the page will be faster.
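The new arrangement might be sketched roughly as follows. This is a guess at the shape of the thing under stated assumptions, not the site’s implementation: `Edition`, `Catalog`, `assign` and `separate_by_isbn` are hypothetical names, the ISBNs are placeholders, and auto-assignment is simplified here to an exact title-and-author match.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    """A title-author-ISBN bundle, compared exactly (case and punctuation count)."""
    title: str
    author: str
    isbn: str | None = None


class Catalog:
    """Hypothetical store mapping work ids to sets of editions."""

    def __init__(self) -> None:
        self.works: dict[int, set[Edition]] = {}
        self._next_id = 1

    def assign(self, edition: Edition) -> int:
        """Auto-assign by title and author only; the ISBN is ignored here."""
        for work_id, editions in self.works.items():
            if any((e.title, e.author) == (edition.title, edition.author)
                   for e in editions):
                editions.add(edition)
                return work_id
        work_id = self._next_id
        self._next_id += 1
        self.works[work_id] = {edition}
        return work_id

    def separate_by_isbn(self, work_id: int, isbn: str) -> int:
        """Move every edition with this ISBN out into a new work."""
        moved = {e for e in self.works[work_id] if e.isbn == isbn}
        if not moved:
            raise ValueError(f"no edition with ISBN {isbn} in work {work_id}")
        self.works[work_id] -= moved
        new_id = self._next_id
        self._next_id += 1
        self.works[new_id] = moved
        return new_id


catalog = Catalog()
novel = Edition("Great Expectations", "Charles Dickens", "placeholder-isbn-1")
guide = Edition("Great Expectations", "Charles Dickens", "placeholder-isbn-2")

first = catalog.assign(novel)
second = catalog.assign(guide)  # same title and author, so same work at first
split = catalog.separate_by_isbn(first, "placeholder-isbn-2")
```

The point of the design is that the automatic step stays as loose as before, while the ISBN gives members a handle for pulling apart books that title and author alone lump together.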
As mentioned above, the new system was responsible for our extended downtime last night. Between a few mistakes and a database just shy of 27 million books, it took longer than we thought. I hope that the changes prove worthwhile in and of themselves.
Being much better designed, the new system should enable:
- Edition-level pages
- Edition-to-edition and work-to-work relationships
- Member and book matching that takes editions into account
- An end to the “dead languages” exception to the cocktail party test.
- More opportunities for me to discuss the Pop-Up Kama Sutra at library conferences.
I’ve created a Talk thread for members who want to discuss the changes.
*Perhaps wishing I’d get invited to a few more cocktail parties! Speaking of which, are you going to Book Expo America 2008 in Los Angeles? We are.
**Whether you choose to avoid the Finnish Tolkien fan at cocktail parties is, of course, up to you.
***In fact, publishers recycle ISBNs, steal ISBNs, make up ISBNs, print wrong ISBNs, apply ISBNs to large sets of seemingly discrete items and otherwise abuse the system all the time. Most of the time they work in a bookstore context. They aren’t really fit for a project of LibraryThing’s size and scope.