Archive for June, 2006

Wednesday, June 14th, 2006

Introducing thingISBN

UPDATE: thingISBN is now also available in feed format.

Many of you are familiar with OCLC’s xISBN service. Give it an ISBN and it returns a list of “associated” ISBNs from WorldCat. So—xISBN’s canonical example goes—give it an ISBN for one edition of Dune, and it will return a list of ISBNs of other editions, in XML format. This is red meat for mashups. (Speaking of which, did you know about Talis’ Mashing up the Library competition?)

Today I’m releasing “thingISBN,” LibraryThing’s “answer” to xISBN. Under the hood, xISBN is a test of FRBR, a highly-developed, well thought-out way for librarians to model bibliographic relationships. By contrast, thingISBN is based on LibraryThing’s “everyone a librarian” idea of bibliographic modeling. Users “combine” works as they see fit. If they make a mistake, other users can “separate” them. It’s a less nuanced and more chaotic way of doing things, but can yield some useful results.

To use thingISBN, point your browser at a URL like this, replacing the ISBN as appropriate:

To compare xISBN and thingISBN, add &compare=1 to the URL.

thingISBN vs. xISBN.
UPDATE: OCLC has disallowed comparison.
I’ve done some preliminary comparisons between the two services. The results are pretty interesting. For starters, OCLC has much broader ISBN coverage. Its dataset is orders of magnitude larger, and “regular people” just don’t own certain books. Where the data sets overlap, however, LibraryThing can contribute a lot, particularly when it comes to paperbacks and non-US editions.

Examples:

  • 031228884 (Elizabeth Cook, Achilles). Recently-published novel. OCLC and LibraryThing know about two ISBNs. LibraryThing adds two others, a UK hardback and a UK paperback.
  • 0553212583 (Wuthering Heights). OCLC and LibraryThing share 60 editions. OCLC alone knows 266. LibraryThing alone knows 32.
  • 0520071654 (Peter Green, Alexander of Macedon…). OCLC and LibraryThing both know this hardcover ISBN. LibraryThing knows the paperback, but OCLC includes the 1974 first edition.*
  • 0310241448 (Lee Strobel, The Case for a Creator). OCLC and LibraryThing know of one hardcover edition. OCLC knows of no other editions. LibraryThing knows of seven others. Wow.**
  • 0393049841 (Jason Epstein, The Book Business). OCLC and LibraryThing share two ISBNs. OCLC knows one by itself. LibraryThing also knows one by itself, but it’s to Simple Pineapple Crochet. Yes, you read that right. I’m not sure where the error is from, but it’s either a pitfall of the “everyone is a librarian” system, or of LibraryThing’s occasionally ratty data.
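The "share / alone" counts in the examples above are plain set arithmetic. Here is a minimal sketch in Python, assuming you already have one list of ISBNs from each service. The function name and the sample identifiers are made up for illustration and are not real ISBNs or real output from either service.

```python
def compare_isbn_lists(oclc_isbns, librarything_isbns):
    """Split two ISBN lists into shared, OCLC-only and LibraryThing-only sets."""
    oclc = set(oclc_isbns)
    lt = set(librarything_isbns)
    return {
        "shared": oclc & lt,             # known to both services
        "oclc_only": oclc - lt,          # known to OCLC alone
        "librarything_only": lt - oclc,  # known to LibraryThing alone
    }


# Placeholder identifiers (hypothetical), standing in for real ISBNs.
oclc_sample = ["A", "B", "C"]
lt_sample = ["B", "C", "D", "E"]

result = compare_isbn_lists(oclc_sample, lt_sample)
for name, values in result.items():
    print(name, len(values), sorted(values))
```

Using sets also quietly drops duplicate ISBNs within one list, which is probably what you want when counting editions.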

Mashups? I brought out thingISBN in part to provide more grist for Talis’ Mashing up the Library competition. I was careful to make thingISBN’s output follow the conventions of xISBN, so that existing xISBN code could be reused. I’m looking forward to seeing whether anyone does anything with it. (One obvious application would be as an addition to LibX, an open-source Firefox extension that leverages xISBN to help you find things in your library. Here’s an excellent screencast of it at work.)
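For anyone wanting to reuse xISBN code, here is a rough sketch of reading such a response in Python. It assumes the response is a flat XML list of isbn elements inside an idlist wrapper; that layout, the sample text and the function name are assumptions for illustration, so check them against what the service actually returns. The ISBNs in the sample are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body; the element names are an assumption
# and the ISBNs are placeholders, not real editions.
SAMPLE_RESPONSE = """<idlist>
  <isbn>0000000001</isbn>
  <isbn>0000000002</isbn>
  <isbn>000000000X</isbn>
</idlist>"""


def parse_isbn_list(xml_text):
    """Return the text of every <isbn> element, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("isbn") if el.text]


isbns = parse_isbn_list(SAMPLE_RESPONSE)
print(isbns)
```

Because the parser only looks for isbn elements, the same function should work on either service's output as long as both keep that element name.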

As usual, comments, criticisms, bug reports and feature requests are asked for and gratefully received.

The fine print. By using thingISBN you agree to the following terms and conditions:

  • thingISBN is available for non-commercial use only.
  • You cannot hit thingISBN more than once per second.
  • If you’re going to hit thingISBN more than 1,000 times/day, you must notify LibraryThing (we’d love to hear what you’re doing). This is the current policy. If thingISBN turns out to be a success I’ll optimize the code more, put it on my second server and allow it to be hit as hard as people want to hit it.
  • ThingISBN is provided “as is,” without any promises or guarantees. LibraryThing is not responsible for any errors in the data, damages resulting from its use, your teenager’s attitude or the state of the world generally.
  • We reserve the right to change these terms and generally make things up as we go.
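If you are writing a script against thingISBN, the two numeric limits above are easy to respect in code. This is a minimal sketch, not an official client: the class name, its parameters and the error message are hypothetical, and the daily counter is never reset, so a long-running program would need to reset it each day.

```python
import time


class PoliteThrottle:
    """Space requests at least min_interval seconds apart and cap their count."""

    def __init__(self, min_interval=1.0, daily_limit=1000,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self.daily_limit = daily_limit    # requests allowed before asking first
        self._clock = clock
        self._sleep = sleep
        self._last_request = None
        self._count = 0

    def wait_for_turn(self):
        """Block until the next request is allowed, or raise at the cap."""
        if self._count >= self.daily_limit:
            raise RuntimeError("daily limit reached; notify LibraryThing first")
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            remaining = self.min_interval - elapsed
            if remaining > 0:
                self._sleep(remaining)
        self._last_request = self._clock()
        self._count += 1
```

Call wait_for_turn() immediately before each request. The clock and sleep arguments exist only so the class can be tested without real waiting.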

*Scratch that. LibraryThing knows it now too. A user had it, but it wasn’t combined; I went ahead and combined it. Actually, Green changed a lot between editions, but they still qualify as one “work.” (This edition, with another ISBN, may also be the same work, but I’m not sure, so I left it.)
**I started looking around to see if this disparity was true in general of religious books. I think it isn’t, or at least the effect isn’t as striking.

Labels: apis, frbr, thingisbn, works, xisbn

Wednesday, June 7th, 2006

The Long Tail

There’s an article in today’s New York Times, “What Netflix Could Teach Hollywood”, that’s essentially about the long tail of movies.* David Leonhardt writes about The Conversation, a Francis Ford Coppola movie from the 1970s, that,

“… was on its way to the movie graveyard just a few years ago. Since video stores have room for only a few thousands titles, some didn’t carry it, and it was slowly being buried under the ever growing pile of newer films at other stores. It would have been easy a decade ago to imagine a time when few people would ever watch “The Conversation” again.

Then came Netflix. The Internet company with the red envelopes stocks just about all of the 60,000 movies, television shows and how-to videos that are available on DVD (and that aren’t pornography). …

The result is a vast movie meritocracy that gives a film a second or third life simply because—get this—it’s good.”

The long tail is about going deeper than just the latest Hollywood summer blockbusters. Netflix demonstrates that people will, in fact, rent a movie that isn’t prominently displayed at their local video store (what local video store could stock as many DVDs as Netflix?), and that came out as long ago as—gasp—1974.

Similarly, people aren’t just reading the recent best sellers. Go deeper into the list, and you see that there are actually a lot of people who are reading the seemingly “less popular” books.

Yes, the top books on LibraryThing are the six Harry Potter books, followed by The Da Vinci Code. But look beyond the top 10. What about number 150? Margaret Atwood’s The Blind Assassin has 621 members. A whopping 358 members have Tender is the Night (clocking in at number 392). Go farther down the list. Even number 1,000, The Stars My Destination, has over 200 members.

Conversely, check out the “you and no other” on your fun statistics page. The number of seemingly obscure books that other people have in their catalogs is mind-blowing sometimes. Someone else actually has Tender Violence: Domestic Visions in an Age of U.S. Imperialism? And I don’t know them?**

A lot of people on LibraryThing pride themselves on the obscurity of their library. Tastes are broad, and, as it turns out, when we can reach beyond the popular, more recent stuff, we do. So Hollywood blockbusters and NYT bestsellers aside, maybe the mainstream isn’t so mainstream after all.

* The term “the Long Tail” was coined by Chris Anderson, whom, incidentally, Tim and I saw give a talk at BookExpo America a few weeks ago. (We also scored advance copies of his book, The Long Tail: Why the Future of Business Is Selling Less of More, which is coming out later this year.)
**I should. They have a great library. Hi aiross!

Labels: Uncategorized

Monday, June 5th, 2006

Library Mashup Competition

The library vendor Talis just announced a library mashup competition.

It’s a pretty wide-open thing. You can use any source you want–Google, Amazon, OCLC, Z39.50*–and do anything you want, so long as it’s nifty. You don’t even need to work in a library (although the necessity of saying this is troubling!). About the only hard rule is that you need to release it under some sort of copyleft license**. The winner gets £1,000, the runner-up £500. The contest ends August 18.

Best of all, I’m going to be one of the judges. This has a down side—I can’t enter myself. But it will be very fun. The other judges are a pretty august group.

Although I can’t enter it, I WILL do some mashups. More importantly, I’m going to start releasing APIs that others can use to build their mashups. As many of you know, I’m constrained by the Amazon API. Offering an API to the full LibraryThing data set would inevitably involve releasing Amazon API data. So I’m going to have to stick to ISBNs, LibraryThing codes (like the “work” number) and user-generated data, like tags and such. That shouldn’t be a problem, since user data is what LibraryThing is all about.

Talis has set up a discussion area for the contest itself, another for ideas and another for entries. But feel free to talk over here too, particularly as regards what data would be fun to extract from LibraryThing.

*I pushed for them to include that in the suggested materials list. Z39.50 is a TREMENDOUS resource, almost completely ignored because it’s a little wiggly to work with.
**The small print says “winning entrants will need to satisfy the judges as to the spirit and rationale behind their licensing decisions, prior to prize money being made available.” As someone who recently signed a financing deal, I’m getting very wary of small print***. I can tell you that I’m going to push hard to allow any copyleft license to apply.
***Abe managed to slip in an “Andorra” clause, a yearly tribute of wine, four hams and forty loaves of bread. That’s okay, my employment agreement mandates a water cooler filled with Guinness, in invisible ink on the back.

Labels: Uncategorized

Sunday, June 4th, 2006

WineThing?

No, I didn’t build “WineThing.” I did think about it once, shortly after LibraryThing was born. I figured I’d stick to books.*

But somebody did it, and did it rather well. The site is Cork’d (corkd.com). I was originally going to blog about WineLog, which also looks good, but I think Cork’d does it slightly better; it’s certainly more elegant (than LibraryThing too). Judging from the Alexa numbers, Cork’d and WineLog look locked in battle; it’s unclear who will hit the all-important social-software “critical mass.”** There’s also a site called CellarTracker, which appears to hide a lot of functionality beneath the interface of a circa-1999 second-tier ecommerce site.

Cork’d doesn’t do very much yet***, but it does what it does well, and easily. I’m hoping that, as they gain users, they discover the same data richness LibraryThing did. Right now, Cork’d only has manual, user-to-user recommendations. Since I have no friends on the site, I’ll never get any. (Although you can follow what I’m drinking via RSS!) I’m sure, when the data gets rich enough, they’ll be able to generate good algorithmic ones as well.

BUT, wouldn’t it be fun if you could link your LibraryThing and Cork’d accounts? Some very simple linking, and we could have “People who read The Life and Opinions of Tristram Shandy, Gentleman drink... Gewürztraminer!”

Sign up, and tell ’em you came from LibraryThing…

*Incidentally, dozens of people have told me they thought of LibraryThing too. The work is in the doing.
**It’s strange they launched so close together. The same happened with LibraryThing and Reader2. Reader2 actually won, briefly, so while the social-software piling effect is important, functionality can trump it.
***Why can’t I upload the label? Wineries would fill it with content in days.

Labels: Uncategorized

Thursday, June 1st, 2006

Intaglectuals 1: Kevin Kelly

I’ve been meaning to write up my thoughts on what I heard at Book Expo America or listened to recently online—Kevin Kelly, Chris Anderson, David Weinberger and (maybe) Carly Fiorina. This started out as one big compare-and-contrast blob. I’d better split it up.

Kevin Kelly: What Will Happen to Books? As many of you know, Abby and I recently attended Book Expo America, promoting LibraryThing with Abebooks. BEA is a very “miscellaneous” affair—embracing everyone from authors to printers, agents to librarians.

If there was a unifying meme it was the need to react to Kevin Kelly’s just-published “manifesto” “What will happen to books?” (New York Times Magazine, May 14). The general feeling was “This guy’s a nut,” with an undertone of anxiety—What if he’s NOT a nut? What if I just don’t “get” it? What if I’m a dinosaur?

I generally find myself on the “left” of these issues. I think things have happened or are happening now—the web, Google, blogs, open source, book scanning, wikis, tagging, mashups—with ramifications for intellectual life in general and book publishing in particular. I even think—don’t laugh—LibraryThing has a tiny part to play in these changes.

So it’s odd to find someone to the left of me. That Kevin Kelly guy’s a nut! The article fairly bristles with overreaching, but I’ll single out a quote that makes me embarrassed for LibraryThing:

“The link and the tag may be two of the most important inventions of the last 50 years.”

The link, okay—particularly if link is metonymous with the internet in general—but the TAG?!

It’s too early to tell, but I’d be hesitant to add even something broader, like “user generated data (and metadata)” to the top 100 inventions of the last half-century. I mean, what do you bump? Genetic engineering? The Pill? The satellite? The one-click patent?

Am I a dinosaur?

*Of course, although books and tags were central to his article, he didn’t mention LibraryThing. That’s life.

Labels: Uncategorized