Sunday, February 26th, 2006

2.5 Million tags!

LibraryThing just hit 1.8 million books and 2.5 million tags. Since we’re going to hit 2 million books soon, I’ll talk about the tags today.

As the tags accumulate, they are also generating a lot more value. Tags are mostly useful personally and statistically. Tags are often played up baselessly, as if a few scattered and general tags were of any use to anyone. For statistical purposes you need a LOT of tags, so frequency patterns can emerge and anomalous entries fade into the background. And tags are primarily interesting in concert, not by themselves. Because tags are non-hierarchical and often short, they lack the “context” of something like the Library of Congress subject headings. Other tags can provide that context.

That’s why the “tag similarity” algorithm takes many tags into account, favoring recommendations that match on more than one. Take the messy example of a mid-level book, T. E. Lawrence’s Seven Pillars of Wisdom. What the heck is that? It’s all over the map: literature, WWI, Middle East, Ottoman Empire, Arabia, history, autobiography, memoir, etc. The recommendations try to hit many of these tags at once, with books like Fromkin’s A Peace to End All Peace (WWI, Middle East, history, Ottoman Empire, etc.) and Robert Graves’s Goodbye to All That (literature, memoir, WWI). It’s not perfect (Edward Said’s memoir!), but it’s a hell of a lot better than any single tag could produce.
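
To make the idea concrete, here is a minimal Python sketch of multi-tag matching. It is not LibraryThing’s actual algorithm, just one simple way to favor books that share several tags. The function names, the `min_shared` cutoff, and the single-tag “Hypothetical WWI Atlas” entry are all made up for illustration; the other tag lists come from the example above.

```python
def shared_tag_score(target_tags, candidate_tags):
    """Count how many distinct tags the two books have in common."""
    return len(set(target_tags) & set(candidate_tags))


def recommend(target, library, min_shared=2):
    """Rank other books by shared-tag count, dropping weak matches.

    `min_shared` is an assumed cutoff: a book that matches on only
    one tag is not recommended at all.
    """
    scored = [
        (shared_tag_score(library[target], tags), title)
        for title, tags in library.items()
        if title != target
    ]
    kept = [(score, title) for score, title in scored if score >= min_shared]
    # Highest score first; ties are broken alphabetically by title.
    return [title for score, title in sorted(kept, key=lambda p: (-p[0], p[1]))]


library = {
    "Seven Pillars of Wisdom": {
        "literature", "WWI", "Middle East", "Ottoman Empire",
        "Arabia", "history", "autobiography", "memoir",
    },
    "A Peace to End All Peace": {"WWI", "Middle East", "history", "Ottoman Empire"},
    "Goodbye to All That": {"literature", "memoir", "WWI"},
    # Hypothetical entry, not from the post: matches on a single tag only.
    "Hypothetical WWI Atlas": {"WWI"},
}

recommendations = recommend("Seven Pillars of Wisdom", library)
print(recommendations)
```

With this toy data, the Fromkin book ranks first (four shared tags), Graves second (three), and the single-tag book is dropped. A real system would presumably also weight tags by how often they are applied, which is where the need for a LOT of tags comes in.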

And, most importantly, every book and tag makes the statistics better.

Lastly, I wonder how LibraryThing’s 2.5 million compares. I’m sure Flickr and Delicious have many times that number. But what else is out there? Amazon has encouraged product tagging for about three months, and they have thousands of times the traffic. I wonder how well that’s going?


