Archive for February, 2007

Tuesday, February 27th, 2007

Many Eyes does the LibraryThing

Many Eyes, a very shiny new visualization site,* is featuring a visualization of LibraryThing’s top 50 books: “Harry Potter is Freaking Popular.” Yes he is.

It might be interesting to chart other LibraryThing data in Many Eyes. I’ve only scratched the surface of it, but it looks quite powerful.

*In alpha, which is the new beta.

Hat tip: David Weinberger.

Labels: 1

Tuesday, February 27th, 2007

Author disambiguation notices

I’ve added the ability to add and edit author disambiguation text. This isn’t the “real” solution, which is still coming, but if you have the urge to clarify the difference between Steve Martin the author of Shopgirl and Cruel Shoes and Steve Martin the author of Britain and the Slave Trade, go ahead. It may help someone out (“This isn’t funny at all!”) and it will help us later when we have real disambiguation pages.

Steve Martin doesn’t have one yet, but Christopher Locke does.

Results show up in the Helpers log.

Labels: 1

Tuesday, February 27th, 2007

SocialCatalogers: For people who make social cataloging applications

Introducing SocialCatalogers, a Ning-based social network for people who make social cataloging sites. No, this is not a joke. It’s funny, but it’s not a joke.

If you make social cataloging sites, or have a deep interest in them, join up here: http://socialcatalogers.ning.com.

Social cataloging has exploded. Today LibraryThing’s list of competitors (very broadly defined) hit forty.* That’s not counting the dozens or even hundreds of sites in other niches: movies, games, comics, programs, wine, beer, recipes. It’s not counting the swap sites, which catalog as a means to something else. Or the list sites like 43Things and Wordie, which catalog intangibles.** Or projects like John Blyberg’s “Social OPAC,” which bolts social cataloging onto “unsocial cataloging” (a service LibraryThing will be offering soon too).

For a while the social cataloging social network was me. I started LibraryThing by trying to get Bibliophil to join forces with me—they would provide the social, I’d provide the cataloging. No dice. Since then I’ve emailed or met with half the social cataloging developers out there, looking for synergies or just to talk shop. Sometimes something came out of it; LibraryThing has “also on” integration with a few, like Cork’d, “LibraryThing for wine.” (Some day, if we have enough shared users, LibraryThing can recommend books based on the wines you drink!) There should be more of that.

The time has come for a social cataloging watering hole. We’re an industry now, or a dwarf industry anyway. Some of us compete, but that doesn’t prevent automakers from getting together. We too can conspire to fix prices! Seriously, we ought to have some things to talk about. At the very least we can keep an eye on the competition.

I made the social network on Ning, which relaunched today. Ning*** is a “social network maker,” started and mostly funded by Marc Andreessen. I wasn’t that impressed with it until the relaunch. It’s really something now. I was able to create a basic social network in about ten minutes. It’s not what I would have designed, but I would have taken a month to do it, at least. 60% in ten minutes beats 100% in a month every time.

Also, by getting in early, SocialCatalogers hopes to become the dominant social network for people who make social cataloging applications. Take that, CatalogingSocial.com, SocietyofSocialCatalogers.com, SocialCatalogingThing.com, ThingSocialCatalogers.com, SocialCatalogersList.com, SocialCatalogersster.com, Joptwix, Flipto, Gropo and Fhtagn****.

*I found StashMatic, which is similar to Squirl and iTaggit. (Squirl is my pick, and I’m not just saying that because half the development team now works for LibraryThing.) And I found JunkLog, which brings minimalism in social cataloging to a new level. That’s not really a knock. It’s kind of cool to strip it down. I can’t tell if it’s developing or defunct.
**Early on in LibraryThing I was at my parents’ house for the weekend, and my dad came into my room at 5am, fresh from bed. He had an idea he was dying to tell me about: LibraryThing, but for people! Instead of cataloging your books, you list your friends. I let him down easy. (Still, a more catalogy social network would be an interesting project.)
***Mostly because it got so much press but has LibraryThing-level traffic. Then again, I’m Dan Quayle to Andreessen’s John Kennedy. I expect the relaunch to kick Ning into the clouds.
****As every lover of H. P. Lovecraft knows, Fhtagn comes from Ph’nglui mglw’nafh Cthulhu R’lyeh wgah’nagl fhtagn! (“In his house at R’lyeh dead Cthulhu lies dreaming”). Fhtagn.com, .net, .org, .de and .edu are taken. However, phngluimglwnafhcthulhurlyehwgahnaglfhtagn.com is still available, even if phngluimglwnafhcthulhurlyehwgahnaglfhtagn.net is not.

Labels: Uncategorized

Monday, February 26th, 2007

Compare your library with LibraryThing

LibraryThing’s gone feed-crazy! Check out Thingology for info on a new feed for comparing a library—a library library—with LibraryThing.

Six posts in 24 hours. Stop me before I blog again!

Labels: 1

Monday, February 26th, 2007

New feed: Compare your library with LibraryThing

Over on Next Generation Catalogs for Libraries, NCSU’s Emily Lynema asked me:

“Do you have any idea of the coverage of non-fiction, research materials in LT? Have you done any projects to look at overlap with a research institution (or with WorldCat)?”

No, we haven’t. And I’m dying to find out, both for academic and non-academic libraries.

So I put together a feed of all of LibraryThing’s unique ISBNs. With a little work, library programmers should be able to compare them against their holdings.

If you’re not up to the task, but still want to find out how LibraryThing compares to your library, you can send me a file with ISBNs—just ISBNs or a more detailed dump—and I’ll do the comparison.

See our Feeds and APIs page for the file, AllLibraryThingISBNs.xml.gz.

Complications and opportunities:

  • I included only valid ISBNs.
  • It’s a week or two old.
  • About 20% of LibraryThing books have invalid or no ISBN. Many of these have LCCNs. I suspect a high percentage are library-ish books.
  • I have turned all ISBN-13s in 978 format into ISBN-10s. There are a few bogus ones too, including the valid but numerically absurd 0000000000. (Bowker should auction that one off!)
  • There can be little doubt that LibraryThing is stronger in paperbacks and weaker in the formats libraries collect. It would therefore be very useful to run all ISBNs through OCLC’s xISBN service*. (By definition, they’re not going to be improved by running them through xISBN’s chief competitor, er, alternative service provider, thingISBN.) Unfortunately, I can’t run them through xISBN on my own.
  • The feed is available for non-commercial use only. That basically means libraries and hobbyists. Other use is expressly prohibited.
  • I am guessing the overlap won’t always be that impressive as a percentage. But these are the books people think enough of to own. They’re going to move more than other library books.
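For library programmers who want to try the comparison, here is a minimal Python sketch of the normalization and overlap steps. It assumes you have already pulled the ISBNs out of the feed and out of your own holdings as plain strings; the function names are hypothetical, and the feed's XML layout is not described here, so no parsing of it is shown. It mirrors the list above by folding 978-prefixed ISBN-13s into ISBN-10s and dropping anything invalid.

```python
from typing import Iterable, Optional, Set, Tuple


def isbn10_check_digit(first9: str) -> str:
    """Compute the ISBN-10 check digit for a 9-digit prefix."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_is_valid(isbn13: str) -> bool:
    """Validate a 13-digit ISBN using the alternating 1,3 weights."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(isbn13))
    return total % 10 == 0


def normalize_isbn(raw: str) -> Optional[str]:
    """Return a valid ISBN-10, or None if the input cannot be used."""
    s = "".join(c for c in raw.upper() if c.isdigit() or c == "X")
    if len(s) == 13 and s.isdigit() and s.startswith("978"):
        if not isbn13_is_valid(s):
            return None
        core = s[3:12]  # drop the 978 prefix and the ISBN-13 check digit
        return core + isbn10_check_digit(core)
    if len(s) == 10 and s[:9].isdigit() and isbn10_check_digit(s[:9]) == s[9]:
        return s
    return None


def normalized_set(isbns: Iterable[str]) -> Set[str]:
    """Normalize every ISBN and keep only the valid, unique ones."""
    return {n for n in map(normalize_isbn, isbns) if n is not None}


def overlap(feed_isbns: Iterable[str], holdings: Iterable[str]) -> Tuple[int, int]:
    """Return (ISBNs held by both, valid ISBNs in the holdings)."""
    held = normalized_set(holdings)
    return len(normalized_set(feed_isbns) & held), len(held)
```

Dividing the first number by the second gives the share of your ISBN-bearing holdings that also appear in the feed. Running both lists through an edition-grouping service first would raise that share, since this sketch only matches exact ISBNs.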

I’m looking forward to what people find out!

*Which is moving, but will not break.

Labels: Uncategorized

Monday, February 26th, 2007

New feed: Wikipedia citations

I parsed the English Wikipedia looking for ISBNs and came up with a public feed of ISBNs to articles.

Over on the LibraryThing blog, I posted about it. LibraryThing is now showing the results on work pages, but others should feel free to use the feed for their own ends. (Cause trouble. Put Wikipedia in your OPAC!)

Enjoy.

Labels: Uncategorized

Monday, February 26th, 2007

O’Reilly Radar on tagging

The O’Reilly Radar Blog does a nice post on my When tags work and when they don’t: Amazon and LibraryThing.

O’Reilly blogger Brady Forrest notes that much of what I said echoed Joshua Schachter of Del.icio.us:

“You have to understand the selfish user – user #1 has to find the system useful or you won’t get user #2. Systems that only become useful when lots of people are using them usually fail, because there’s no incentive for people to contribute themselves.”

So now would be a good time for me to say I’m not claiming my insights are all my own. Most of this stuff has been in the air for a while. While I’d never seen that piece by Schachter, I’ve seen similar statements by him and others. (And there’s a good short panel taped with him and David Weinberger.) I’d love to talk tags with these guys. No doubt like Wikipedia founder Jimmy Wales, Schachter’s so swamped he’s not even answering Bono’s email (source: net@night).*

A few corrections:

  • AllConsuming is already owned by Amazon. That is, Amazon now owns two of LibraryThing’s competitors!
  • Shelfari allows tagging. Honesty hurts.
  • As good as “CEO” sounds, I’m only the president. LibraryThing is an LLC.

Labels: Uncategorized

Monday, February 26th, 2007

Wikipedia citations, with feed

Update: Changed feed URL.

I’ve added a cool new feature, building on some work by library programmer Lars Aronsson: Wikipedia citations on all work pages. That is, work pages now list all the Wikipedia articles that cite the work. The data is also available in feed form.

Here’s how it goes. At the top of J. F. C. Fuller’s A Military History of the Western World it lists how many citations, with a link:

And, down below, it shows all the articles:

How I did it. Basically, I did a complete run through the Wikipedia dump files (source), parsing out anything that looked like an ISBN and checking whether it is one. It’s pretty easy. So it sees:

Fuller, J.F.C. A Military History of the Western World. Three Volumes. New York: Da Capo Press, Inc., 1987 and 1988. — v. 1. From the earliest times to the Battle of Lepanto; ISBN 0-306-80304-6: 255, 266, 269, 270, 273 (Trajan, Roman Emperor).

and gets the ISBN. I’ve started in on the harder problem, parsing books without ISBNs, like:

Bowersock, G.W. Roman Arabia, Harvard University Press, 1983.

It’s not actually that hard. But it’s fiddly. And it’s one of those problems where each additional percent of accuracy costs 50% more effort.
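The easy half (spot something ISBN-shaped, then check it) can be sketched in a few lines of Python. This is an illustration rather than the script that produced the feed: the regular expression and function names are assumptions, and it handles ISBN-10s only. The check itself is the standard ISBN-10 rule, where the weighted digit sum must be divisible by 11.

```python
import re

# Ten digits (the last may be X), optionally separated by hyphens or spaces.
ISBN10_CANDIDATE = re.compile(r"\b(?:\d[- ]?){9}[\dXx]\b")


def is_valid_isbn10(isbn: str) -> bool:
    """Apply the ISBN-10 checksum: weights 10 down to 1, sum divisible by 11."""
    if not re.fullmatch(r"\d{9}[\dX]", isbn):
        return False
    values = [int(c) for c in isbn[:9]]
    values.append(10 if isbn[9] == "X" else int(isbn[9]))
    return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0


def find_isbns(text: str) -> list:
    """Return every valid ISBN-10 found in the text, without separators."""
    found = []
    for match in ISBN10_CANDIDATE.finditer(text):
        isbn = re.sub(r"[- ]", "", match.group()).upper()
        if is_valid_isbn10(isbn):
            found.append(isbn)
    return found


citation = (
    "Fuller, J.F.C. A Military History of the Western World. "
    "New York: Da Capo Press, Inc., 1987 and 1988. "
    "ISBN 0-306-80304-6: 255, 266, 269, 270, 273"
)
print(find_isbns(citation))
```

The years and page numbers in the citation never reach the checksum because they are too short to match the pattern. Longer digit runs that happen to be ten digits long are where the checksum earns its keep.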

What are the most cited books? The most cited book on Wikipedia is… The Official Pokemon Handbook. Surprised? Don’t be. In fact, eighteen of the top twenty most-cited works are Pokemon books. It boggles the mind. Somebody, or a bunch of somebodies, went ISBN-happy on all the Pokemon entries. Fortunately, the existence of so many citations to Pokemon does not impair the quality of the rest. It’s just… Wikipedia. There’s a decidedly quirky character to many of the other winners, testimony to some serious passions. Number 28, with 177 citations, is Richard Grimmett’s Birds of India, Pakistan, Nepal, Bangladesh, Bhutan, Sri Lanka and the Maldives. I think this effect would be diminished a lot if non-ISBN books were added.

Where did this come from? I owe the idea to Lars Aronsson, who came up with a simple script and ran it against the Wikipedia dumps and posted the results on Web4Lib back in September. I wrote him soon after to see if he was going to provide a public data feed, or if he minded if I did. He did not. His results differed a bit from mine. I’ll be in touch with him to square the differences.

Unfortunately, the Wikipedia data is not updated as often as one might like. The most recent is from November of last year. I’ll keep an eye on the download page, and reparse the data when a new dump becomes available.

What’s this about a feed? We’re big fans of openness. And it’s Wikipedia data anyway. So we’ve made a feed of it. You can get it here:

http://www.librarything.com/feeds/WikipediaCitations.xml.gz

UPDATE: I changed the URL and gzipped it. Needless to say, I’m not putting any restrictions on this, but if you do something cool, I’d love to hear about it.

As usual, tell me what you think.

*We’ve seriously considered open-sourcing LibraryThing. But given the state of the code, it would be, as Nabokov said of rough drafts, like passing around samples of our sputum. We may open-source pieces of the code: the pieces we’re happiest about.
**LibraryThing is in the odd position of having almost as much bot traffic as we have person traffic. Google loves us. Guys, you love us too much!

Labels: 1

Monday, February 26th, 2007

Introducing the Helpers log

Update: Author links added. See below.

I’ve added a new page, the Helpers log, that tracks the various ways users help LibraryThing and each other: work, author and tag combinations, author pictures and “author nevers.” (John will add author links tomorrow.) The new page will make it easier for eagle-eyed Thingamabrarians to watch over what’s going on with these critical activities, and smite miscreants.

By the way, did you know we are averaging 2,000 work-combination actions per day? Per day, folks! That’s not even works combined, which is higher since a combination will have at least two and as many as twenty. It boggles the mind.

This isn’t a small thing. You guys have up-ended the world of book data. And we’ve only just begun.

Update: Author links added. Unfortunately, we weren’t storing the right data for author links. So it’s only showing ones added since we fixed the system. It also means we don’t know who added links before, not exactly anyhow. Again, apologies.

Labels: 1

Friday, February 23rd, 2007

Not the Ninja!

I’m loath to take the last post, When tags work and when they don’t: Amazon and LibraryThing, off the top. It got a *lot* of attention*, and I owe the commenters a follow-up.

The Shifted Librarian found this YouTube video “put together by Steven Reed’s students at Wilmington High School.”

It’s a fun video, no question. It’s an *amazing* demonstration of what kids can do these days. My high school had the best Super8 program in Massachusetts, and this level of professionalism would have been way past our capabilities. The book-throwing is great. The editing is quick and professional. The kids get an A. They rock.

But I can’t leave it at that. The kids are rock stars, but the message is all wrong—and it’s wrong in a very telling way.

The situation is completely false. I don’t mean the ninja—they’re increasingly common in libraries of all sizes—but the contest and its results.

Type Capital of Russia into Google and you get this:

You don’t even need to GO to a page—the answers are in the page titles themselves. Face it, the Web is *great* for this sort of thing. You’re not going to “defend” books by claiming they’re better for looking up trivial facts. They’re not. Breathe deep and repeat after me: They’re worse.**

The second false idea is that libraries and the web are rivals, two competing ways to get the same thing (which is mostly factoids). This is all too often how popular culture sees libraries, and it’s a disaster. If libraries are just low-tech search engines, they are bad ones. They should be defunded and closed.***

I’m not going to launch into a defense of libraries and books. Of course I love them. I started LibraryThing because I loved them so much. But I don’t love them because I hate computers, or because books are better than computers.**** I don’t see them as rivals. The web has supplanted a few things that books used to do, but not the important ones. And libraries can do things with computers they are only just starting to explore.

People who love books need to fight against these ideas. They’re a trap. They’re wrong, and they’re very dangerous to the things we love.*****

Yeah, I know, lighten up, Tim!

*Alexa is under the impression LibraryThing’s traffic doubled the day the post hit. That’s total nonsense and a great proof of Alexa’s failings. It makes me wonder how much they rely on new link creation, not traffic.
**Where did that guy find the books? Does he have the shelving memorized, or did he consult an OPAC to find Countries of the World?
***And don’t tell me it’s about not everyone having a computer. If so, libraries should be just computer centers.
****For starters, books are worse at email, worse at social networking, and they are hands-down a lousy way to blunder upon shocking new types of pornography.
*****Can anyone help me find a quote? I think it was from a random sci-fi movie or show, taking place in the near future. The quote was something like “Would you have wanted to shut down the internet just to keep the libraries open?” Don’t even try to Google it. (And I recommend not going to the library either.)

Update: This ninja movie has a good message.

Labels: Uncategorized

Thursday, February 22nd, 2007

Author links


As many have noticed, you can now add links to author pages. It’s part of an ongoing effort to give members more control over the site.

We’re still breaking in a new development environment and a related system (Subversion, for the tech-curious) of moving new stuff from development into production. As a result, the author links feature was launched a bit prematurely. That turned out to be not such a bad thing—a bunch of people immediately jumped in and started suggesting improvements, and the feature, minor though it is, was completed faster and better than it would have been otherwise*. Many thanks to everyone who helped troubleshoot, and to everyone who has contributed links.

One thing you’ll notice is that most authors already have a link to their Wikipedia pages, some of which say “unconfirmed” in parentheses. This is a side effect of a script we wrote to go through all the page titles on Wikipedia, match them against all the authors in LibraryThing, and create links. Which works great 90% of the time, but it turns out there are a lot of people in the world with the same name. To us, Alexander Robertson is an author. Our Wikipedia script, though, thought he was a British cop. To rectify this, we had the script say “unconfirmed” next to every author link that hadn’t yet been verified by a human being. So, if you want to be that human being, please check unconfirmed Wikipedia links when you come across them, and either confirm or edit them, as necessary (both options are available by clicking ‘edit’ in the links header).
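The matching step described above is simple enough to sketch. This is a hypothetical Python illustration, not the script we ran: the function names, the exact-name matching rule and the link structure are all assumptions. It shows why every automatic link starts out unconfirmed, since a name match alone cannot tell an author from a namesake.

```python
def link_authors(authors, wikipedia_titles):
    """Link each author to a Wikipedia page with the same name, unconfirmed."""
    titles = {title.casefold(): title for title in wikipedia_titles}
    links = {}
    for author in authors:
        title = titles.get(author.casefold())
        if title is not None:
            # Same name, but possibly a different person: a human must verify.
            links[author] = {"title": title, "confirmed": False}
    return links


def confirm(links, author, corrected_title=None):
    """Record a human check, optionally pointing the link at a better page."""
    if corrected_title is not None:
        links[author]["title"] = corrected_title
    links[author]["confirmed"] = True


def label(link):
    """Render the link text the way an author page might show it."""
    suffix = "" if link["confirmed"] else " (unconfirmed)"
    return "Wikipedia: " + link["title"] + suffix


links = link_authors(
    ["Alexander Robertson", "Nobody Inparticular"],
    ["Alexander Robertson", "Steve Martin"],
)
print(label(links["Alexander Robertson"]))
```

Authors with no same-named page simply get no link, which leaves room for members to add one by hand.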

Adding and confirming links turns out to be quite addictive. I’ve been working through the list of Nobel Prize winners, adding links to the author page of every winner, and reading half the bios in the process. If anyone can suggest other good link sources, please do so in the comments; it would be cool to have a somewhat organized effort to enrich the pages.

* I didn’t actually have anything to add, but I feel like I should throw in a footnote or two. Seems to be LT style.

Labels: 1

Tuesday, February 20th, 2007

Tagging: LibraryThing and Amazon

I just posted a very long examination of tagging on Amazon and LibraryThing, and what it means over on LibraryThing’s “ideas” blog, Thingology. I’m hoping it gets noticed. Although quite imperfect, I think it’s the first time the failings of “commercial” tagging have been brought to light, and their implications thought through.

Labels: 1

Tuesday, February 20th, 2007

When tags work and when they don’t: Amazon and LibraryThing

This is an extensive post, revealing the results of a statistical comparison between Amazon and LibraryThing tags, and exploring why tagging has turned out relatively poorly for Amazon. I end by making concrete recommendations for ecommerce sites interested in making tagging work.

Both LibraryThing and Amazon allow users to tag books. But with a tiny fraction of Amazon’s traffic, LibraryThing appears to have accumulated *ten times* as many book tags as Amazon—13 million tags on LibraryThing to about 1.3 million on Amazon. (See below for the method I used to find this out.)

Something is going on here—something with broad implications for tagging, classification and “Web 2.0” commerce. There are a couple of lessons, but the most important is this: Tagging works well when people tag “their” stuff, but it fails when they’re asked to do it to “someone else’s” stuff. You can’t get your customers to organize your products, unless you give them a very good incentive. We all make our beds, but nobody volunteers to fluff pillows at the local Sheraton.

A tale of two tagging sites.

LibraryThing began on August 30, 2005. From the start, we allowed members to tag their books. We showed that people could embrace book tagging, much as they had photo and website tagging. But LibraryThing was a marginal player.

Three months later, Amazon unveiled its tagging feature. This was a big deal in certain quarters. To many, Amazon’s move signalled that tagging had “arrived.” As CNet blogger Daniel Terdiman wrote:

“[This] may well prove to be the most visible example of a company incorporating tags as a way to bring order to information. Outfits like Flickr are big and have tremendous followings, but nothing compared to Amazon’s.”

Amazon’s size was key. With something like 60 million registered customers, and one of the highest traffic sites out there, tagging at Amazon must have seemed like a sure bet. Its visitors were a firehose. Point them at tagging and KABOOM!

Amazon’s tagging was quick and easy—but would it work?

It didn’t work out that way. Amazon visitors have not taken to tagging Amazon’s books in significant numbers. With thousands of times the traffic, Amazon produced a tenth as many tags as LibraryThing. What’s going on?

In fairness, Amazon didn’t give tagging a lot of prominence. Tags were stuck in the middle of their ever-lengthening book page—one section for adding your own tags, another for showing others’ tags. They didn’t push them very hard.

It’s likely Amazon could have done better. A higher profile could have increased familiarity and comfort with the feature. Some user-interface tweaks could have enhanced its appeal. Maybe Amazon will make changes, and Amazon tags will get some traction.

But there’s a general message in this: If Amazon with its unsurpassed traffic is having trouble, can other ecommerce sites hope to make tagging work?

Numbers matter

Amazon’s shortfall matters. To do anything useful with tags, you need numbers. With only a few tags, you can’t conclude much. The tags could just be “noise.”

A web of meaning: LibraryThing’s tag cloud for Guns, Germs and Steel.

Take one example: LibraryThing users have applied over 3,900 tags to Jared Diamond’s Guns, Germs and Steel, including “apples,” “office” and “quite boring.” With just a few tags, it might be thought a dessert cookbook, a business book or, worst of all, a boring one. But these are all single-instance tags. With a larger number of tags, clear patterns emerge, with high-level descriptors like “history” (755 times) and “anthropology” (293 times) standing out clearly against the noise. Even lower-frequency tags, like “social evolution” (25 times) and “pulitzer prize” (20 times) can be trusted as relevant.

Large numbers are particularly important when looking for best examples for a given tag. Go by numbers alone and you just get what’s popular. By the numbers Guns, Germs and Steel, tagged “evolution” 39 times, is the number ten book on evolution. That’s crazy. By looking at “tag share,” LibraryThing can understand that Ernst Mayr’s What Evolution Is is a better choice. Although tagged “evolution” only 25 times, those constitute a much larger percentage of its tags. (See the LibraryThing tag page for “evolution.”)
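To make “tag share” concrete, here is a small Python sketch comparing the two rankings. The counts of 39 and 25 and the roughly 3,900 total for Guns, Germs and Steel come from the text; the total for What Evolution Is is a made-up placeholder, and dividing uses by total tags is the simplest reading of the idea rather than LibraryThing’s actual formula.

```python
from collections import namedtuple

Book = namedtuple("Book", "title tag_count total_tags")


def tag_share(book):
    """Fraction of all of a book's tags that are the tag in question."""
    return book.tag_count / book.total_tags


books = [
    Book("Guns, Germs and Steel", tag_count=39, total_tags=3900),
    # Hypothetical total: the post gives only the 25 "evolution" uses.
    Book("What Evolution Is", tag_count=25, total_tags=100),
]

by_count = sorted(books, key=lambda b: b.tag_count, reverse=True)
by_share = sorted(books, key=tag_share, reverse=True)

print("By raw count:", [b.title for b in by_count])
print("By tag share:", [b.title for b in by_share])
```

Raw counts reward popularity, while share rewards books that are mostly about the topic. A real ranking would probably blend the two so that a book with a single tag does not score 100%.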

Critical mass is important, even if we can’t pinpoint the line. Ten tags are never enough; a thousand almost always is. Unfortunately, Amazon’s low numbers translate into a broader failure to reach critical mass. With ten times as many tags overall, LibraryThing has fifteen times as many books with 100 tags, and 35 times as many with over 200 tags.

ISBN tag distribution for A Farewell to Arms. Doubles as an example of my Excel-fu.

The “problem of small numbers” is compounded by Amazon’s failure to aggregate tags across editions. In Amazon, the tags for the various paperback, hardback, British, French and German editions of a work are all in separate “buckets.” LibraryThing’s unique user-generated “works” concept combines editions and their data, compounding tag statistics. Thus, Amazon’s top edition of A Farewell to Arms has 28 tags, where LibraryThing’s has 716. But when all of LibraryThing’s editions are combined together under a work, LibraryThing has 1,914 tags, 68 times as many! A Farewell to Arms is a very well-known book, but Amazon’s 28 tags can’t mean much. With 1,914 tags, LibraryThing has a truly extensive “web of meaning,” created by its members. You can do a lot more when the data is so rich.
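Combining editions under a work amounts to summing tag counts across ISBNs. The Python sketch below assumes a hypothetical edition-to-work mapping (on LibraryThing that mapping comes from members combining editions); the identifiers and counts are placeholders, not real data.

```python
from collections import Counter, defaultdict


def combine_by_work(edition_tags, edition_to_work):
    """Sum per-edition tag counts into one tag cloud per work."""
    works = defaultdict(Counter)
    for edition, tags in edition_tags.items():
        # An edition nobody has combined yet stands as its own work.
        work = edition_to_work.get(edition, edition)
        works[work].update(tags)
    return dict(works)


# Placeholder editions and counts, for illustration only.
edition_tags = {
    "paperback": {"fiction": 3, "wwi": 1},
    "hardback": {"fiction": 2},
    "unrelated": {"poetry": 1},
}
edition_to_work = {"paperback": "farewell", "hardback": "farewell"}

combined = combine_by_work(edition_tags, edition_to_work)
print(combined["farewell"].most_common())
```

Kept in separate buckets, each edition here has a thin cloud; merged, the shared tags reinforce each other, which is the compounding effect described above.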

Why Ecommerce Tagging Fails

Amazon’s tagging suffers a failure of incentive. The causes are multiple:

  • Tags work best when they’re about memory, so tagging makes the most sense when you have a lot of something to remember. On LibraryThing, members with under 50 books seldom tag, but users with 200 or more usually do. When you get right down to it, few of us need to remember 200 books on Amazon. For most of us, the “wishlist” feature is good enough. We don’t need to sub-segment out the “anthropology” books.
  • When you tag on LibraryThing, you’re putting your library in order. The pleasure and use is not unlike reshelving your books the way you want them, except that tags can draw together books that must otherwise reside separately on the shelves. And tagging on LibraryThing is connected to a social system: tag something “anthropology” and you’re connected to all the other anthropology buffs out there.
  • Amazon is a store, not a personal library or even a club. Organizing its data is as fun as straightening items at the supermarket. It’s not your stuff and it’s not your job.
  • Amazon underplays the social. Tagging really kicks into high gear when the personal blooms into the social, when organizing your web pages or your books turns into an hours-long exploration of others’ web pages and books. But Amazon doesn’t want you to hang out—they want you to buy! Tags on book pages do not list their taggers. You need to click around a lot before the tags turn into people. (The failure is particularly surprising in light of Amazon’s clear grasp of social software. Amazon got “social” years before it was trendy. What are reviews and Listmania but social sharing and user-generated content?)
  • Users don’t “own” their tags. There is no way to export them. Considering how central APIs are to Amazon, and to its success, this comes as a surprise. (I’m guessing they’ll add this eventually.)

The problem of opinion tags

Some of the tags from Ann Coulter’s Treason. But what is it about? (Compare with LibraryThing’s page.)

The limited utility of tagging on Amazon produces an unintended consequence: a surfeit of “opinion tags.” So, Daniel Silva’s The Unlikely Spy gets “wow what a book” and Nick Hornby’s High Fidelity gets “good” and “good book.” Not infrequently, opinion outnumbers other types of tags. Five of the seven tags applied to Bette Greene’s Summer of My German Soldier are opinion tags, including “aweful” (sic), “obnoxious,” “pathetic,” “stupid” and “wonderful story.”

The takeover is total with political books. Ann Coulter’s Treason gets a lot of tags like “craptacular,” “evil” and “brain dead.” Coulter’s tag-defenders weigh in with “you won’t disprove the facts,” “you can’t disprove the facts,” “no one has proven this book wrong” and “try and disprove this book.” (Well, I guess that settles it!) Finally, Coulter has also received “dildo” (elsewhere applied mostly to Bill O’Reilly books*), “vibrator,” “lunesta” and “xanax.” It seems the naughty teenagers and the pharmaceutical spammers have discovered Amazon tags!**

Tag-spam on Amazon

Amazon’s all-items tag cloud shows the impact of partisan tagging. After “DVD,” “Music” and “fiction,” the largest tag is “Defectivebydesign,” applied by a small, pitchfork-wielding mob of Microsoft DRM haters.***

Ultimately, I don’t care about the commercial side of things, but “opinion tagging” in a low-numbers environment holds commercial risks. Summer of My German Soldier is actually a pretty good book (I hear). Although Amazon won’t let me confirm it, one suspects all five negative tags come from one user. Is it fair to let one anonymous reader shape a book’s tag cloud so completely?

How to make Ecommerce Tagging Work

Big suggestions:

  • Figure out why your customers would want to tag your stuff. Don’t fool yourself.
  • Make tagging as easy as possible. (Amazon’s are quite easy to add, although registration is a pain.)
  • Understand that commercial tagging can turn people off. Avoid crass commercialism. Respect your taggers—these people are helping you out!
  • Make taggers feel like it’s “their” thing. Encourage users to give out their tag URLs—people love to show off—and let them export their tags any way they want.
  • Keep tagging social. Stop selling and start connecting. If you connect people up right, the selling will follow. Think Tupperware!
  • Consider whether a non-commerce site has the data you need. Back when LibraryThing had a million tags, Amazon could have bought our data for the price of a cup of coffee. Now that we’re big and important and have three employees, that’ll be THREE cups of coffee, buster!

Small suggestions:

  • Put methods in place to fight spamming and tag-bombing. LibraryThing does this by considering both the number of times a tag has been applied and the number of users who use it. A single angry user can’t make a tag really big on the tag cloud.
  • Have logical URLs. Amazon’s tag URLs are full of junk, much of it rather crass attempts at search engine optimization (e.g., the book title is inserted into the tag URL, but it works without it). It seems getting a little search engine help trumped providing users with easy-to-remember URLs.
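The first small suggestion, weighing both how often a tag is applied and how many people apply it, might look something like the Python sketch below. The geometric mean is an assumption chosen for illustration; the post does not say how LibraryThing combines the two numbers, only that it considers both.

```python
import math
from collections import defaultdict


def tag_weights(applications):
    """Weight each tag by the geometric mean of its uses and distinct users.

    `applications` is an iterable of (user, tag) pairs, one per tagged item.
    """
    uses = defaultdict(int)
    users = defaultdict(set)
    for user, tag in applications:
        uses[tag] += 1
        users[tag].add(user)
    return {tag: math.sqrt(uses[tag] * len(users[tag])) for tag in uses}


# One angry user applies "evil" 100 times; 20 users apply "history" once each.
applications = [("angry", "evil")] * 100
applications += [("user%d" % i, "history") for i in range(20)]

weights = tag_weights(applications)
print(weights)
```

By raw count the tag-bomb wins five to one; once distinct users are factored in, the tag that twenty people agree on comes out ahead.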

Methods

To my knowledge, Amazon doesn’t release any total tag statistics. So I tried a statistical sample:

  • I picked 1000 random entries from LibraryThing libraries, and retrieved their ISBNs.
  • I ran the ISBNs through LibraryThing and Amazon, counting tag numbers. I did it by hand through 100 before I decided to write a quick scraper.
  • I compiled the results and did some simple math. You can find my Excel file to the right.

The final results were 56,185 tags on LibraryThing, 5,528 on Amazon. Extrapolating on the sample, I conclude that Amazon has something like 1,337,388 tags in total, to LibraryThing’s 13,593,069.
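The extrapolation step is not spelled out above, so here is one plausible reconstruction in Python: scale LibraryThing’s overall tag total by the Amazon-to-LibraryThing ratio seen in the sample. That method is an assumption; it lands close to, but not exactly on, the 1,337,388 figure, so the original calculation may have rounded differently.

```python
def extrapolate(sample_known, sample_other, total_known):
    """Scale a known total by the ratio observed in a shared sample."""
    return round(total_known * sample_other / sample_known)


LT_SAMPLE_TAGS = 56_185       # tags on LibraryThing for the sampled ISBNs
AMAZON_SAMPLE_TAGS = 5_528    # tags on Amazon for the same ISBNs
LT_TOTAL_TAGS = 13_593_069    # LibraryThing's overall total, from the text

amazon_estimate = extrapolate(LT_SAMPLE_TAGS, AMAZON_SAMPLE_TAGS, LT_TOTAL_TAGS)
ratio = LT_SAMPLE_TAGS / AMAZON_SAMPLE_TAGS

print("Estimated Amazon book tags:", amazon_estimate)
print("LibraryThing tags per Amazon tag: %.1f" % ratio)
```

The ratio is what backs the “ten times as many” claim; the absolute Amazon estimate is only as good as the assumption that the sampled ISBNs are representative.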

If anyone wants to duplicate the test, let me know. By default, LibraryThing doesn’t think of tags ISBN-by-ISBN, so I’d need to give you an API to that data.

Problems with my method

  • It only covers books. Maybe DVD tagging is a different phenomenon. I note, however, that Amazon’s page for bananas—yes, Amazon sells bananas—is overrun with Borat-themed tags.
  • The random books were drawn from LibraryThing. Maybe LibraryThing’s ISBNs are unrepresentative of Amazon’s ISBNs as a whole, so that the sort of books that are tagged on LibraryThing are not tagged on Amazon. There may be some truth to this insofar as LibraryThing includes a lot of older books, while Amazon focuses on the new and in-print.
  • I only sampled Amazon’s US site. LibraryThing has a fair number of non-US editions.

Let me know what you think

As usual, I’m dying to hear what people think about this post. I know it’s imperfect—I bit off more than I could chew. But it says a lot of things I’ve been keeping in my head for months. Leave comments here. If enough interest develops, we can start something on Talk.

*Shouldn’t it be “falafel”? And YES! O’Reilly’s Culture Warrior IS tagged “falafel”! I swear I did NOT do it.
**Out of 60 unique tags applied to the book, I can spy only four that read like subject tags.
***Small numbers also mean Amazon is open to manipulation. One of the larger tags on their tag cloud is for “bards and minstrels,” applied to 4,200 products by six taggers. The tag has never been used on LibraryThing, Flickr or Del.icio.us. I suspect a conspiracy.

Labels: Uncategorized

Monday, February 19th, 2007

Book pile bonanza winners

We had an amazing number of fantastically great entries for the 10 million books / valentines / presidents book pile contest. That’s a lot of superlatives, I know, but trust me, it’s worth it. There were Valentine’s Day themed piles, President’s Day themed piles, and a whole bunch of people took us up on the “best damn book pile ever”. (See the entries here, under the Flickr tag “LibraryThing10mil”).

Several themes emerged, and not just Valentines and Presidents. We had several variations on a heart, showing love from Harlequin to Latin. There was fun with numbers and words, sweet stories, and a surprising number of floating shelves (I’m covetous). Overall, a fantastic collection of book piles. You can see them all here (and several that hadn’t posted to Flickr yet are linked to in these blog comments).

Without further ado, our grand prize winner. madinkbeard’s “We heart LibraryThing” was a stand-out. As an added bonus, you can see all 86 books that were used to create this red spine-d heart here. Madinkbeard will be getting one hundred dollars—to be spent entirely on one book.


We also have five runners up, who will each get a year’s gift membership to LibraryThing.*

Runner up. First, I loved parelle’s “Bookpiles and my love life”, which chronicles a relationship, from a bookstore meeting. The story starts here and is continued here.

Runner up. “Presidents”, this wall of presidents by Pesky Library was beautiful, and, I think, entered by a small library?

Runner up. jadelennox’s “Ten million books and counting” was one of our number themed entries—starting low and going up high. From zero to pi to infinity (and beyond?)!

Runner up. “Books are love!” by j2.0 brings book piling to a whole new level (and also gives us our first nude photo blog post)!

Runner up. And lastly, “Never Enough Time for Reading”, by Munzerr solves the all important question of having a “good body:books ratio” (as mentioned in the photo’s comments).

There are a couple of other photos that we’d like to highlight (let’s call them the honorable mentions, for lack of a better term).

Honorable mention. Narrisch’s “Babel in Translation” was fantastic (and extra points for using the phrase “book-a-ganza”).

Honorable mention. skullfaced’s Skull stacking managed to combine natural history with Valentine’s Day and Darwin’s Birthday, and all on a bathroom floor!

Honorable mention. Kristy’s “Pursuit of Love Bookpile” I had picked as a winner, until I realized that she was John’s girlfriend, and that that probably meant she couldn’t win an actual prize.**

Honorable mention. Lastly, we had a soft spot for the books in the truck, nicely packed into their “car seats”.

*Will the winners please email me (abby@librarything.com) so I can send the prizes your way?
**Even though we never explicitly stated it, I think that wives, girlfriends, and children of LT employees, though welcome to submit photos, can’t actually win. Hey, we can make up rules as we go along, can’t we?

UPDATE For some reason, one of these photos that previously contained books now is showing up as a chicken salad (I think. The vegetarian in me isn’t quite sure). In the meantime, Flickr’s down page reports that they’re having a massage (I’m jealous).
UPDATE part deux Apparently Flickr is having a cache problem, leading to “weirdness” (a technical term). So you might see chicken salad, you might see a book pile. Enjoy either way.

Labels: book pile, contests

Monday, February 19th, 2007

WorldCat Registry: Join up!

OCLC has introduced WorldCat Registry, a one-stop place for libraries, library consortia, library vendors, funders and suchnot to put contact info, link URLs and other “identity” data. Every institution gets its own page—it’s like MySpace but with libraries and minus friends, comments, tacky background images and all the drunken photos. Here’s LibraryThing’s page. We hate being a “vendor,” but there was no category for “The OCLC of Lilliput.”

OCLC is being generous with the entry requirements. Personal libraries* are out, but small institutional ones are not. Their FAQs note “no restrictions prevent a smaller physical entity such as a church library, or a ‘virtual’ entity such as a digital library, from representing itself in the Registry.” So, if you’re on an institutional membership, go ahead and take them at their word—join up!

When you join up, you can give your catalog URL as:

http://www.librarything.com/catalog/YOURUSERNAME

Your ISBN and ISSN URLs are:

http://www.librarything.com/catalog.php?view=YOURUSERNAME
&searchbox=ISBNORISSN&searchType=Books
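If you'd rather generate these links than paste them together by hand, here is a small sketch. The URL patterns and parameter names (`view`, `searchbox`, `searchType`) come from the lines above; the function names and the sample username are made up for illustration.

```python
from urllib.parse import urlencode

BASE = "http://www.librarything.com"


def catalog_url(username):
    """Catalog URL for a member, following the pattern given above."""
    return f"{BASE}/catalog/{username}"


def isbn_search_url(username, isbn_or_issn):
    """ISBN/ISSN lookup URL within one member's catalog."""
    query = urlencode(
        {"view": username, "searchbox": isbn_or_issn, "searchType": "Books"}
    )
    return f"{BASE}/catalog.php?{query}"


# "exampleuser" is a hypothetical username, not a real account.
print(catalog_url("exampleuser"))
print(isbn_search_url("exampleuser", "ISBNORISSN"))
```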

LibraryThing is not currently listed among their vendors. Until it is, select “other.”

*Which reminds me, we recently had an application by a coven. They were uncertain if they were a family or an institution. OCLC is silent on the coven issue.

Labels: Uncategorized

Saturday, February 17th, 2007

OttoBib links added

I’ve added links to OttoBib, a super-simple citation generator created by Jonathan Otto, an undergraduate at the University of Wisconsin at La Crosse.

The feature is available on the work info page (card catalog page) for any specific book—yours or someone else’s. Here’s an example. At present, it only works for books, not for general works. (After all, a work may have 1,000 ISBNs under it.) We hope to extend this in the near future. The results aren’t saved in any way, so if you’re doing a bibliography, you’ll have to do some cut-and-paste work.

We’re linking to OttoBib because we think it was nicely done. But, down the road, LibraryThing may need a stronger solution—one that works with non-ISBN books and which saves and juggles citations, rather than just creating them. We have some ideas along these lines, but your suggestions are always appreciated.

Labels: academic, citations, new feature, new features, ottobib

Friday, February 16th, 2007

Get your photos in!

There are mere hours left in the 10 million books/valentines/presidents book pile contest. Get your entries in before midnight tonight (EST)! We’ve got 33 so far, and they’re looking good… So who wants that hundred dollar book?

UPDATE: If there’s any doubt that your stuff is awaiting clearance, post your URL in comments, or mail it to Abby. (Flickr doesn’t always post photos from new accounts to public tag pages right away, so if your submission doesn’t show up on this page, tell us!)

Labels: 1

Thursday, February 15th, 2007

Borges and women entrepreneurs

An alert LibraryThing member sent me an Amazon email, a textbook case of collaborative filtering gone wrong. (LibraryThing makes these kinds of mistakes too.) The fact that it’s Borges, who has such fun things to say about how books relate to each other, is just icing on the cake.

Dear Amazon.com Customer,

We’ve noticed that customers who have expressed interest in Borges: A Life by Edwin Williamson have also ordered How She Does It: How Women Entrepreneurs Are Changing the Rules of Business Success by Margaret Heffernan. For this reason, you might like to know that Margaret Heffernan’s How She Does It: How Women Entrepreneurs Are Changing the Rules of Business Success is now available. You can order your copy for just $17.13 ($8.82 off the list price) by following the link below.

How She Does It: How Women Entrepreneurs Are Changing the Rules of Business Success by Margaret Heffernan

List Price: $25.95
Price: $17.13
You Save: $8.82 (34%)

Labels: Uncategorized

Tuesday, February 13th, 2007

Blyberg’s SOPAC

This isn’t breaking news at this point, but it’s still cool. John Blyberg has announced what he calls “SOPAC”:

It’s basically a set of social networking tools integrated into the AADL catalog. It gives users the ability to rate, review, comment-on, and tag items.

Tags, ratings, and reviews in an OPAC! I think it’s great that he’s done it—it’s no surprise that we’d love to put some of LT’s features into OPACs, and to see a big library like AADL take on social stuff legitimates the point.*

Anyway, you should check out his post about it, which even has a nifty screencast.

*Tim has blogged before about putting LT and OPACs… We’re thinking it’ll be a sort of OPAC widget, so hold onto your hats.

Labels: Uncategorized

Tuesday, February 13th, 2007

Introducing the book

Labels: Uncategorized

Monday, February 12th, 2007

Library of Congress Authority Files, Open!

So begins the PDF announcing and detailing a major new development for the library-data world. Simon Spero, library-geek extraordinaire, has released a nearly complete copy of the Library of Congress Authority Files.

Get them here:
http://www.ibiblio.org/fred2.0/authorities/

Simon assembled the files, available in MarcXML, by querying the Library of Congress’ Authorities website one-by-one over months. He’s a patient man.

As I’ve discussed before, Library of Congress data is both free and unfree. As a work of the US government, it cannot be copyrighted.* But the LC has traditionally restricted access, offering small amounts through public interfaces**, and selling larger amounts through its Cataloging Distribution Service. A small industry has developed where the CDS’s buyers resell it commercially. Until now, nobody has decided to just… let it go.

I anticipate that Simon’s action will draw some criticism. If the LC can’t make money selling its cataloging, how will it support this vital work? This sentiment will grow stronger when Casey Bisson releases the full LC Marc data, but whether for authorities or other cataloging data, I think this is short-sighted.

As I see it, the failure of the LC and other libraries to get their data “out there” on the open web has hurt them far more deeply than their catalog sales could ever recoup. It has made them seem irrelevant, standing silent and apart from the great conversation, which grows more interesting with each passing year.

The first culprits are the online catalogs***, ugly, backward things lamed with session-based URLs. If you want to link to the LC, you can’t. The URL you get will only work for you, for ten minutes. Linking–the very soul of the Web–is impossible.

The second culprit is how libraries have distributed the data itself. Amazon makes its book data accessible to all in a handy, universally-understood XML format. It’s so easy and appealing, over 140,000 developers have signed up to receive it. Libraries by contrast generally make their data available—if they make it available—over a tricky and obscure protocol known as z39.50. And the data itself is in MARC, a rich but impenetrable spectrum of formats—e.g., DanMARC, the Danish MARC format!—used by and largely only understood by librarians.

With wretched web sites and unretrievable, unparseable data, libraries have lost vital ground. If the world worked right, Googling a book should turn up a library within the first few results. But libraries seldom make the top 100, and despite being the largest library on the planet and producing the lion’s share of original cataloging, the Library of Congress is completely absent. In its place are Amazon, its peers and sites that use Amazon data.**** Libraries may know a lot, but simplicity, attractiveness and ubiquitous data have won out.

It’s time to fight back. Libraries and library data can change the book web for the better. Three cheers to Simon for making a critical first step. Viva La Revolución, my brother.*****

*The LC reserves the right to copyright it outside of the United States. It’s unclear if they ever have.
**In LibraryThing’s case, through a z39.50 connection. Although the limits are not clearly specified, we’ve been given to understand that large-scale mining will not be tolerated.
***What library-techs called OPACs—Online Public Access Catalog. The fact that someone still needs to add “Public Access” to “Online” is the problem in miniature. Does Google call itself a Public Access Search Engine?
****Don’t get me wrong; Amazon is a great site, and should be up in the top results too.
*****In so far as both Simon and I blogged the death of Milton Friedman, I suspect we’re equally uneasy with revolutionary Spanish.

Labels: Uncategorized

Saturday, February 10th, 2007

How much do you want to pay?

We were inspired by something John Buckman is doing at the online record label Magnatune. When you buy a CD, Magnatune asks “How much do you want to pay?” and gives you a price menu. You can’t pay nothing, but you get some latitude. You can low-ball them a bit, or, if you’re feeling grateful, pay more.

It sounded like a fun idea to us. We’ve had people—and not a few—pay twice to thank us. But we’ve also had emails from people who say they’ll buy a membership next time they get their pay check, disability, etc. That kills us, so we’ve given out a lot of “pending” memberships.*

The “typical” amount is the old fixed amount: $10 for a year’s membership, $25 for a lifetime membership. I’m dying to find out if we take a bath, break even or pick up a few extra bucks. Anyway, we’re going to try it out until Valentine’s day at least. (Speaking of which, don’t forget the ten-million book/Valentine’s day/President’s day bookpile contest.)

We learned about Magnatune’s idea watching a Japanese piece on John Buckman on YouTube. Buckman was in Japan to speak at the New Context Conference 2006. The guy in the cowboy hat interviewed him in English, then added highlights from his talk and an explanatory wrapper in Japanese. (Here he explains the pricing idea, but those are the only English words, so I have no idea what he says about it.)

Buckman is, of course, also the founder of BookMooch, the largest book-swapping site out there.** LibraryThing and BookMooch have warm relations—lots of shared users and mutual linking—and I’ve spoken to Buckman a few times. He “gets it,” so we’re happy to borrow an idea from him.

* My favorite “pending” account was the U. S. diplomat in central Asia, who wondered if she could pay us when she rotated back, and offered to send us a check via diplomatic pouch. I really want to send a CueCat via diplomatic pouch!
** Judged by Alexa traffic (28,185 vs. 31,032 for PBS on 2/10), not that Alexa means much. PaperbackSwap has been around longer, and may have more members, but it’s a walled garden and, we think, not going anywhere until it opens up.

Labels: 1

Thursday, February 8th, 2007

THE ten millionth book

At long last (and after some intensive database searching), we are pleased to present LibraryThing’s ten millionth book. Drumroll please…

The city in which I love you: poems, by Li-Young Lee was added just after noon last Saturday, by user vinodv.* We’re giving Vinod a lifetime account for this honor. According to his profile, Vinod is in Cambridge, MA—Tim’s hometown, and just across the river from Abby. Hey, we could be hand-delivering a CueCat to go with that membership!**

The celebration continues though—get your entries in for the biggest baddest book pile bonanza ever.

*This was also apparently the very first book Vinod added to his catalog. Quel distinction!
**Only half joking, I think.

Labels: 1

Thursday, February 8th, 2007

Web 2.0 Video

Unless you’ve been on Mars, you have seen this. Chris Anderson put it best: “This is why I do what I do.”

Labels: Uncategorized

Wednesday, February 7th, 2007

Ten million books and contest extravaganza

In honor of hitting the big 10 million book mark this week, we’re having a book pile contest bonanza. We’re combining three contests into one here—Valentine’s Day, President’s Day, and of course, ten million books.

The challenge. Start taking pictures. Your book piles can be love themed, president themed, or just the coolest damn book photo you can create.

The prizes. We’ll pick five winners, who will each receive a year’s membership to LibraryThing. The grand prize winner will receive a $100 gift certificate. The catch? That hundred dollars must be spent entirely on one book. So start looking around Abe’s Rare Book Room…* (Amazon is fine too.)

The rules. Post your photos to Flickr, as usual. Tag them “LibraryThing10mil”.**

The deadline. The contest ends on February 16th, at midnight, EST. We’ll announce all the winners on Monday, February 19th.

*This is harder than you might think. A signed and first edition copy of The Little Prince sold for $10,450 last year, and that was only the 7th most expensive book sold in 2006 on Abe. Sadly, we will not be buying you this $55,471 copy of The Tale of Peter Rabbit. So if you had the $100, what one book would you try to get?
**Users who already posted Valentine’s or Presidents photos to Flickr after I said this, can you change and/or add the tag LibraryThing10mil? Sorry and thanks.

Labels: 1

Tuesday, February 6th, 2007

Better work combining

Until now, work combination was an author thing: if two works didn’t share authors, they couldn’t be combined. This is good enough most of the time. But some works have multiple authors, with different ones taking the “main author” spot in different catalogs. And it didn’t work with authorless works.

For now, you can’t combine just any works, only ones that share an ISBN. The list of potential combinations is available at the bottom of each work’s “book information” page. If it proves useful and popular, I may move it.

Here’s a good example—three editions of (multi-author) Cluetrain Manifesto that weren’t combined with the main one:

But not every suggestion is good. Here’s The Rule of Four. I have no idea what that Babichev book is doing there. It might be member error, a source error, a publisher reusing ISBNs or a rogue publisher reusing a known number instead of paying for a new one. Anyway, I suggest you don’t combine it!
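To make the shared-ISBN idea concrete, here is a rough sketch of how combination suggestions could be generated. It is an illustration under my own assumptions, not LibraryThing's actual code: the work ids, the ISBN placeholders and the function name are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def suggest_combinations(work_isbns):
    """Map each pair of works that share an ISBN to the ISBNs they share.

    work_isbns: dict of work id -> set of ISBNs seen under that work.
    """
    works_by_isbn = defaultdict(set)
    for work, isbns in work_isbns.items():
        for isbn in isbns:
            works_by_isbn[isbn].add(work)

    suggestions = defaultdict(set)
    for isbn, works in works_by_isbn.items():
        # Every pair of works under one ISBN is a candidate, nothing more.
        for first, second in combinations(sorted(works), 2):
            suggestions[(first, second)].add(isbn)
    return dict(suggestions)


# Hypothetical data: placeholder ids and ISBNs, not real records.
work_isbns = {
    "work-1": {"isbn-A", "isbn-B"},
    "work-2": {"isbn-B"},
    "work-3": {"isbn-C"},
}
suggestions = suggest_combinations(work_isbns)
print(suggestions)
```

A shared ISBN only produces a suggestion. As the Rule of Four case shows, a person still has to decide whether the two works really belong together.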

Unfortunately, this doesn’t fix authors generally. The Cluetrain Manifesto is still listed under a single “main” author. We hope to change that soon.

Labels: 1

Tuesday, February 6th, 2007

Never the Twain shall meet, um, Gibbon

Frustrated that Terry Pratchett and Neil Gaiman keep getting combined? Unfortunately, the system makes a few bad combination suggestions, and now and then somebody takes it up on them. To solve this I’ve added a feature to the author pages:

I kicked things off by permanently divorcing Edward Gibbon from Mark Twain (!). But I’ll let you guys tackle Gaiman. I’ve deputized the Combiners group (which, in the best LT tradition, sprang into being spontaneously) as the place to fight out whether Jack London and Emile Zola are really the same author.

More changes along these lines soon, including visible logs for combination action.

PS: I also cut down on the number of “Also known as” names visible, unless you click “see complete list.” Nabokov was getting absurd…

Labels: 1

Tuesday, February 6th, 2007

10 million books and 303 LT Authors

In continuation with our celebration of 10 million books, today we’ve also hit 303 LibraryThing Authors.* Sara Ryan / sararyan just became our three hundred and third LibraryThing Author. The best part? Sara’s also a librarian! Her first book, Empress of the World was excellent, so watch for her second novel, The Rules for Hearts, which comes out in April.

Keep watching for more of the 10 million books celebration blogging!

*I must not have been paying attention when number 300 breezed by me, but it’s always good to celebrate a palindrome, I say. Who doesn’t love a palindrome?

Labels: 1

Monday, February 5th, 2007

Can subjects be relevancy ranked?

I wrote this up on the plane from San Francisco. (I was there on a secret, unbloggable mission!*) It’s a bit involved and it doesn’t “arrive” anywhere, but, if you’re interested in subjects and relevancy ranking, it might be worth thinking about.

There are a couple differences between user tagging (“free tagging,” “social tagging,” etc.) and traditional library classification. “Who does it?” is the most obvious difference, followed by whether or not the labeling action takes place within a predefined ontology, or is made up on the fly.

It’s easy to ignore a third, and very critical difference. Subject classifications, like the Library of Congress Subject Headings (LCSH), are essentially binary. It’s non-overlapping buckets. Something either does or does not belong in a subject. There are no gradations of belonging.

The idea is, as Clay Shirky and David Weinberger have reminded us, rooted in the physical world. Subject classification escapes the physicality of shelf-order classification, in which a book must be shelved in a single place, but is still restrained by the physicality of the catalog card. A catalog card can only reference a certain number of subjects. Nobody wants a book to take up twenty cards. And the subject cards can only reference so many books. About 90% of all literature could fall under the LCSH subject Man-woman relationships. But it would make no sense to slot this 90% under that heading in a physical card catalog–the card catalog would instantly grow by 90%! And there seem to be very real differences in relevancy and “what-the-heck”-ness between real-life members of the “Man-woman relationships” LCSH: High Fidelity, Great Expectations, The Fountainhead, I Kissed Dating Goodbye, and The Official Hottie Hunting Guide.

If you’re very selective, you can keep the numbers down. But, apart from the rule that the first subject is generally the primary one, there’s no good way to relevancy rank the books belonging to a subject.

Tags can do it, because tens, hundreds or thousands of users applying tags creates a “statistics of meaning.” So, 1984 is tagged dystopia 549 times, torture six times and Great Britain two times. The numbers can be turned into ranking, so 1984 shows up high on a list of books about “dystopia,” lower under “torture” and near the end of a list of books about Great Britain.
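A toy version of that ranking, as a sketch only. The 1984 counts are the ones quoted above; the other two books and their counts are invented so there is something to rank against, and the function name is mine.

```python
# Tag counts per book. Only the 1984 numbers come from the post.
tag_counts = {
    "1984": {"dystopia": 549, "torture": 6, "Great Britain": 2},
    "Hypothetical Book A": {"dystopia": 40, "torture": 25},   # invented
    "Hypothetical Book B": {"Great Britain": 300, "dystopia": 1},  # invented
}


def rank_by_tag(tag_counts, tag):
    """List the books carrying a tag, most-tagged first (ties by title)."""
    scored = [
        (counts[tag], book) for book, counts in tag_counts.items() if tag in counts
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [book for _, book in scored]


for tag in ("dystopia", "torture", "Great Britain"):
    print(tag, "->", rank_by_tag(tag_counts, tag))
```

Raw counts favor heavily cataloged books. Dividing each count by the book's total tag count would be one alternative, though the post doesn't say which LibraryThing uses.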

This is all well-worn territory. My question is this: Is there any way to relevancy-rank books within subjects?

I was reminded of the question when checking out OCLC’s new project, FictionFinder. I’ll blog about the whole thing later, but for now know that you can search for a LCSH subject and get back a list of books belonging to it. (I can’t link to the results, which are session based.**) Check out the LCSH “City and Town Life” and the top book is Red Badge of Courage. Lacking a better method, FictionFinder lets popularity (the number of OCLC libraries with a copy) stand in for relevance. LibraryThing does the same, using our popularity numbers instead. The results are not systematically better (in this case Ulysses wins).

I tried two solutions:

The first was to tie into LibraryThing’s tags. So, figure out what tags are most characteristic of books with the subject “Man-Woman Relationships,” and then use the presence and number of these tags to rank the subject results. So, for example, “Man-Woman Relationships” has a global correlation with “relationships,” “dating” and “romance,” none of which are very prominent among the tags applied to Great Expectations, so it can fall low on the list.

I got far enough down this road to know it was going to help.
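Here is one way the tag approach might be sketched. The post doesn't describe the actual statistics, so this assumes a simple "lift" measure (how over-represented a tag is within the subject compared with the whole collection); all counts, the two filler titles and the function names are invented for illustration.

```python
from collections import Counter


def characteristic_tags(book_tags, subject_books, top_n=3):
    """Tags most over-represented in the subject versus the whole collection."""
    overall, in_subject = Counter(), Counter()
    for book, tags in book_tags.items():
        overall.update(tags)
        if book in subject_books:
            in_subject.update(tags)
    total_all = sum(overall.values())
    total_subject = sum(in_subject.values())
    lift = {
        tag: (in_subject[tag] / total_subject) / (overall[tag] / total_all)
        for tag in in_subject
    }
    return sorted(lift, key=lambda tag: (-lift[tag], tag))[:top_n]


def rank_subject(book_tags, subject_books, top_n=3):
    """Order a subject's books by the share of their tags that are characteristic."""
    key_tags = set(characteristic_tags(book_tags, subject_books, top_n))

    def score(book):
        tags = book_tags[book]
        total = sum(tags.values())
        return sum(n for tag, n in tags.items() if tag in key_tags) / total if total else 0.0

    return sorted(subject_books, key=lambda book: (-score(book), book))


# Invented tag counts; the last two titles are hypothetical filler.
book_tags = {
    "High Fidelity": {"relationships": 30, "music": 50, "romance": 10},
    "I Kissed Dating Goodbye": {"dating": 40, "relationships": 20},
    "Great Expectations": {"classics": 200, "victorian": 80, "romance": 2},
    "Hypothetical Music History": {"music": 120},
    "Hypothetical Victorian Novel": {"classics": 150, "victorian": 50},
}
subject_books = {"High Fidelity", "I Kissed Dating Goodbye", "Great Expectations"}
print(characteristic_tags(book_tags, subject_books))
print(rank_subject(book_tags, subject_books))
```

With these made-up numbers Great Expectations drops to the bottom of the subject, which is the behavior described above.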

The second and more interesting algorithm was to see if books can be ranked within subjects without any other information. This would help OCLC, who are unlikely to pay for LibraryThing data, and any library that employs LCSH, most of which would have no “popularity” data to use either.

I hit upon the idea that subjects “reinforce” each other, and that this must leave a statistical signature. For example, it seems that “Love stories” and “Psychological fiction” are commonly applied to books about “Man-Woman Relationships,” but that “Androgynous robot alone on an island — Stories” is not. (Okay, that’s not real, but the point stands.) Can these “related subjects” relevancy rank the subject itself?

I wish so, but I can’t get it to work well enough. It works for some topics, but falls down for others, laughably.
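For the curious, the “reinforcement” idea could be sketched like this. It is my own simplified reading, not the code I actually ran: each book in a subject is scored by how often its other subjects turn up alongside that subject. Book names and subject assignments are hypothetical.

```python
from collections import Counter


def rank_by_reinforcement(book_subjects, target):
    """Rank the books under `target` by how typical their other subjects are."""
    members = {book: subs for book, subs in book_subjects.items() if target in subs}
    co_occurrence = Counter()
    for subs in members.values():
        # Count every other subject seen on a book carrying the target subject
        # (each book's own subjects are included in the counts).
        co_occurrence.update(s for s in subs if s != target)
    member_count = len(members)

    def score(book):
        others = [s for s in members[book] if s != target]
        if not others:
            return 0.0
        # Average fraction of member books that also carry each other subject.
        return sum(co_occurrence[s] / member_count for s in others) / len(others)

    return sorted(members, key=lambda book: (-score(book), book))


# Hypothetical subject assignments.
book_subjects = {
    "Book A": {"Man-woman relationships", "Love stories", "Psychological fiction"},
    "Book B": {"Man-woman relationships", "Love stories"},
    "Book C": {"Man-woman relationships", "Psychological fiction", "Love stories"},
    "Book D": {"Man-woman relationships", "Orphans", "England"},
    "Book E": {"Orphans", "England"},
}
ranking = rank_by_reinforcement(book_subjects, "Man-woman relationships")
print(ranking)
```

On tidy toy data like this the outlier (Book D) sinks as hoped. Real catalog data is where it falls down.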

Some ideas I’ve considered:

  • Treating subjects as links, and running some sort of “page-rank” style connection algorithm against them. Maybe this would bring out coincidences that simple statistics misses.
  • Using other library data, such as LCC and Dewey. This would be reminiscent of how I made LibraryThing’s LCSH/LCC/Dewey recommendations.
  • Doing statistics on other fields, such as the title. So, for example, there’s probably a statistical correlation between “Man-woman relationships” and books with “dating,” “men and women” and “proposal” in the title.

None strike me as the silver bullet.

Anyway, my plane has landed–allowing me to do real work again–so I end in aporia. Ideas?

*I’m itching to blog it, but I have to hold off for now. I’ll throw some pictures up soon, however. I’d never been to San Francisco before. What a wonderful wonderful town.
**One can understand why OPACs made in 1996 are session based. How frustrating to see a new product with them.

Labels: Uncategorized

Monday, February 5th, 2007

Ten million books!

On Saturday LibraryThing acquired its ten millionth book. Ten million is a bunch. Ten million means something. LibraryThing is no longer a “worthless jumble” of books and tags. It’s, it’s…

  • A meaningful jumble of books and tags
  • A hook to hang a bunch of pretty charts and graphs
  • An excuse for a book pile contest
  • A cause for celebration, and a party
  • A special cause for celebration for the guy who added number 10,000,000
  • An occasion to lay out future plans and goals

And probably a few other things. Anyway, it’s big enough that it won’t fit in one blog post, and with everything we have to do and all the vinho verde we need to drink this week, I’m expecting ten-million blog posts to drag on for days.

A meaningful jumble of books and tags. Ten million books translates into piles of data, and piles of data means fun with statistics. And we’ve been having fun.

Today I added a new “combined” recommendation list. It draws on LibraryThing’s five existing recommendation algorithms to come up with a “best” list. I’ve replaced the longer list of recommendations on the work pages with a link to the Suggester page, where you can see all the lists.* (I’d be interested to hear if people appreciate the simplification or still want the full lists on the work pages.) Combined recommendations are available for 230,000 works. Because of variable work popularity, this amounts to recommendations for 72% of all the books in people’s libraries.
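How several ranked lists get merged into one isn't described here, so take the following purely as an illustration of the general idea. It uses a Borda-style count, which is a stand-in I picked, not necessarily what LibraryThing does; the work names and function name are placeholders.

```python
from collections import defaultdict


def combine_rankings(ranked_lists):
    """Merge ranked lists: each list awards more points to its higher entries."""
    scores = defaultdict(int)
    for ranking in ranked_lists:
        size = len(ranking)
        for position, work in enumerate(ranking):
            scores[work] += size - position  # top of a list earns the most
    return sorted(scores, key=lambda work: (-scores[work], work))


# Three hypothetical recommendation lists for one work.
lists = [
    ["Work A", "Work B", "Work C"],
    ["Work B", "Work A", "Work D"],
    ["Work B", "Work C", "Work A"],
]
combined = combine_rankings(lists)
print(combined)
```

A work that several algorithms agree on floats to the top, while one that only a single list mentions stays near the bottom.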

Alongside books, LibraryThing’s tags have also been growing. Although we’ve rarely celebrated milestones, tags are the untold story of LibraryThing. LibraryThing members have added thirteen million of them–an unprecedented web of meaning in the book world. Check out a tag like chick lit, cyberpunk or paranormal romance and tell me what you think. I think LibraryThing members have arrived at something close to the paradigmatic reading list for these hard-to-pin-down genres.

On the subject of tags, I recently did a statistical sample of Amazon’s book tagging. I estimate that since November 2005, Amazon customers have added about one-million book tags. When LibraryThing, a niche site, collects 13 times as many book tags as Amazon, one of the top-ten most visited sites, something is up. I’ll blog about it soon, but I think the basic answer is clear. Letting people tag “their stuff” works like gangbusters. Asking customers to tag “your stuff” doesn’t. People make their beds every day. But nobody goes down to the local Sheraton to fluff their pillows.

*Not quite. There are actually ten recommendation lists at play since, when the recommendations are sparse, we factor in a “flip-around” of the recommender-recommended relationship.

Labels: 1