Archive for the ‘tags’ Category

Thursday, February 5th, 2009

Random tags for scholars

Someone asked me to come up with a page of truly random tags for an academic project that needed to assess typical tagging. It might prove interesting to other students and scholars doing projects on LibraryThing.

Here’s the page.

It’s an HTML page, not an XML feed or other such format. Techies will scoff, but I’ve been asked for a lot of data like this, particularly from MLS students. The people who can easily parse XML in programming languages are not generally writing graduate school papers.

Labels: folksonomy, tags

Sunday, January 27th, 2008

Tagging: People-Powered Metadata for the Social Web


“Walk into the public library in Danbury, Connecticut, and you’ll find the usual shelves stacked with books, organized into neat rows. Works of fiction are grouped alphabetically by the author’s last name. Nonfiction titles are placed into their proper Dewey Decimal categories just like they are at tens of thousands of other libraries in North America.

“But visit the Danbury Library’s online catalog, and you’ll find something rather unlike a typical library.

“A search for The Catcher in the Rye brings up not just a call number but also a list of related books and tags—keywords such as “adolescence,” “angst,” “coming of age,” and “New York”—that describe J. D. Salinger’s classic novel … Click the tag “angst,” and you’ll find a list of angsty titles such as The Bell Jar, The Stranger, and The Virgin Suicides.”

So begins Gene Smith’s newly released book Tagging: People-Powered Metadata for the Social Web (New Riders). That’s right. The first book dedicated to tagging begins with LibraryThing—specifically our LibraryThing for Libraries project!

Library 2.0 people, pause a second. How about that: a book about new developments in social media starts by talking about new things going on in a library? Not a social networking site, not a photo sharing site. A dream come true.

That’s all I have to say for now. I knew the book was coming; Gene interviewed me for it (selections on page 134). But I haven’t finished it yet.

My first impression is that it’s rich and detailed, covering everything from what tagging is and why it matters, to how to implement it at the level of user interface and even technically. But, as is my wont, I’m already scribbling little objections and expansions in the margins. That’s the sign of a good book, right?

I’ve created a discussion group on Talk for people reading the book. Come join me to talk about it.

Labels: gene smith, librarything, librarything for libraries, social media, tagging, tags

Saturday, November 10th, 2007

An academic take on LibraryThing tags

I just discovered Tiffany Smith’s “Cataloging and You: Measuring the Efficacy of a Folksonomy for Subject Analysis”.* It’s the first detailed academic study of LibraryThing tagging—and a very sympathetic one.

The article focuses on five books, comparing their tags with their Library of Congress Subject Headings (LCSH). The books are Harry Potter and the Half-Blood Prince, Susanna Clarke’s Jonathan Strange and Mr. Norrell, Ian McEwan’s Atonement, Marjane Satrapi’s Persepolis and John Hodgman’s The Areas of My Expertise.

LibraryThing doesn’t “win” every comparison, but it comes out pretty well. I’ve already coopted her observations on two titles into my talks, namely Persepolis and Areas of my Expertise, both of which rate a single, very general subject. On the latter:

“How do you identify the subject of a fictionalized almanac, which, according to the Library Journal blurb on the back cover, is ‘a handy desk reference for those needing a dose of nonsense’? If you’re the Library of Congress, you call it ‘American wit and humor’, and move on to the next item on your book cart. You’d be accurate, because Hodgman is American and the book is witty and humorous, but you wouldn’t have captured the specificity of this item.”

Smith contrasts this with LibraryThing’s florid tag cloud, sporting such terms as almanac, hoboes, alchemy, cheese, cryptozoology, eels, omens, portents and absurdities. Record-by-record these tags may only serve to amuse, but if you can’t recall the title, Hodgman’s strange work can be easily retrieved by looking for books tagged both “eels” and “humor” or “hoboes” and “almanac”. By contrast, I would not recommend wading through the American Wit and Humor subject!
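
For the programmers in the audience, that kind of two-tag lookup is just a set intersection. Here is a minimal sketch in Python. It is not how LibraryThing actually does it; the function name is mine, and the two extra books and their tags are made up so there is something to filter out.

```python
def books_with_all_tags(library, wanted):
    """Return the titles whose tag sets contain every tag in `wanted`."""
    wanted = set(wanted)
    # `wanted <= tags` is True when every wanted tag appears in the book's tags.
    return [title for title, tags in library.items() if wanted <= tags]


# A toy "library". Only the Hodgman tags come from the post; the rest is invented.
library = {
    "The Areas of My Expertise": {"almanac", "hoboes", "eels", "humor", "cryptozoology"},
    "Some Other Funny Book": {"humor", "essays"},       # hypothetical
    "A Natural History of Eels": {"eels", "biology"},   # hypothetical
}

print(books_with_all_tags(library, ["eels", "humor"]))
print(books_with_all_tags(library, ["hoboes", "almanac"]))
```

Each tag alone matches more than one book in the toy data, but the pair narrows it to one.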

I was also gratified to see the author notice an effect I’ve mentioned periodically but which has found no echo in other examinations of the topic and in the whole tired expert-vs-amateur polemic. As she writes, LibraryThing members pick up on the Napoleonic Wars element in Jonathan Strange, which LCSH misses:

“This may speak to the problem of the physical impossibility of the library cataloger reading the entirety of this roughly 800 page book to get to all of the detail. The Napoleonic element is not evident for the first third of the book and is not represented in the chapter titles, although it plays a pivotal role in the plot development.”

Fundamentally, I’m willing to concede the virtues of expertise, but there’s a lot to be said for reading the book all the way through, and library catalogers are not often able to do that.

In this connection, I’ve previously noted how my wife’s third novel, Love in the Asylum, acquired an erroneous “Alcoholism” subject, derived ultimately from bad publisher flap copy. Clearly neither the librarian nor the publicist had read the book. (My wife caught the copy before it went to print, but not before it had acquired Cataloging in Publication LCSHs.) And the LCSH team also missed the topic of American Indians (Abenakis), a major presence in the book, but not touched on in the first third or the flap copy.

Anyway, it’s an interesting read. Since Smith did her research LibraryThing has grown almost 100%, and there are a few things I’d quibble with**, but it’s a very good outside examination of why LibraryThing members’ tags should not be dismissed by librarians interested in cataloging quality.


*“in”—as they say in academia—Lussky, Joan, Eds. Proceedings 18th Workshop of the American Society for Information Science and Technology Special Interest Group in Classification Research, Milwaukee, Wisconsin.
**For example, Smith was confused about why some LibraryThing works had subjects that were not present in the Library of Congress record, which she believes is our source. In fact, we get our Library of Congress Subject Headings (LCSH) from many libraries. Libraries are free to augment the LC’s headings, and many do; we pick up anything in the 600s of all the MARC records that make up a work.

Labels: academics, LCSH, LIS, tagging, tags

Sunday, September 23rd, 2007

Tagging innovations, from the government

Has anyone seen click-based tag clouds? These are tag clouds in which the size of the words depends not on the number of times something has been tagged, but on the number of times the tag is clicked.

I never had, but Abby just spotted one on the website of the State of Delaware. Apparently site visitors are interested in employment.

It’s a pretty cool idea, and one I’d love to try out on LibraryThing. It wouldn’t work on work pages, but it might on the home page. And I’m impressed that it was on a state-government site. While these sites are increasingly competent, they’re not usually thought of as hotbeds of web innovation.
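
For the curious, here is a rough sketch in Python of how a click-based cloud might size its words. I have no idea how Delaware actually does it. The function name, the pixel sizes and the click counts below are all invented for illustration, and linear scaling is just one obvious choice.

```python
def click_cloud(clicks, min_px=12, max_px=36):
    """Map each tag to a font size, scaled linearly by how often it was clicked."""
    if not clicks:
        return {}
    lo, hi = min(clicks.values()), max(clicks.values())
    span = hi - lo
    sizes = {}
    for tag, n in clicks.items():
        # If every tag has the same click count, they all get the smallest size.
        fraction = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_px + fraction * (max_px - min_px))
    return sizes


# Hypothetical click counts, not real data from any site.
clicks = {"employment": 900, "taxes": 450, "parks": 0}
print(click_cloud(clicks))
```

The only difference from an ordinary tag cloud is the input: you count clicks on each tag instead of counting how many times it has been applied.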

Labels: tagging, tags

Sunday, July 1st, 2007

Tags and the Power of Suggestion

REMINDER: LibraryThing is offering $1,000 worth of books if you find us an employee!

As usually argued, tags have “low cognitive cost,” a high-cognitive cost way of saying “you dash them off.” You grab the book, you tag it “cooking” and move on.

That’s usually a good thing. If you thought about it, you might try to come up with the “perfect” phrase, like “food preparation,” to cover salad-making and other methods that involve no actual cooking, or “food preparation, presentation and related subjects” to cover that book about creating beautiful designs in coffee foam and the manual that came with the Salad Shooter. But coming up with the perfect phrase takes effort and time. You pay for it then and, more importantly, you pay for it when you come to search–for searching is even more about low cognitive effort than tagging.

This much is standard. It’s also clear that “dashed-off” terms cluster well socially. For most domains there are only a few simple terms (e.g., cooking, cook books), but an almost endless number of complex ones.

There are problems with this. Indeed, all the “problems” with tagging stem from it. A careful, formal system would distinguish between books about “leatherworking” and books of “leather erotica”. On LibraryThing, both tend to get tagged leather. I won’t multiply examples I’ve discussed before, so I can get to a new one: the Power of Suggestion.

Yak, yak, yak, yak. Joke, joke, joke, joke! Now, what is the white of an egg called? Did you think “yolk”? I’ll bet you did. The children’s joke illustrates something about how the brain works. Rapid thought is open to the power of suggestion.

Now catalog and tag the book 9-11 by Noam Chomsky. I’ll bet you tag it “9-11.” The same goes for 9-11 emergency relief, 9-11 : artists respond and 9-11 : the world’s finest comic book writers and artists tell stories to remember. But elsewhere, “9/11” (with a slash) is by far the dominant tag.

All books
9/11 1179 times
9-11 173 times (13%)

Books with “9-11” in the title
9/11 28 times
9-11 32 times (53%)
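
Each percentage above is the hyphenated tag’s share of the two spellings combined. If you want to check my arithmetic, here is a quick sketch in Python that uses only the four counts quoted above; the function name is mine.

```python
def share(count, other):
    """Return `count` as a whole-number percentage of `count + other`."""
    return round(100 * count / (count + other))


all_books = {"9/11": 1179, "9-11": 173}
titled_9_11 = {"9/11": 28, "9-11": 32}

print(share(all_books["9-11"], all_books["9/11"]))      # 13
print(share(titled_9_11["9-11"], titled_9_11["9/11"]))  # 53
```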

Sometimes seeming synonyms actually encode a difference in nuance or perspective (e.g., Shirky’s example of “film” vs. “cinema”). In this case, they don’t. There doesn’t appear to be any real difference between “9-11” and “9/11” that can’t be explained by the title. This is why LibraryThing users have “combined” the two tags, an operation we allow, and the combination has not been contested.

Titles influence how we tag things. Most of the books on birds and birding could be tagged with either term, but books with “birds” in the title rank higher on the “birds” tag.

Or take Heilbroner’s The Worldly Philosophers. As my brother, Oakes, once pointed out, Heilbroner’s book about the history of economics is almost invariably to be found in a used bookstore’s “Philosophy” section, not in “Economics.”* On LibraryThing the problem isn’t so acute, but it’s there–152 people have tagged it “economics,” 75 have tagged it “philosophy,” the second-largest tag. Of course, there is some legitimate cross-over between the two subjects. But I don’t think the content alone would merit so much “philosophy” tagging.

This isn’t a perfect example either. It would be interesting to know how many of the “philosophy” taggers had read the book, or what their other tags for it were. But I think it shows a pervasive effect.

The “Power of Suggestion” isn’t a major problem with tagging. But in showing us a flaw, it clues us in to what it’s all about.


*He showed me this when I was quite young, and it stuck. So when I’m in a new bookstore and passing the philosophy section, I often do a quick check to see if my old, confused friend is there again. I’m weird.

Labels: cognitive cost, tagging, tags