Monday, January 8th, 2007

Books with similar library subjects and classifications

I’ve added a new and often powerful recommendation engine. It has a long and awkward name: Books with similar library subjects and classifications.* So far, I’ve only got it on Suggester pages.

It feeds off three pieces of “traditional” library data:

  • Subjects (mostly Library of Congress Subject Headings),
  • Library of Congress Classifications (LCC), and
  • Dewey Decimal Classifications (DDC)

The recommendations are special in a few ways:

  • They can be very “targeted”
  • There is no “popularity” threshold; books with just one copy in the system often have recommendations**, and it will recommend obscure stuff too
  • It works better for non-fiction than for fiction
  • It fails in interesting ways

At its core, the system looks for shared library data. So if book B has subject S, all the other books with subject S get a “vote”; the winners are the books that share the most subjects with the suggesting book. The algorithm goes beyond this by leveraging the inherent hierarchy of the three systems, apportioning successively “smaller” votes to ascending levels of the hierarchy. Popularity is also taken into consideration, but as little more than a tie-breaker.
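
The voting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the feature: the record layout (`subjects`, `lcc`, `ddc`, `copies`), the representation of each heading or class number as a tuple running from broadest to narrowest level, the function names, and the halving `decay` per level are all hypothetical.

```python
from collections import defaultdict

FIELDS = ("subjects", "lcc", "ddc")  # assumed record keys, one per data source


def nodes_with_weights(path, decay):
    """Return (node, weight) pairs from the full path up to its root.

    The full path gets weight 1; each step up the hierarchy is
    multiplied by `decay`, so broader levels cast smaller votes.
    """
    return [(path[:depth], decay ** (len(path) - depth))
            for depth in range(len(path), 0, -1)]


def recommend(source_id, books, decay=0.5, top_n=5):
    """Rank other books by shared library data with the source book."""
    source = books[source_id]
    scores = defaultdict(float)
    for other_id, other in books.items():
        if other_id == source_id:
            continue
        for field in FIELDS:
            # Every node the candidate occupies, at any level of the hierarchy.
            other_nodes = {node
                           for path in other.get(field, [])
                           for node, _ in nodes_with_weights(path, decay)}
            for path in source.get(field, []):
                for node, weight in nodes_with_weights(path, decay):
                    if node in other_nodes:
                        scores[other_id] += weight
    # Popularity (copy count) only breaks ties between equal scores.
    ranked = sorted(scores,
                    key=lambda b: (-scores[b], -books[b].get("copies", 0)))
    return ranked[:top_n]
```

Because books that share nothing never receive a vote, they simply drop out of the list, and a one-copy book can still come first if it shares the most specific headings.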

At its best, the system is spooky. For example, the other recommendations for Into Thin Air are spread over Everest, general mountaineering and adventure books. But the “Similar subjects and classifications” recommendations lead with Kenneth Kamler’s Doctor on Everest : emergency medicine at the top of the world : a personal account including the 1996 disaster, a reasonably obscure (5 members) personal account of the same 1996 expedition. Other times the results are mixed or even odd. Kant’s Critique of Pure Reason pulls up commentaries on itself, but also the acclaimed but seemingly unrelated seminal work on the anthropology of magic, E. E. Evans-Pritchard’s Witchcraft, oracles, and magic among the Azande. Why? Because both receive the Library of Congress Subject Headings:

Strange bedfellows, perhaps.

*Got a better name? Let us know, seriously.
**Ironically, nearly twice as many works have recommendations (219,000 vs. 120,000 for “people who have X also have Y”), but because they are more evenly distributed by work popularity, fewer than half as many books have recommendations (2.6 million vs. 5.9 million).
