Monday, August 20th, 2007

Introducing Casey Durfee (and the new search)

We just hired Casey Durfee (LibraryThing: caseydurfee), a crackerjack “Library 2.0” hacker. Casey’s first project—faster, better search—debuts today (see below). It’s a big win. We have a long queue of similar projects. Casey will also be heading up LibraryThing for Libraries, our project to get our data into library catalogs.

Casey will be working from Seattle, where he recently left a job at the Seattle Public Library, managing their library system. Before that Casey worked at SirsiDynix*, so he has a lot of background in the arcane world of library systems, which is just what we needed for LibraryThing for Libraries. Casey is also responsible for L2, a handy Greasemonkey plug-in that adds Amazon content to library catalogs.

Perhaps the coolest thing Casey has done is a project called Helios, a “faceted” search on the Library of Congress records, done very simply and entirely with open-source software. It was a personal project, and although it’s not a complete solution, it searches the LC better than the LC’s million-dollar catalog. You can see Casey in action, talking about Helios, in this Code4Lib talk on Google Video.

Search. Check out the new work and author searches. They’re based on Solr, a simple but powerful search engine, and the same one Casey used on the LC data. Until now, we were relying on MySQL’s full-text capabilities. We had outgrown them, and slow performance was causing frequent database glitches.

The new search is fast, accurate and searches all titles, not just the “leading” (mostly English) ones. But, as with everything we do, it’s not “perfect.” Casey has set up a Talk post about it. He has a variety of knobs he can turn, and is looking for feedback. I’m convinced it’s overzealous on “stemming,” picking up “loves” and “lovely” for “love.” That it even does stemming is quite an improvement over our previous solution. Once we’ve got it working the way we like, we’ll also be adding it to touchstones and elsewhere on the site.
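For the curious, here is a toy sketch of what “stemming” means and why it can overreach. To be clear, this is not the stemmer Solr uses and not our code; the suffix list, the `naive_stem` and `search` functions, and the sample titles are all made up for illustration. It just strips a couple of common endings so that different forms of a word collapse to the same stem.

```python
# Toy suffix-stripping stemmer, for illustration only.
# This is NOT Solr's stemming algorithm; the suffix list is a made-up assumption.
SUFFIXES = ("ly", "s")


def naive_stem(word: str) -> str:
    """Lowercase the word and strip the first matching suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(query: str, titles: list[str]) -> list[str]:
    """Return the titles containing any word whose stem equals the query's stem."""
    target = naive_stem(query)
    return [
        title
        for title in titles
        if any(naive_stem(w) == target for w in title.split())
    ]


# Hypothetical sample data, not drawn from the real index.
titles = [
    "Love in the Time of Cholera",
    "The Lovely Bones",
    "She Loves Me",
    "Glove Story",
]

hits = search("love", titles)
```

Here `hits` holds the first three titles: “Loves” and “Lovely” both reduce to the stem “love,” so a search for “love” matches them too, while “Glove” is left alone. Whether “lovely” should count as a match for “love” is exactly the sort of knob Casey can turn.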

Other projects. Casey is a certified library programmer. (I just play one on TV.) He knows his MARC21 from his UNIMARC, and his “glyphs from his diacritics”.** As time goes by we hope he can work on things like:

  • Rewriting our library-data import, to get all the diacriticals right and squeeze more out of the MARC records.
  • Adding more libraries. We’ve been avoiding UNIMARC libraries (e.g., Italian libraries) and, until recently, most SRU/SRW-based ones. We can do better. We have also finagled access to British Library data, so look for us to add that too.

LibraryThing for Libraries. Right now, we have more than 350 libraries asking us about LibraryThing for Libraries. Altay and I have been going through them at a snail’s pace. Casey should be able to crank that up a bit. We also think his SirsiDynix experience will come in handy. He’s already written a handy LTFL export script for HIP. He’s well known in the HIP ILS community, and should move us past our current success among Innovative Interfaces catalogs.

That’s it. Welcome on board, Casey!

*When it was Dynix.
**Whatever that means; suggested by Casey.
