Archive for the ‘Uncategorized’ Category

Thursday, May 3rd, 2007

Combined blog feed available

I used Yahoo Pipes to make a combined feed for this blog and our Thingology blog. It was easy to do, and the result is pretty useful. The three feeds are as follows:

Unfortunately, if you’re already using it, you got this message twice. From now on, I’ll be less likely to cross-post.

I also edited the employee list on the right, to add Altay. He is the magic behind the LibraryThing for Libraries JavaScript, but almost nobody’s seen that yet, so we’re waiting for his first user feature to give him a proper introduction.

Labels: Uncategorized

Wednesday, May 2nd, 2007

Take my wife, please

The Vermont College literary magazine Hunger Mountain is eBaying off writing critiques by its professors and alumni. These include Bret Lott (who wrote Jewel, an Oprah book), poet Maxine Kumin, and prize-winning and exceptionally talented novelist Lisa Carey, my wife. All proceeds go to support the magazine.

We should find some librarians, booksellers and subject specialists to auction off LibraryThing library critiques…

Labels: Uncategorized

Wednesday, May 2nd, 2007

LibraryThing Tag Consortium

Since Abby mentioned our plan for a “tag consortium”—think OCLC, but for tag data, and without licensing restrictions, I get to show the logo I put up at Computers in Libraries:

Spent half the night trying to get the fonts right, instead of writing my talk…

PS: Abby relates that she saw an actual kookaburra, sitting—you guessed it—in a gum tree.

Labels: Uncategorized

Wednesday, May 2nd, 2007

Superpatron advice list

Ed “SuperPatron” Vielmetti has posted an excellent list of “Ten ways for superpatrons to help build better libraries,” for an upcoming talk. Blogs love lists, and this is a good one. (LibraryThing is number five.)

Labels: Uncategorized

Tuesday, May 1st, 2007

Kangaroos! (and the NLA)

I’m finally back from Australia and a whirlwind week of vacation in California, and I’m rearin’ to go!* I had a fantastic time speaking at the National Library of Australia’s Innovative Ideas Forum. It was a lively mix of speakers, including Susan Chun who talked about the steve.museum project—tagging art! Very cool. Canberra is a beautiful city, and the NLA bent over backwards with hospitality (the kangaroo picture I took in the backyard of NLA’s head of IT!). I’d go back in a heartbeat 🙂

My talk focused on tagging and the wiki-like aspects of LibraryThing.** It was an incredibly receptive group—I’d never seen so many librarians clamouring for tags! There was a serious push to get the “tag consortium” that we’ve talked about before up and running. Because, as Tim has mentioned many times before, tags are most useful when they’re available in large quantities. So we’re going to offer up APIs and widgets that will allow libraries both to add tagging to their sites and to access LibraryThing’s 16 million tags.***

The white building in the background is the National Library of Australia. I couldn’t resist the rainbow picture!

update: I forgot! I promised Fiona (who called me “the biggest geek there”) that I’d bring attention to the group she created right after the program, Aussie Librarians. So go join!

Now, to get through all the email piled up in my inbox…

*Advice for life: having a vacation back-to-back with a business trip is exhausting. Particularly when time-travel is involved—Tuesday the 17th never happened for me, but Saturday the 21st I lived through twice. One 4/21/07 I spent walking around Sydney harbor (thanks to a 10 hour flight delay), and then I repeated the day on the plane… I have a newfound respect for that Groundhog Day movie…
**Stop and think about how much of LibraryThing is done by its members (i.e., you). You add books, of course, but also tags, reviews, and ratings. You upload cover images of books, photos and pictures of authors. You combine tags, author names, and works. You add links and disambiguation notices to author pages. And you’ve translated the site into 30 different languages. That, folks, is nothing short of incredible. The helpers page on the Zeitgeist, if you haven’t seen it, is an addictive chronicle of all this work.
***LibraryThing for Libraries is certainly a big step in adding tagging to library catalogs (we’ve got tag-based browsing widgets up and testing now, and are working on widgets that will allow patrons to add tags).

Labels: Uncategorized

Monday, April 30th, 2007

LibraryThing for Libraries launches

We’ve launched the LibraryThing for Libraries demo site. After CIL we pushed everything back a week to work on speed, add fielded imports, and make some interface changes to the tag browser.*

Here’s the demo site: http://www.librarything.com/forlibraries/

So far we have about two dozen libraries and consortia interested enough to send us ISBNs. Over the next few days we’ll be getting back to people with directions on testing the service out.

Sad to say, but we’re still trying to figure out pricing. Here’s my thinking, which ends in aporia.

  • It seems right to tie the price to the number of ISBNs that LibraryThing can potentially enhance. For public libraries, this is about 50-70% of ISBNs. For academic libraries it’s more like 25-50%. So, my thought was to make it $.02 for the first 25,000 ISBNs, and $.01 after that. (The two levels try to get at the shape of interest in a given ISBN; it’s more valuable to enhance Harry Potter than some obscure book.)
  • So, a small city in New England (pop. 75,000), has 84,612 ISBNs. 57,312 (67%) are enhanceable by LibraryThing. That comes to $823/year. That seems like a very good deal.
  • Clearly a consortium needs to pay more than a single library with the same number of ISBNs. After all, the consortium will have multiple copies of the item spread around the various consortium members. But a consortium of fifty libraries won’t actually have fifty copies of every ISBN, and there ought to be some “bulk” savings for them anyway.
  • This led me to charging consortia a multiple of the square root of the number of members. So, for example, a library with 284,742 enhanceable ISBNs would pay $3,097, and an identical consortium with 28 members would pay $3,097 x SQRT(28) = $16,390.
  • Then you have the “branch” problem. A large city signed up for a beta test. They have 270,002 ISBNs, which comes to $2,950. But they have some 30 branches, a population of 600,000, and a library budget of $30 million! This doesn’t work.
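
The arithmetic in the bullets above can be sketched in a few lines of Python. This is only an illustration of the numbers in this post, not a price list: the function and constant names are made up, and the consortium multiplier is taken to be exactly the square root of the member count, as in the 28-member example.

```python
import math

# Figures from the post; the names are hypothetical.
FIRST_TIER_SIZE = 25_000   # ISBNs billed at the higher rate
FIRST_TIER_CENTS = 2       # $.02 per ISBN
SECOND_TIER_CENTS = 1      # $.01 per ISBN after the first tier


def library_price(enhanceable_isbns: int) -> float:
    """Yearly price in dollars for a single library."""
    first = min(enhanceable_isbns, FIRST_TIER_SIZE)
    rest = max(enhanceable_isbns - FIRST_TIER_SIZE, 0)
    return (first * FIRST_TIER_CENTS + rest * SECOND_TIER_CENTS) / 100


def consortium_price(enhanceable_isbns: int, members: int) -> float:
    """Single-library price scaled by the square root of the member count."""
    return library_price(enhanceable_isbns) * math.sqrt(members)


if __name__ == "__main__":
    print(round(library_price(57_312)))           # small New England city: 823
    print(round(library_price(284_742)))          # single large library: 3097
    print(round(consortium_price(284_742, 28)))   # same ISBNs, 28 members: 16390
    print(round(library_price(270_002)))          # the 30-branch city: 2950
```

The last line shows the “branch” problem in numbers: the formula only sees ISBNs, so thirty branches and a $30 million budget make no difference to the price.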

So, I think I need to get total collection or circulation figures, and multiply them by the percentage of ISBNs we can enhance.

I wish we could expand our pay what you want program…

(Money photo courtesy Jessica Shannon on Flickr, under Attribution-ShareAlike 2.0)

*Among other things, we normalized ISBNs, moving from storing a 13-character string in every table that needed them, to storing a four-byte integer tied to a table that mapped the integers to ISBN. Normalizing textual data happens all the time here, but normalizing something already so compact and inherently unique was forced on us by the dawning realization that we’re going to be handling dozens or hundreds of millions of bibliographic records. So now LTFL tosses around arbitrary ISBN keys like mad, without ever knowing what ISBN they represent. O brave new world…
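
For the curious, here is a minimal sketch of that kind of normalization, using Python and an in-memory SQLite database. The table and column names are invented for the example and are not LibraryThing’s actual schema. SQLite also sizes integers on its own, so the “four-byte” part is not reproduced here.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create the lookup table plus one table that stores only integer keys."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE isbn_keys (id INTEGER PRIMARY KEY, isbn TEXT NOT NULL UNIQUE)"
    )
    db.execute(
        "CREATE TABLE holdings ("
        "library_id INTEGER NOT NULL, "
        "isbn_id INTEGER NOT NULL REFERENCES isbn_keys(id))"
    )
    return db


def isbn_key(db: sqlite3.Connection, isbn: str) -> int:
    """Return the integer key for an ISBN, creating it on first sight."""
    db.execute("INSERT OR IGNORE INTO isbn_keys (isbn) VALUES (?)", (isbn,))
    row = db.execute("SELECT id FROM isbn_keys WHERE isbn = ?", (isbn,)).fetchone()
    return row[0]


def add_holding(db: sqlite3.Connection, library_id: int, isbn: str) -> None:
    """Record that a library holds an ISBN, storing only the integer key."""
    db.execute(
        "INSERT INTO holdings (library_id, isbn_id) VALUES (?, ?)",
        (library_id, isbn_key(db, isbn)),
    )


def isbns_for(db: sqlite3.Connection, library_id: int) -> list:
    """Join back to the lookup table only when the real ISBN is needed."""
    rows = db.execute(
        "SELECT k.isbn FROM holdings h JOIN isbn_keys k ON k.id = h.isbn_id "
        "WHERE h.library_id = ? ORDER BY k.isbn",
        (library_id,),
    )
    return [r[0] for r in rows]
```

Everything outside `isbn_keys` passes the small integer around, and only the final join turns it back into an ISBN.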

Labels: Uncategorized

Thursday, April 26th, 2007

Bugs, New York, Radio

Today was a full day—New York, radio, publishers and bug-fixing. In reverse order:

Bug fixing. I finally slew the bug that sent work copies off into la-la land. I also found why book-swap data was screwed up. It turns out Bookmooch’s data feed is now too large for PHP’s 40MB default memory space, and this was short-circuiting other feeds. Wow—way to go Bookmooch. I increased the limit to 80MB until I can rewrite the import to load the data in pieces, and reloaded everything. I also fixed a matching algorithm, so that http://www.librarything.com/title/the_perfect_store goes to The Perfect Store, not The Great Gatsby. I’ll be working the rest of the night, except when I have to put my laptop through the x-ray.
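
“Loading the data in pieces” might look something like the sketch below. It is Python rather than PHP, and it assumes a feed with one record per line, which may not be how the Bookmooch feed is actually laid out. The function names are invented for the example.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def in_batches(records: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` records without reading everything first."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load_feed(path: str, handle_batch: Callable[[List[str]], None],
              size: int = 1000) -> None:
    """Stream a line-per-record feed file and hand it over one batch at a time."""
    with open(path, encoding="utf-8") as feed:
        lines = (line.rstrip("\n") for line in feed)
        for batch in in_batches(lines, size):
            handle_batch(batch)
```

Memory use then depends on the batch size rather than on the size of the feed, so a growing feed no longer runs into the limit.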

Publishers. I gave a talk to the Association of American Publishers. Twenty minutes is too little time, starting from zero and trying to get to what’s “happening” with social software and books. But I think I got across the central message—(1) I’m crazy*, (2) LibraryThing is orders of magnitude larger and more interesting than its competitors, (3) stop marketing at people and get in the conversation, (4) get involved with LibraryThing.

It certainly would be nice if the publishing world were as friendly to LibraryThing as the library world.

New York. I flew into JFK this morning (6am departure, ouch!). I was there on business, but, since I work all day long, I don’t feel guilty spending the afternoon at The Strand, diligently confirming they do, in fact, have eighteen miles of books. Today’s haul: Richard Westfall’s The Life of Isaac Newton and Adam Cohen’s The Perfect Store: Inside eBay.

Radio. I appeared on Public Radio International’s Radio Open Source. They were doing a show on David Weinberger’s upcoming book Everything is Miscellaneous: The Power of the New Digital Disorder. I’ve blogged about David and his book before. To repeat: It’s excellent. Weinberger, a true Miscellaneous Man**, explores how digitization and mass-collaboration, -filtering and -classification (e.g., tagging) are changing knowledge, and its relation to authority. After an introduction with David, host Christopher Lydon brought in super-librarian Karen Schneider, then me, to chime in on the topic.

I pointed out how tagging worked for tags like chick lit, queer, glbt and lgbt. I also tried to get at a nagging issue for me—does “knowledge” change, or do we just get new perspectives and ways of getting at it? I’m happy to see the realm of debate, uncertainty, personal choice and personal understanding expand—for us to “swim in the complex,” as David writes. But I won’t give up on a small, hard (Pluto-like?) core of truth. More on that later.

OpenSource streams at 7pm tonight. After that, the audio—direct or podcast—will be available here.
*I love explaining to people that LibraryThing has no advertising or funded promotions, and doesn’t push affiliate links, but is profitable. On a more personal note, it was unreal being back among “publishing types.” I never mentioned it, but I used to work at Houghton Mifflin. I felt at home in uncomfortableness, as it were.
** Who else has a PhD in Philosophy and wrote jokes for Woody Allen? He’s more varied than my junk drawer!

Labels: Uncategorized

Tuesday, April 17th, 2007

5¢/patron, $1/student

For a while now, libraries have been approaching us about whether LibraryThing would sell them bulk memberships—so all their patrons could, potentially, become members. Today at CIL two more people asked. Time to act.

From now on if a public library or a college or university wants to buy memberships for everyone in a community, it’s 5¢/patron, $1/student.

The math is easy. If a town wants to give out free LibraryThing memberships, and they have 20,000 patrons—defined as working library cards—they would pay $1,000. If a college or university wants it, they pay $1 for every student, grad and undergrad—profs. and staff ride for free. The library gets a stack of membership cards, each with a unique code, good for a year’s membership from the date of activation.

Details:

  • Patron cards would have to be given out in person, not over the phone.
  • Student accounts would require email confirmation to a valid school email (like Facebook)
  • Communities may elect to set up a group. Members would get an automatic invite for that group.
  • We will work to make sure LibraryThing links to and collects data from the institution in question. The latter requires an open Z39.50 connection.

If interested, write tim@librarything.com.

Labels: Uncategorized

Sunday, April 15th, 2007

Tim to CIL and the Library of Congress, Abby to Australia

UPDATE: If you’re in DC and want to come to CIL (Librarian? Enjoy vendor-tchotchkes?), I have 50 free tickets. I’m supposed to give them out to my important vendors and clients. That’s you. Email me and we’ll figure out how to do this. I’ll probably leave a stack at the closest Starbucks.

I’m off to Computers in Libraries in Washington, DC. LibraryThing will have a booth there, and I’m giving two talks:

  • Tuesday, 1:30-2:30. “Cutting Edge Leaders.”* One whole hour of me, giving my general talk about what LibraryThing is and what it means, amped-up for savvy CILers.
  • Wednesday. 11:30-12:15. “Catalogs/OPACs for the Future,” with me and Roy Tennant. I’ll probably do LibraryThing for Libraries.

I’ll be showing LibraryThing for Libraries at the talks. Unfortunately, it’s just me, so I’m going to be torn between manning the booth and going to all the great talks. I’m bringing along forty CueCat barcode readers. Free? No. I’ll be giving them out at cost—$5.

On Thursday I’m doing a talk at the Library of Congress. I am completely psyched. It’s not open to the public, but they said I could sneak in a friend or two.

Also on Thursday, Abby will be in Australia at the Innovative Ideas Forum, hosted by the National Library of Australia.**

On Friday*** I’ll be the closing keynote at Digital Odyssey 2007, hosted by the Faculty of Information Studies, University of Toronto. I’m talking about “Social Cataloging and the ‘Fun’ OPAC?” I put them in myself, but I want to remove those quotes.

*Apparently I am one, because it’s just me and I’m not planning to talk about the others.
**Synchronicity. We have tried and failed to find national libraries for the other LibraryThing employees to talk at on the same day. If you represent such a library, please contact us.
***Portland->Boston->DC->Toronto->DC->Portland. Gulp.

Labels: Uncategorized

Saturday, April 14th, 2007

LibraryThing for Libraries: How it works / The five-second rule

The LibraryThing for Libraries widgets have a unique architecture. You install them on your OPAC’s HTML pages, but the OPAC doesn’t “do anything.” All the work takes place in browser JavaScript requests to the LibraryThing for Libraries servers. Only when the patron clicks on a specific book does the library OPAC come into the picture again.

Your creaky OPAC can rest easy. All the database work and the statistical number-crunching that makes something like recommendations or tag browsing possible takes place elsewhere. You get beefy new functionality without a single extra OPAC request. (Of course, we think using a LibraryThing-enhanced catalog will be so fun—we don’t mean that ironically—that patrons will spend more time browsing it.)

*BUT* before LibraryThing can take the work off your hands, it needs to know what ISBNs you have. So we ask for an export with ISBN data, and accept any format your OPAC makes.* And if a link to a book is to display the same title and author given in your OPAC, it needs to get them. Exporting and uploading titles and authors is impracticable. There are dozens of possible formats to parse, and anything that complicates the export process will limit our potential user-base. LibraryThing for Libraries needs to be dirt-simple. It needs to be people-who-don’t-even-know-HTML simple.

So, LibraryThing for Libraries hits your OPAC to collect titles and authors, “screen scraping” the pages. The question is: How fast can it go?

Good question, and one we’ve struggled with. In the search-engine industry, the standard maximum is one request/second. Google, Yahoo, AskJeeves, MSN (who?) and their peers use that as their benchmark, although you can ask them to speed up or slow down using standards like robots.txt. And they’ll do it all day long every day, and obviously without regard for how many others are hitting you too. In March LibraryThing was visited by 71 registered “bots.” The greediest, Google, hit us 11,338,467 times–an average of 4 times/second–and took almost 200GB. As our total bandwidth was 650GB, you can understand why Google sometimes seems a bit, er, codependent.

Anyway, I wrongly believed that most OPACs could handle 1/second. After all, the libraries who’ve contacted us all have systems that cost hundreds of thousands or millions of dollars. And most have unspiderable “sessions,” so LibraryThing wouldn’t be competing with Google and its ilk.

Apparently I was wrong. Until Thursday, the requests were sporadic or round-robin-ed, so the effective time between requests was more than a second. Thursday afternoon we threaded the process, so they could run mostly continuously and concurrently. This morning I heard back that LibraryThing was taking too much from one OPAC, and slowing performance. Yipes! The system in question served a consortium of more than 25 libraries, so one can expect it isn’t the slowest, worst OPAC out there! We yanked the spidering. They took it well, even so. We owe them.

So, the new rule will be one request/five seconds max. And I’ll add a rule that measures how long each document took to come in and waits a multiple of that, so any performance issue is adjusted for in real time. The LibraryThing for Libraries interface–not yet publicly available–allows libraries to speed up or slow down the process. “Slow” will reduce it to one request every 10 seconds; “fast” will increase it to one every 2 seconds.
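
Here is one way the rule could be sketched in Python. It is not the actual LibraryThing for Libraries spider. The 10-, 5- and 2-second floors come from this post, but the multiplier of 3, the “normal” label and all the function names are assumptions made for the example.

```python
import time
from typing import Callable, Iterable, List

# Minimum seconds between requests for each setting (from the post).
# "normal" is a made-up label for the default five-second rule.
BASE_DELAY = {"slow": 10.0, "normal": 5.0, "fast": 2.0}

# Assumed: the post only says "a multiple" of the response time.
RESPONSE_MULTIPLE = 3.0


def next_delay(response_seconds: float, speed: str = "normal",
               multiple: float = RESPONSE_MULTIPLE) -> float:
    """Wait at least the base delay, and longer if the OPAC answered slowly."""
    return max(BASE_DELAY[speed], response_seconds * multiple)


def polite_fetch_all(urls: Iterable[str], fetch: Callable[[str], str],
                     speed: str = "normal",
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Fetch pages one at a time, pausing after every request."""
    pages = []
    for url in urls:
        started = clock()
        pages.append(fetch(url))       # `fetch` is supplied by the caller
        elapsed = clock() - started    # how long the OPAC took to respond
        sleep(next_delay(elapsed, speed))
    return pages
```

With these numbers, an OPAC that answers in half a second is hit every five seconds, while one that takes four seconds to answer gets a twelve-second rest before the next request.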

The new speed will mean longer waits before a library can see LibraryThing for Libraries in action. In our experience, we run about 50% coverage on US publics, so a 250,000-ISBN library will have 125,000 overlapping ISBNs and take a week for us to fetch all titles and authors. With almost three million ISBNs in LibraryThing already, we can show a library what the widgets will look like beforehand, so long as they understand the titles may not match theirs exactly.
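
The “week” figure is easy to check. The small Python sketch below assumes one OPAC request per overlapping ISBN and a spider that runs around the clock. The function name is made up.

```python
def crawl_days(total_isbns: int, coverage: float, seconds_per_request: float) -> float:
    """Days needed to fetch one page per overlapping ISBN at a fixed rate."""
    overlapping = total_isbns * coverage
    return overlapping * seconds_per_request / 86_400   # seconds in a day


if __name__ == "__main__":
    print(round(crawl_days(250_000, 0.50, 5), 1))   # 7.2
```

At the “fast” and “slow” settings the same library would take about 2.9 and 14.5 days.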

We thank the dozen libraries who are participating in our initial tests of the system. We think everyone is going to be impressed with the result. We got the tag-browsing widget working last night, and it’s absolutely fantastic. Altay, our JavaScript guru, is outdoing himself. And I celebrated with a big hunk of brie. I can’t wait to finish it up and show it off at CIL and the Library of Congress next week.

*This is possible because ISBNs aren’t just numbers, but numbers with structure. They are either ten characters long (nine digits, then a digit or an X) or thirteen digits starting with 978 or 979.** And the last digit is a checksum–a calculation based on the others. So ISBN 0747532699 is the first British edition of Harry Potter and the Philosopher’s Stone, now selling for upwards of $1,000. But change a digit and you don’t get another book, but an error. The checksum won’t work. If anything bad slips through, running the ISBNs against LibraryThing’s books tosses them out.
**i.e., ([0-9]{9}[0-9X]|(978|979)[0-9]{10}) in regular-expression land, where I live.
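
To make the footnotes concrete, here is a small Python sketch that combines the regular expression above with the checksum test. The checksum rules are the standard ISBN-10 (mod 11) and ISBN-13 (mod 10) ones, which the post doesn’t spell out. The function names are invented, and this is not the code LibraryThing for Libraries runs.

```python
import re

# The pattern from the footnote, anchored so the whole string must match.
ISBN_PATTERN = re.compile(r"^([0-9]{9}[0-9X]|(978|979)[0-9]{10})$")


def isbn10_ok(isbn: str) -> bool:
    """ISBN-10: weights 10 down to 1, X counts as 10, sum divisible by 11."""
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(isbn))
    return total % 11 == 0


def isbn13_ok(isbn: str) -> bool:
    """ISBN-13: weights alternate 1, 3, 1, 3..., sum divisible by 10."""
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
    return total % 10 == 0


def is_valid_isbn(text: str) -> bool:
    """Strip hyphens and spaces, check the shape, then check the checksum."""
    isbn = text.replace("-", "").replace(" ", "").upper()
    if not ISBN_PATTERN.match(isbn):
        return False
    return isbn10_ok(isbn) if len(isbn) == 10 else isbn13_ok(isbn)


if __name__ == "__main__":
    print(is_valid_isbn("0747532699"))   # the Harry Potter ISBN: True
    print(is_valid_isbn("0747532698"))   # last digit changed: False
```

Anything that passes both tests can still be a real-looking ISBN for a book nobody has, which is why matching against LibraryThing’s own books is the last filter.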

Labels: Uncategorized