Monday, September 17th, 2007

Google Book Search … on LibraryThing

Introducing something new we’re calling “Google Book Search Search.”

Google Book Search Search is a bookmarklet that searches Google Book Search for the titles in your LibraryThing library. It works not unlike the famous SETI@Home project. You set it up and it searches Google Book Search slowly in the background.* You can watch, do something in another window or go out for coffee.

When it’s done you can link to and search all the books in your library that Google has scanned. You’ll find a “search this book” link on work pages, and a Google Book Search field to add to the list view in your catalog.

But this isn’t just a selfish thing. There’s a lot of searching to do, and you can help. If you choose, you can pitch in and help with others’ books. All of the data gathered is free and available to everyone. A lot of people want a reliable index of what Google has, not least libraries.

What do I do?

Google Book Search Search is a “bookmarklet.” You save it to your “favorites” or “bookmarks.” Then you go to Google Book Search and you click it. You can see what pops up on the right.*** Press start and it will start collecting information.

Here it is: Google Book Search Search

We’ve tested it on FF and Safari on the Mac, and FF and IE7 and IE5.5 on the PC. We haven’t tested it on PC IE6 yet. I have no idea about Opera.

Why a bookmarklet?

We’ve wanted to do this for a long time. But to link to a book on Google reliably you need its Google ID. For some reason Google doesn’t publish these, making it impossible to tell what they have and what they don’t, and impossible for sites like LibraryThing to send them the traffic they want. Secretive and self-defeating? Seems like it to me.

Efforts have been made to collect Google IDs before. The well-known Lib 2.0 blogger John Blyberg tried, as have others. We tried too. The trick is that Google Book Search—like the rest of Google—has a system in place to stop machine queries.**

Making a bookmarklet distributes the work. And because it takes place within a browser, it tends not to trigger machine-collection warnings.

Ultimately, however, Google can put a stop to this. The bookmarklet has a signature. And Google can send us a note, and we’ll disable the bookmarklets. Just as Google respects the robots.txt file, we’ll respect such a request.

Why not use “My Library”?

Last week Google introduced an interesting “My Library” feature, allowing people with Google accounts to list some of their books. A few tech bloggers saw an attack on LibraryThing.

LibraryThing members were quick to dismiss it. It wasn’t so much the lack of any social features, or of cataloging features as basic as sorting your books. It wasn’t even the privacy issues, although these gave many pause. It was the coverage.

Google just doesn’t have the sort of books that regular people have. Most of their books come from a handful of academic libraries, and academic libraries don’t have the same editions regular people have. Then there are the books publishers have explicitly removed from Google Book Search. Success rates of below 50% were common. Of the books that were found, a high percentage are only “limited preview” or “no preview.”

The Google-kills-LibraryThing meme has another dimension. We WANT people to use Google Book Search. It’s a great tool. Being able to search your own books is useful, and LibraryThing members should be able to do it. Call us naive, but we aren’t going to be able to “pretend Google isn’t there.” And we aren’t convinced that Google is going to create the sort of robust cataloging and social networking features that LibraryThing has.

Our bookmarklet works by transcending ISBNs, using what LibraryThing knows about titles, authors and dates to fetch other editions of a work. In limited tests I’ve found it picks up around 90% of LibraryThing titles.
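
To make the idea concrete, here is a rough sketch in Python of what “transcending ISBNs” could look like. It is not the bookmarklet’s actual code (that is JavaScript, and isn’t reproduced here). The Edition record and the candidate_queries function are made up for illustration, and the use of the isbn:, intitle: and inauthor: search operators is an assumption about how one might phrase the queries.

```python
from dataclasses import dataclass


@dataclass
class Edition:
    """One edition of a work, as a catalog might record it (hypothetical shape)."""
    isbn: str
    title: str
    author: str
    year: int


def candidate_queries(editions):
    """Build search strings for a single work.

    Every known ISBN is tried first. Title-and-author queries follow as a
    fallback, so a scan of a different edition can still be found.
    """
    queries = []
    for edition in editions:
        queries.append(f"isbn:{edition.isbn}")
    for edition in editions:
        queries.append(f'intitle:"{edition.title}" inauthor:"{edition.author}"')
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(queries))


if __name__ == "__main__":
    # Placeholder ISBNs, not real ones.
    work = [
        Edition("0000000001", "Moby Dick", "Herman Melville", 1851),
        Edition("0000000002", "Moby Dick", "Herman Melville", 1992),
    ]
    for query in candidate_queries(work):
        print(query)
```

The point of the fallback is simply that an ISBN identifies one edition, while a title and author identify the work, so a miss on your particular ISBN doesn’t have to be a miss on the book.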

Information wants to be free

Our commitment to open data is long-standing. We’ve railed against OCLC for its desire to lock up book metadata.

But we’re not railing here. We think it’s perfectly fine for Google to control access to the scans it’s made. All we want to do is link to them, to send them traffic. It’s not clear to us that Google is trying to control access to its ID numbers.

You can see and edit the data here. Full XML downloads of the data are also available there.


*Come to think of it, it works like Google.
**The system is overzealous. It often refuses to show me Google Blog Search pages in Firefox because I look at LibraryThing’s blog coverage too much.
***It’s quite amazing what a bookmarklet can do. We could never have done it if Altay hadn’t shown us the way in this sort of JavaScript. The script itself is, however, pretty amateurish, a novice attempt at what Altay did expertly.

As we put on the bookmarklet: “Google and Google Book Search are registered trademarks of Google. LibraryThing is not affiliated in any way with Google or the many libraries that have so generously provided Google with their books and bibliographic metadata, although we share a love of books, a desire to make information as freely available as possible, and similar opinions about evil.”

Labels: features, google, google book search, new feature, new features
