Archive for the ‘google book search’ Category

Thursday, March 13th, 2008

Google Books in LibraryThing

The official Google Blog and the Inside Book Search Blog just announced the new Google Book Search API, with LibraryThing as one of the first implementors. (The others are libraries; I’ll be posting about what they’ve done over on Thingology.)

In sum, LibraryThing now links to Google Books for book scans—full or partial—and book information.

Google Book Search links can be seen in two places:

  • In your catalog. Choose “edit styles” to add the column. The column reflects only the exact edition you have.
  • On work pages. The “Buy, borrow, swap or view” box on the right now includes a Google Books section. Clicking on it opens up a “lightbox” showing all the editions LibraryThing can identify on Google Book Search.

Despite the screenshot of Carroll’s Through the looking glass and what Alice found there, relatively few works have “full” scans. “Partial view” and “book information” pages are more common. But the former generally include the cover and table of contents, and the whole text can be searched. The latter can also be useful for cataloging purposes. Members with extensive collections from before 1923 (the copyright cutoff) will get relatively more out of the feature.

Leave comments here, or come discuss the feature on Talk.

Limitations. The GBS API is a big step forward, but there are some technical limitations. Google data loads after the rest of the page, and may not be instant. Because the data loads in your web browser, with no data “passing through” LibraryThing servers, we can’t sort or search by it, and all-library searching is impossible. You can get something like this if you create a Google Books account, which is, of course, the whole point.

LCCN and OCLC. To get the best results, we needed to add full access to two library standards, namely Library of Congress Control Numbers (LCCN) and OCLC Numbers. We did so, reparsing the original MARC records where necessary. You can see these columns in your catalog now—choose “edit styles” as above. The two columns are not yet editable, but will be so in a day or two.
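To make the point about identifiers concrete, here is a minimal sketch of how an edition's lookup keys might be assembled once LCCN and OCLC numbers sit alongside the ISBN. The function name and the "ISBN:", "LCCN:" and "OCLC:" prefix format are assumptions for illustration; this post does not describe the actual request format.

```python
def build_lookup_keys(isbn=None, lccn=None, oclc=None):
    """Return prefixed identifier keys for one edition.

    The "ISBN:/LCCN:/OCLC:" prefixes are a hypothetical convention,
    not something taken from the post.
    """
    keys = []
    if isbn:
        # ISBNs are often printed with hyphens or spaces; strip them.
        keys.append("ISBN:" + isbn.replace("-", "").replace(" ", ""))
    if lccn:
        keys.append("LCCN:" + lccn.strip())
    if oclc:
        keys.append("OCLC:" + oclc.strip())
    return keys


print(build_lookup_keys(isbn="0-451-52653-8", oclc="12345"))
```

The more identifiers an edition carries, the more chances there are to match it, which matters most for older books that predate ISBNs.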

The Back Story. The rest of the first batch are libraries, including a number of “friends”: Deschutes Public Library, the Waterford Institute of Technology, the University of Huddersfield and Plymouth State/Scriblio. Google wanted help finding potential implementors, and if there’s one thing I have it’s a Rolodex of smoking-hot library programmers! Once I’ve taken in all the neat things they did, I’ll be posting over on Thingology.

Some libraries have chosen to feature Google Book Search links only when Google has the full scan. This makes sense to me. Linking to a no-scan or partial-scan page, when the library has the item on its shelves, seems weird to me.
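That display rule can be sketched in a few lines. This is only an illustration: the function name and the "full" / "partial" / "none" values are hypothetical, and showing lesser pages when the item is not held locally is my own extrapolation rather than anything a library has told me.

```python
def should_show_gbs_link(viewability, held_locally):
    """Decide whether to display a Google Book Search link for an item.

    viewability: "full", "partial" or "none" (hypothetical values).
    held_locally: True if the library has the item on its shelves.
    """
    if viewability == "full":
        return True
    # Partial or info-only pages add little when the book is on the shelf.
    return not held_locally


print(should_show_gbs_link("partial", held_locally=True))
```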

LibraryThing and its members can also take credit for moving the API along in another way. Your help with the Google Book Search Search bookmarklet forced the issue of GBS data. The message to Google was clear: our members wanted to use GBS with LibraryThing, and if Google wouldn’t provide the information, members would get it themselves. After some to-and-fro with Google, we voluntarily disabled the service. But I think it moved the openness ball a few feet, and that’s something for members to be proud of.

Labels: gbs, google, google book search

Thursday, September 20th, 2007

Link LibraryThing accounts to Google?

As I said in my talk post, we have spoken to Google about how to link and search Google Book Search reliably and effectively from LibraryThing.

Unfortunately, I am not at liberty to discuss much more than that. I can say that there is no substance to the rumor that Google is re-engineering CueCats to beam targeted advertisements onto your bedroom wall. I am also able to concede that the press accurately reported how Larry and Sergey beat me at drunken thumb-wrestling. But I cannot comment on whether Abby, sober and wielding a hitherto-unnoticed sixth finger, restored LibraryThing’s honor.

Here’s a hypothetical proposal. We could basically do this now, without Google’s help. And maybe Google could help.

Imagine if LibraryThing members could search across their books using Google BookSearch. That would be great, right?

But to do it, members would have to link their books to their Google account, connecting what they’ve cataloged on LibraryThing to the account that unites GMail, Blogger, Google Reader, Google Talk, Orkut, and the rest. And, by doing this, they would also connect their reading to their Google search history.

If this were to happen, connecting your LibraryThing and Google accounts would be voluntary, but searching your library all together would require that link, and require Google having all of your books from LibraryThing. I’m not sure what, if anything, Google would do with this information—perhaps nothing—but the option would be there.

What do people feel about this? Would you do it? Would allowing some absolutely private books to stay on LT help? What would make this work or not work?

Labels: drunken thumb wrestling, google book search, privacy, sexdactyly

Tuesday, September 18th, 2007

Google Book Search Search…

I am voluntarily and temporarily suspending Google Book Search Search, our effort to distribute the task of collecting Google Books IDs for LibraryThing members through a browser “bookmarklet.”

I am talking with Google about some other approaches by which they might be able to simplify the process of linking to Book Search pages. Google has communicated their desire to make it easy for sites like LibraryThing and libraries to link to Google books appropriately and successfully.

The GBSS bookmarklet showed the power of the LibraryThing community. In about a day more than 1,500 LibraryThing members (and many non-members) installed the bookmarklet and collected GBS link data for over 253,000 of their own and others’ books. If we had solved more of the browser issues, I’m sure we would have collected many more.

The links members discovered will be kept, and the data is available. We will be adding new tools for members to edit and add Google book ID information by hand, if necessary.

As you may guess, we are going to be doing some listening, some talking and some thinking. I would be grateful for your continued support as we work through this.

Labels: features, google book search

Monday, September 17th, 2007

Google Book Search … on LibraryThing

Introducing something new we’re calling “Google Book Search Search.”

Google Book Search Search is a bookmarklet that searches Google Book Search for the titles in your LibraryThing library. It works not unlike the famous SETI@Home project. You set it up and it searches Google Book Search slowly in the background.* You can watch, do something in another window or go out for coffee.

When it’s done you can link to and search all the books in your library that Google has scanned. You’ll find a “search this book” link on work pages, and a Google Book Search field to add to the list view in your catalog.

But this isn’t just a selfish thing. There’s a lot of searching to do, and you can help. If you choose, you can pitch in and help with others’ books. All of the data gathered is free and available to everyone. A lot of people want a reliable index of what Google has, not least libraries.

What do I do?

Google Book Search Search is a “bookmarklet.” You save it to your “favorites” or “bookmarks.” Then you go to Google Book Search and you click it. You can see what pops up on the right.*** Press start and it will start collecting information.

Here it is: Google Book Search Search

We’ve tested it on FF and Safari on the Mac, and FF and IE7 and IE5.5 on the PC. We haven’t tested it on PC IE6 yet. I have no idea about Opera.

Why a bookmarklet?

We’ve wanted to do this for a long time. But to link to a book on Google reliably you need its Google ID. For some reason Google doesn’t publish these, making it impossible to tell what they have and what they don’t, and impossible for sites like LibraryThing to send them the traffic they want. Secretive and self-defeating? Seems like it to me.

Efforts have been made to collect Google IDs before. The well-known Lib 2.0 blogger John Blyberg tried, as have others. We tried too. The trick is that Google Book Search—like the rest of Google—has a system in place to stop machine queries.**

Making a bookmarklet distributes the work. And because it takes place within a browser, it tends not to trigger machine-collection warnings.

Ultimately, however, Google can put a stop to this. The bookmarklet has a signature. And Google can send us a note, and we’ll disable the bookmarklets. Just as Google respects the robots.txt file, we’ll respect such a request.

Why not use “My Library”?

Last week Google introduced an interesting “My Library” feature, allowing people with Google accounts to list some of their books. A few tech bloggers saw an attack on LibraryThing.

LibraryThing members were quick to dismiss it. It wasn’t so much the lack of any social features, or of cataloging features as basic as sorting your books. It wasn’t even the privacy issues, although these gave many pause. It was the coverage.

Google just doesn’t have the sort of books that regular people have. Most of their books come from a handful of academic libraries, and academic libraries don’t have the same editions regular people have. Then there are the books publishers have explicitly removed from Google Book Search. Success rates of below 50% were common. Of these a high percentage are only “limited preview” or “no preview.”

The Google-kills-LibraryThing meme has another dimension. We WANT people to use Google Book Search. It’s a great tool. Being able to search your own books is useful, and LibraryThing members should be able to do it. Call us naive, but we aren’t going to be able to “pretend Google isn’t there.” And we aren’t convinced that Google is going to create the sort of robust cataloging and social networking features that LibraryThing has.

Our bookmarklet works by transcending ISBNs, using what LibraryThing knows about titles, authors and dates to fetch other editions of a work. In limited tests I’ve found it picks up around 90% of LibraryThing titles.
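For the curious, here is a rough sketch of what “transcending ISBNs” can look like: build a key from a normalized title and author surname, so that different editions of the same work land together. This is not the bookmarklet’s actual code. The function names and record fields are hypothetical, and dates are left out of the key for simplicity.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def author_surname(author):
    """Handle both 'Lewis Carroll' and 'Carroll, Lewis'."""
    if "," in author:
        author = author.split(",", 1)[0]
    words = normalize(author).split()
    return words[-1] if words else ""


def work_key(title, author):
    """Edition-independent key: (normalized title, author surname)."""
    return (normalize(title), author_surname(author))


def group_editions(records):
    """Group edition records (dicts with 'title', 'author', 'isbn') by work."""
    groups = defaultdict(list)
    for record in records:
        groups[work_key(record["title"], record["author"])].append(record["isbn"])
    return dict(groups)


editions = [
    {"title": "Through the Looking-Glass", "author": "Lewis Carroll", "isbn": "A"},
    {"title": "THROUGH THE LOOKING GLASS", "author": "Carroll, Lewis", "isbn": "B"},
]
print(group_editions(editions))
```

A key this loose will sometimes merge distinct works with the same title and surname, which is one reason dates help as a tiebreaker.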

Information wants to be free

Our commitment to open data is long-standing. We’ve railed against OCLC for its desire to lock up book metadata.

But we’re not railing here. We think it’s perfectly fine for Google to control access to the scans it’s made. All we want to do is link to them, to send them traffic. It’s not clear to us that Google is trying to control access to its ID numbers.

You can see and edit the data here. Full XML downloads of the data are also available there.


*Come to think of it, it works like Google.
**The system is overzealous. It often refuses to show me Google Blog Search pages in Firefox because I look at LibraryThing’s blog coverage too much.
***It’s quite amazing what a bookmarklet can do. We could have never done it if Altay hadn’t shown us the way in this sort of Javascript. The script itself is, however, pretty amateurish, a novice attempt at what Altay did expertly.

As we put on the bookmarklet: “Google and Google Book Search are registered trademarks of Google. LibraryThing is not affiliated in any way with Google or the many libraries that have so generously provided Google with their books and bibliographic metadata, although we share a love of books, a desire to make information as freely available as possible, and similar opinions about evil.”

Labels: features, google, google book search, new feature, new features