Wednesday, September 7th, 2005

Amazon and the Library of Congress: Together at last

Well, I think I’ve come up with the right solution, using Amazon and the LC together. It’s a little complicated, but the complexity is hidden from the user.

It works something like this. First it looks in the LC. If it can’t find the book there (either because it’s not there or because the search didn’t follow LC rules), it goes to Amazon. If it finds the book on Amazon, it makes one last heroic and generally successful effort to find it at the LC, this time using Amazon’s data in an LC lookup.
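For the curious, the lookup order above can be sketched roughly as follows. This is only an illustration in Python, not the site's actual code: `find_book`, `search_lc`, and `search_amazon` are made-up names, the record is a plain dict, and the sketch assumes the piece of Amazon data fed back into the LC is the ISBN.

```python
from typing import Callable, Optional, Tuple

Record = dict  # hypothetical record shape, e.g. {"title": ..., "isbn": ...}
Search = Callable[[str], Optional[Record]]


def find_book(
    query: str, search_lc: Search, search_amazon: Search
) -> Tuple[Optional[Record], Optional[str]]:
    """Return (record, source), or (None, None) if nothing is found."""
    # Step 1: try the LC with the user's query as typed.
    lc_hit = search_lc(query)
    if lc_hit is not None:
        return lc_hit, "LC"

    # Step 2: fall back to Amazon.
    amazon_hit = search_amazon(query)
    if amazon_hit is None:
        return None, None

    # Step 3: one more LC attempt, this time keyed on Amazon's data
    # (assumed here to be the ISBN).
    isbn = amazon_hit.get("isbn")
    lc_retry = search_lc(isbn) if isbn else None
    if lc_retry is not None:
        return lc_retry, "LC via Amazon"

    # No LC record at all: settle for Amazon's.
    return amazon_hit, "Amazon"
```

The second element of the result plays the role of the "where the data came from" note on the Add books screen.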

People who don’t care about LC data can structure their search as loosely as they want and will still end up with LC data most of the time. People who must have LC results can make sure they get them. If you have any doubt, the Add books screen tells you where the data came from for each book.

When there’s LC data, the system tends to prefer it over Amazon data. This is because Amazon plays a bit loose with authors and titles. Authors are first-last sometimes, last-first others. Titles often include the name of the series the book belongs to. The LC is more careful. At the same time, the system always uses Amazon’s date and publication info. This ensures that, although the LC may have an older edition, your info will match the book you clicked.
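The field-by-field preference can be pictured with a small Python sketch. The function name `merge_records` and the field names (`author`, `title`, `date`, `publisher`) are hypothetical, chosen only for illustration; the real records surely carry more fields than this.

```python
from typing import Optional

# Fields taken from the LC when it has them (assumed field names).
LC_PREFERRED = ("author", "title")


def merge_records(lc: Optional[dict], amazon: Optional[dict]) -> dict:
    """Combine an LC record and an Amazon record into one."""
    # Start from Amazon, so date and publication info always come from
    # the edition the user actually clicked.
    merged = dict(amazon or {})
    # Overlay the LC's more careful author and title where present.
    for field in LC_PREFERRED:
        if lc and lc.get(field):
            merged[field] = lc[field]
    return merged
```

With an LC record for an older edition and an Amazon record for the current one, the merged result keeps the LC's author and title but Amazon's date and publisher.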

Inevitably the multiple sources hamper attempts to “match up” equivalent books. Right now it tends to match books up by LC control number (which can embrace two ISBNs) or by ISBN. In the future I’ll be doing a more sophisticated sameness test, involving titles, authors and other data. The same/different issue can never be solved fully, but I’ll try to strike a reasonable balance.
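As a rough picture of the current matching rule, here is a minimal Python sketch. The name `same_book` and the `lccn` and `isbns` fields are assumptions made for the example, and the identifiers in the usage lines are invented; the more sophisticated title-and-author test is deliberately left out.

```python
def same_book(a: dict, b: dict) -> bool:
    """Treat two records as the same book if they share an LC control
    number or, failing that, at least one ISBN."""
    # One LC control number can cover two ISBNs, so check it first.
    if a.get("lccn") and a.get("lccn") == b.get("lccn"):
        return True
    # Otherwise fall back to any ISBN in common.
    return bool(set(a.get("isbns", [])) & set(b.get("isbns", [])))


# Invented identifiers, for illustration only.
hardcover = {"lccn": "lccn-A", "isbns": ["isbn-1"]}
paperback = {"lccn": "lccn-A", "isbns": ["isbn-2"]}
other = {"lccn": "lccn-B", "isbns": ["isbn-3"]}
print(same_book(hardcover, paperback), same_book(hardcover, other))
```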

Confused? Don’t be. I think it works pretty well. Feel free to differ.

No response from Amazon yet. If they insist on freezing data and requiring constant refreshes, I will have to make some changes.
