Archive for the ‘open library’ Category

Wednesday, November 18th, 2009

Your statistics: Ebooks and audiobooks

Our recent ebook push had one major flaw—something was up with the profile statistics page.

That’s been fixed, and the result is stunning. Instead of a few dozen ebooks, most users should see hundreds. My stats, for example, include 15 LibriVox audiobook editions, 45 Project Gutenberg editions and fully 99 Open Library editions, all free.

Check it out:

Labels: audiobooks, ebooks, kindle, open library, statistics

Tuesday, December 11th, 2007

Open data and the Future of Bibliographic Control

We’ve got until December 15th to submit comments on the draft report produced by the Working Group on the Future of Bibliographic Control.

No, keep reading! This is important. People in the library profession need to be involved in this stuff. Further, people outside the profession need to be involved too. As the report notes, library data is used by many outside the library world, starting with library patrons, and extending even to Amazon.com. It shouldn’t go unnoticed, for example, that the draft report mentions LibraryThing four times. For while LibraryThing uses library data, it was invented by and is mostly used by non-librarians.

Aaron Swartz, the dynamo behind Open Library, sent me a note about one important aspect of the draft report, namely what it’s missing: It doesn’t mention open data. There is serious discussion about sharing, but also the alarming proposal that the LC attempt to recoup more money from the sale of its data. That’s a shame. I’m not alone in believing that open access to library data is the future. A report about the future should confront the future.

The economy of library records is a complex one but not primarily a free one. By and large libraries pay the Dublin, Ohio-based OCLC for their records, even if the records were created at government expense. That model looks increasingly dated. And it is killing innovation.

It hasn’t killed LibraryThing yet, but the specter has always hung over our head. It’s why LibraryThing has—so far—not pitched itself to small libraries. OCLC doesn’t care about personal cataloging, and the libraries we use are—in every conversation I’ve had—enthusiastic about what we do. They want their data out there; they’re libraries for Pete’s sake! But if we offered data to public libraries we’d be cutting into the OCLC profit model. That could be dangerous.

Aaron invited me to sign onto a list of people interested in the issue. I did so. I invite you—any of you—to do so as well. The text says it perfectly:

“Bibliographic records are part of our shared cultural heritage and should be made available to the public for re-use without restriction. This will allow libraries to share records more efficiently, but will also make possible more advanced online sites for book-lovers, easier analysis by social scientists, interesting visualizations and summary statistics by journalists and others, as well as many other possibilities we cannot predict in advance.”

“Government agencies and public institutions are increasingly making data open. We strongly encourage the Library of Congress to join this movement by recommending that more bibliographic data is made available for access, re-use and re-distribution without restriction.”

The petition is here: http://www.okfn.org/wiki/OpenBibliographicData .

Labels: library of congress, open data, open library, Working Group on the Future of Bibliographic Control

Monday, October 22nd, 2007

Google in the NYT; Aaron Swartz at Berkman

I just returned home from doing a talk in NYC*, so I only just read the front-page NYT story about libraries spurning Google’s scanning effort, and turning to the Internet Archive and the Open Content Alliance instead.

You may now insert five paragraphs of incisive discussion of this vexed topic. I’ve got opinions a-plenty. But why bother? I can’t do anything about them.

Aaron Swartz, the tech lead for the IA’s Open Library project, is a guy who can. And I’m going to see him tomorrow! Aaron is dropping by the Berkman Center to talk about Open Library.

Berkman scholar and regular on this blog, David Weinberger, gave me a heads-up, and I snagged a spot for myself and for Abby. I’m all keyed-up over it. I was involved in an early Open Library meeting and have followed it closely. Our recently-introduced “Common Knowledge” feature owes something to the Open Library vision, and has given us some insight into the promise and the problems Open Library will face as it grows.

Anyway, the event is at 12:30 Eastern Time. I don’t know if they still have spaces, but the whole thing will be webcast live (directions here), and archived for later viewing.

*NFAIS, it was fun.

Labels: aaron swartz, berkman center, open library, weinberger

Wednesday, October 10th, 2007

Common Knowledge: Social cataloging arrives

Chris has just released Common Knowledge, the innovative, open-data and insanely addictive “fielded wiki” we’ve been talking about for a month.

Common Knowledge adds fields to every author and work, like:

  • Author: Places of residence, Awards and honors, Agent
  • Work: Important places, Character names, Publisher’s editor, Description

All told there are fourteen fields. But Common Knowledge is less a set of fields than a structure for adding fields to LibraryThing. Adding more fields is almost trivial, and they can be added to anything existing or planned, from tags and subjects to bookstores and publishers. They can even be added to other Common Knowledge fields, so that, for example, agents and editors can, in the future, sport photos and contact information.* This can lead to, as Chris puts it, “nearly infinite cross-linking of data.”

Common Knowledge works like a wiki. Any member can add information, and any member can edit or revert edits. All fields are global, not personal. Common Knowledge diverges from a standard wiki insofar as each field works like its own independent wiki page, with a separate edit history.
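
To make the “each field is its own wiki page” idea concrete, here is a minimal sketch in Python. It is not LibraryThing’s actual code; the `Field` class, the `get_field` helper, and the member names and entity ids are all hypothetical, invented only to illustrate a field that keeps its own edit history and can be reverted by any member.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Field:
    """One fielded-wiki field with its own edit history (hypothetical model)."""
    name: str
    # Each revision is (member, value); the last entry is the current value.
    history: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def value(self) -> str:
        """Current value, or an empty string if nobody has edited yet."""
        return self.history[-1][1] if self.history else ""

    def edit(self, member: str, value: str) -> None:
        """Any member may edit; the edit is appended as a new revision."""
        self.history.append((member, value))

    def revert(self, member: str, revision: int) -> None:
        """Re-save an older value as a new revision, so no history is lost."""
        old_value = self.history[revision][1]
        self.history.append((member, old_value))


# Fields are global and can hang off any kind of thing (work, author, tag...).
# The key is (entity kind, entity id, field name); all three are assumptions.
fields: Dict[Tuple[str, str, str], Field] = {}


def get_field(kind: str, entity_id: str, name: str) -> Field:
    """Fetch a field, creating it (with an empty history) on first use."""
    return fields.setdefault((kind, entity_id, name), Field(name))


places = get_field("work", "jonathan-strange", "Important places")
places.edit("tim", "London")
places.edit("chris", "London; Venice")
places.edit("vandal", "")      # a bad edit blanks the field
places.revert("tim", 1)        # any member can restore revision 1

characters = get_field("work", "jonathan-strange", "Character names")
```

Because every field carries its own history, blanking and restoring “Important places” leaves the history of “Character names” untouched, which is the difference from a standard wiki page where all the data would share one edit log.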

Some examples:

  • Jonathan Strange and Mr. Norrell. I’ve been conservative with characters and places. (See Longitude, worked on by Chris, for the opposite approach.) But I wish I had her editor!
  • The history page for “important places” in Jonathan Strange and Mr. Norrell, showing improvement over time.
  • David Weinberger. Half-filled. He mentions his agent, but I can’t find his major at Bucknell and the honors section is empty.
  • Hugo Award Winners. This is going to get very cool.
  • The global history page. Mesmerizing.

Right now we’re basically slapping fields on pages, but this structure is built for reuse. The license is also built for reuse. We’re not asking members to help us create a repository of saleable, private data. Whatever you add to Common Knowledge falls under a Creative Commons Attribution license. So long as you include a short notice (e.g., “Powered by the LibraryThing community”), you can do almost anything you want with the data: take it, change it, remix it, give it to others. You can even sell it, if someone will buy it. Regular people, bookstores, libraries–even our competitors–are free to use it. We’ll be adding APIs to get it out there all the more. Go crazy, people.**

Common Knowledge isn’t the answer to everything. Some data, like web links, requires a more structured approach; some, like our “work” titles, works best when it “bubbles up” from user data; and some, like page counts, have yet to be extracted from the MARC and ONIX information we have. But the possibilities are great. Series information? Blurbers? Cover designers? Books about an author? Tag notes? Other classification schemes?*** Bookstore locations? Publicists? Venues? Book fairs? Pets? Pets’ vaccination dates?

Anyway, we’ve done our thinking, but this is the ultimate member-input feature. We’re going to have to figure it out together. Fields will need to be added (and removed?). Rules will be debated, formatting discussed. Although the base is solid, the feature set is still skeletal.****

Go ahead and play. Chris, John and I spent the evening playing with it, and we guarantee it’s addictive. Or talk about it. Leave a note here. I’ve also changed the WikiThing group into a Common Knowledge and WikiThing group. I’ve started a first-reactions topic and another for bug reports.

Why I’m excited. LibraryThing means a lot of things to a lot of people. Some come for the cataloging, some for the social aspect. A lot come for what happens between those two poles. As I see it, Common Knowledge is the perfect LibraryThing feature. I don’t mean it’s good; I mean it’s in tune with what makes LibraryThing work. It’s social, sure, but it’s based in data. It’s not private cataloging and it’s not MySpace-like “friending.”

LibraryThing is sometimes called a “social cataloging” site. When I used this term at the American Library Association, it became an unintentional laugh line. Social cataloging sounded impossible and funny, like feline water-skiing. This more than anything else got me fired up about doing this. True “social cataloging”; it was an idea that had to be tried!*****

Details, acknowledgements and caveats. Common Knowledge is deeply unstructured. This is going to give some members hives! Names aren’t in first-middle-last format, but free text. You can enter places however you want. We’ve arranged some careful “hint” text, and fields have a terrific “autocomplete” feature, but we’re not validating data and returning hostile error messages. We’re aiming for accessibility and reach, not perfection. This is Wikipedia, not the Library of Congress. It scares us too, but we’re also excited.
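
For the curious, the “autocomplete, not validation” approach can be sketched in a few lines of Python. This is an illustration under assumptions, not the code behind the feature: the `suggest` and `save` functions and the sample place names are hypothetical. The point is only that suggestions come from what members have already typed, while whatever a member enters is accepted as free text.

```python
from collections import Counter
from typing import Iterable, List


def suggest(prefix: str, existing_values: Iterable[str], limit: int = 5) -> List[str]:
    """Return the most common existing values that start with the prefix.

    Matching is case-insensitive; an empty prefix yields no suggestions.
    """
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    counts = Counter(v for v in existing_values if v.lower().startswith(prefix))
    return [value for value, _ in counts.most_common(limit)]


def save(value: str) -> str:
    """Accept whatever the member typed; only trim surrounding whitespace.

    Deliberately no format check and no error message.
    """
    return value.strip()


# Hypothetical values already entered by members for a "places" field.
existing = ["Portland, Maine", "Portland, ME", "Portland, Maine", "Paris"]
hints = suggest("port", existing)
stored = save("  somewhere in Maine ")
```

Ranking hints by how often a value has already been used nudges members toward a common spelling without ever rejecting an entry.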

Abby, Casey, Chris and I planned this feature during the Week of Code. We worked through the issues together, and Casey, Chris and I all wrote the initial code. When we broke up, the rest of the coding and the interface design all fell to Chris. Although it was a team effort, this is really his feature. I’m very pleased with what he did with it.

We decided to work on this (and on our standard wiki, WikiThing, which grew out of it) because it was an ideal project for the entire group to tackle. This jumped it past collections. I still think this was a good idea, but there has certainly been some grumbling. We heard you. Collections is next on our list, with nothing new in between.


*So far we have only three data types—radio buttons (gender), long fields (book descriptions and author disambiguations) and short fields (everything else).
**Competitors who use it might want to stop asserting copyright over everything posted to their site. This was legally bogus already, but it certainly would conflict with a Creative Commons license… Incidentally, we haven’t decided whether to go with CC-Attribution Share-and-Share-Alike or straight CC-Attribution (discussion here), but it’s going to be one or the other.
***This particular one may happen very soon.
****And yes, we can discuss the whole radio-buttons-for-gender topic. See here, here. I’m of the opinion that two genders plus maybe “unknown” and “n/a” (for Nyarlathotep?) are the best you can get without consensus-splitting disagreement. You’ll note we aren’t including other potentially-contentious fields, like sexual orientation or religion.
*****In conception, Common Knowledge most closely resembles the Open Library Project, the Internet Archive’s incipient effort to “wikify” the library catalog. Open Library is also a “fielded wiki,” based on Aaron Swartz’s superior Infogami platform. You’ll notice that we’ve mostly steered clear of the “traditional” cataloging fields that Open Library is starting from. We do cataloging differently, and we don’t want to duplicate effort. Anyway, we’re hoping they and others mash up the two data sets, and others.

Labels: common knowledge, creative commons, fielded wiki, new feature, new features, open library, wiki

Monday, September 10th, 2007

WikiThing: A wiki for LibraryThing

We’ve had the whole team up in Portland, ME, getting to know each other, brainstorming, planning and working on projects. We chose two projects to work on all together. We wanted something that could engage the talents of the whole team.

The first release is WikiThing*, a full-featured wiki for LibraryThing. A wiki is, of course, “a collaborative website which can be directly edited by anyone.” You can use them for lots of things. Wikipedia is an encyclopedia. DiscourseDB tracks published opinion pieces. So what’s WikiThing for?

We’re not sure! But we’re kicking it off with:

  • FAQ. We’ve put our static Frequently Asked Questions pages up on the wiki, where everyone (including us) can edit them. If it works out, we’ll get rid of the static pages, or reduce them to a few questions, and link to WikiThing.
  • Help. We’ve got a few Help pages that aren’t FAQ pages.
  • Bug tracking. This was a tough one. We do not want to move all bug conversations to the wiki. Bug tracking can seem like a simple record, but it is generally a conversation, with questions and answers back and forth. Feature requests are even more so. At the same time, a simple list of bugs, with links to Talk posts, could be a big help for everyone.

What do you want to do with it? Leave a note here or on the Talk: New Features post about ThingWiki.

How do I do it? Editing is super easy. Just go to a wiki page and click the “edit” link at the top, or one of the “edit” links by a section.

WikiThing is based on the MediaWiki engine, the same software that runs Wikipedia. So, if you know how to edit Wikipedia, you know how to edit WikiThing. If you don’t, it’s easy to learn. Mostly you just type. If you need to do something fancy, like insert a link, we have a Wiki help page. If you screw up, don’t worry. Someone else will come along and fix it.

What about a “content” wiki? We thought long and hard about having a “content wiki.” A content wiki would have wiki pages for all works, authors and so forth. It would cover often-requested fields, like the year of original publication for a work and series information, and hitherto unrequested ones, like the name of the acquiring/literary editor. Members would be able to edit them and the edits would get picked up and put on work and author pages.

After a lot of thought and experimentation we decided that MediaWiki wasn’t the right tool for the job***. We needed a true “fielded wiki.” We looked at options like Aaron Swartz’s Python-based Infogami, which also runs Open Library.****

In the end, we decided to do it ourselves, and it turned out easier than we thought.

We’ve got one more day together, and plan to make the most of it. Whether we can finish it up today or not, we should get it out this week.


*I was overruled on the name. I wanted ThingWiki, in keeping with ThingISBN, ThingTitle and so forth. Casey and Chris** were against it.
**The individual formerly known as “Christopher” (ConceptDawg) shall henceforth be known as “Chris.” Although friends call him Chris, we were calling him Christopher because we also had a Chris (Chris Gann), but Chris Gann is long gone, and Chris—the Christopher Chris—wants his name back! Who’s on first?
***We also decided that tools like Semantic MediaWiki and WikiForms weren’t there yet.
****Since Infogami runs ThingDB—yes, he used the name first—we were thinking of calling our product ThingGami!

Labels: fielded wiki, infogami, new feature, new features, open library, wiki, wikithing