Archive for the ‘languages’ Category

Wednesday, June 24th, 2009

Reviews in many languages

I’ve added a bunch of features around the language that members write reviews in.

Reviews by language. The result is to make LibraryThing more attractive for non-English users—they now get reviews in their own language by default. A few languages, especially our Dutch, French and German sites, already have a decent number of reviews, and this should make it more fun for all non-English users to review books.

For English-only members, the benefit is mostly subtractive: it’s now easy to screen out the clutter of reviews in languages you don’t understand.

Most popular works have reviews in other languages. Something like The Da Vinci Code has reviews in thirteen languages, including twelve in Dutch, three in Swedish, two in Catalan and one in Greek! (“Un dels millors llibres que he llegit mai”, “Το λάτρεψα”; maybe it’s better in translation!)

Reviews uClassified. Most reviews have already been assigned to a language. Rather than use the default language in LibraryThing profiles, which turns out to be only very weakly related to the language members write their reviews in, I took advantage of the excellent language classification service offered by uClassify (uClassify.com). uClassify runs a Bayesian filter on a piece of text and sends back a list of languages with confidence scores.

It isn’t perfect, but it’s pretty good. Only very high scores were accepted as definitive. Short reviews weren’t sent for classification, for the same reason. As a result, about 1/8 of LibraryThing’s 730,267 reviews remain as “not set.”
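For the curious, the acceptance rule described above can be sketched roughly as follows. This is a minimal illustration, not the code LibraryThing runs: the cutoffs `MIN_CONFIDENCE` and `MIN_CHARS` are made-up values (the real thresholds aren't stated here), and `classify` is a hypothetical stand-in for whatever call returns per-language confidence scores, not the actual uClassify API.

```python
from typing import Callable, Dict, Optional

# Assumed cutoffs for illustration only; the real values are not given in the post.
MIN_CONFIDENCE = 0.95   # "only very high scores were accepted"
MIN_CHARS = 100         # "short reviews weren't sent"

# Hypothetical classifier type: takes review text, returns {language: confidence}.
Classifier = Callable[[str], Dict[str, float]]


def review_language(
    text: str,
    classify: Classifier,
    min_confidence: float = MIN_CONFIDENCE,
    min_chars: int = MIN_CHARS,
) -> Optional[str]:
    """Return a language for the review, or None to leave it as "not set"."""
    # Short reviews are never sent to the classifier at all.
    if len(text.strip()) < min_chars:
        return None

    scores = classify(text)
    if not scores:
        return None

    # Take the top-scoring language, but accept it only if the score is very high.
    language, confidence = max(scores.items(), key=lambda pair: pair[1])
    return language if confidence >= min_confidence else None
```

Anything that comes back as `None` stays “not set” and can be fixed by hand later, which is why a conservative cutoff costs little.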

Feature changes. A bunch.

  • You can now edit a review’s language everywhere you can edit or enter a review.
  • Your library statistics page (link) now shows how many reviews you’ve written in every language. Most importantly, this shows the number of reviews that haven’t been assigned to a language.
  • For reviews going forward your default language is set on your account page.
  • The catalog now has a “Reviews language” field and a special search for all your reviews in a given language (e.g., reviews in English, language not set). These links are available from your stats page.
  • You can Power Edit review languages. When you’re looking at all your reviews in a language that differs from your default language, you will get a link to set all unset reviews to your default language. For example, here are all your unset reviews (link).

Statistics. The numbers turned out something like this.

English/Unset: 650,988
Dutch: 8,636
French: 4,666
German: 4,651
Spanish: 4,463
Italian: 2,876
Swedish: 2,329
Danish: 1,587
Norwegian: 1,231
Portuguese: 1,098
Finnish: 662
Catalan: 443
Etc.

To be done, talked about. As usual, there’s more to do. So far, there’s no good list of recent or top reviews by language. Come discuss it on Talk and suggest other improvements.

Labels: book reviews, catalan, french, german, greek, languages, new feature, new features

Sunday, December 16th, 2007

Fifteen new languages

The non-English LibraryThings are flourishing. Every day we move closer to the dream of a truly international community of book lovers, contributing to the community even when we don’t speak the same language.* Good sources have been critical. We’re going to release a flurry of Spanish ones on Monday, and hundreds more in many languages are coming soon. Equally important has been all the effort members have put into the translations. Participation has been really astounding: 202 members have made at least 20 edits each. A few languages have been shouldered by a single member (moriarty with Albanian, avitkauskas with Lithuanian), but most have been a group endeavor.

At least a dozen languages are ready for general use. It’s time to introduce some more!

By and large, the languages above correspond to languages we hope to support with one or more sources. In some cases, as with Armenian, we haven’t found a source yet, but we’re hopeful. In others, as with Korean, we haven’t yet figured out how to make our source work, but we haven’t exhausted our options. As always, we need help finding open Z39.50 connections.

PS: Don’t forget Basque. It’s still almost untranslated. We’ll be releasing a largely Basque-language library on Monday too.

*Notably, LibraryThing’s work system means that when it comes to a book that crosses boundaries, everyone counts. That is, if Albanian readers of Heinlein also enjoy Alfred Bester, that will count when it comes time to generate recommendations. Speaking of which, we have a site-wide re-think of recommendations going on. So, expect bumps.

Labels: languages, new feature, new features, new langauges