Friday, May 4th, 2007

Affinity percentiles and Altay

Altay (middle), John (sweatshirt), Tim (right), Abby (encased in her spherical “soul cage”)

We’re introducing an important new feature, but only just. The feature is called “affinity percentiles.” Basically, we show numbers next to other users’ names. These represent how “similar” your library is to theirs.

We’ve started it off on just one area of the site, the message pages in Talk (example). We plan to roll it out across the site, but not until we get a lot of feedback. I have a feeling some members will love it, but some won’t. This isn’t something we want to do lightly.

The number needs some explaining. (It may be too subtle, in which case we should fall back to a more straightforward “books shared.”) Basically, the higher the better. The person who shares the most books with you will have a 99%; the person who shares the least gets a 1%.

The percentage isn’t the number shared—65% does not mean a user shares 65% of their books; it means that the user shares more books than 65% of users. Two other factors come into play:

  • a member has to share five books to get an affinity percentile
  • “sharing” is weighted by book obscurity and library size. A user with 100 books who shares 20 obscure books with you ranks much higher than a user with 10,000 books who shares some very popular novels.
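For the curious, here is a minimal Python sketch of how a weighted percentile like this could be computed. It is not the code running on the site. The weighting formula (a logarithmic discount for popular books, a square-root discount for big libraries) and the names `affinity_score` and `affinity_percentiles` are assumptions made up for illustration; only the five-book minimum and the 1-to-99 range come from the description above.

```python
import math

MIN_SHARED = 5  # from the post: fewer than five shared books means no percentile


def affinity_score(shared_book_owner_counts, other_library_size):
    """Hypothetical weighting: obscure books count more, big libraries count less.

    shared_book_owner_counts: for each shared book, how many members own it.
    """
    if len(shared_book_owner_counts) < MIN_SHARED:
        return None  # does not qualify for a percentile
    # Assumed discount: a book contributes less the more members own it.
    obscurity = sum(1.0 / math.log2(2 + owners) for owners in shared_book_owner_counts)
    # Assumed discount: dampen by library size so sheer volume doesn't win.
    return obscurity / math.sqrt(other_library_size)


def affinity_percentiles(scores):
    """Rank qualifying users from 1 (least similar) to 99 (most similar)."""
    qualified = {user: s for user, s in scores.items() if s is not None}
    n = len(qualified)
    result = {}
    for user, score in qualified.items():
        if n == 1:
            result[user] = 99
            continue
        below = sum(1 for s in qualified.values() if s < score)
        result[user] = round(1 + 98 * below / (n - 1))
    return result


# The example from the post: a small library sharing obscure books
# versus a huge library sharing popular novels.
scores = {
    "small_obscure": affinity_score([10] * 20, 100),
    "huge_popular": affinity_score([20000] * 30, 10000),
    "too_few": affinity_score([10] * 3, 50),
}
print(affinity_percentiles(scores))
```

Under these assumed weights the 100-book member outranks the 10,000-book member, and the member with only three shared books gets no percentile at all.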

Other features:

  • If you hover over the percentile, you’ll get the number of shared books. We’ve thought of having it actually show the books.
  • The percentile box is colored in line with the number—the hotter the higher.
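As a rough illustration of “the hotter the higher,” a percentile can be mapped onto a color ramp. This is only a sketch: the two endpoint colors and the function name `percentile_color` are invented here, and the site’s actual palette may be quite different.

```python
def percentile_color(pct):
    """Blend from a cool blue (1) to a hot red (99); returns a CSS hex color."""
    pct = min(99, max(1, pct))   # clamp to the 1-99 range used by the feature
    t = (pct - 1) / 98           # 0.0 at the coldest, 1.0 at the hottest
    cold = (0x33, 0x66, 0xCC)    # assumed "cold" endpoint
    hot = (0xCC, 0x33, 0x00)     # assumed "hot" endpoint
    r, g, b = (round(c + (h - c) * t) for c, h in zip(cold, hot))
    return f"#{r:02x}{g:02x}{b:02x}"


print(percentile_color(1), percentile_color(50), percentile_color(99))
```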

Some questions:

  • Are the percentiles too hard to understand? Would shared numbers be better?
  • Is the weighting confusing?
  • What should happen when you hover over it? When you click on it?
  • Where should it go? Where shouldn’t it go?

How? I’ve wanted to do something like this for months. It’s a surprisingly difficult technical problem. You can’t calculate it on the fly every time; that would be insane. But caching the data gets big quickly. Imagine a “Battleship” grid of users, 190,000 by 190,000. If you stored a single byte for each connection (the number of shared books), it would amount to about 18 gigabytes of data (190,000 squared/2, or roughly 18 billion bytes). The solution I came up with involves efficient short-term caching, and ignoring members with fewer than five shared books. We’ve actually been running it on the Talk pages since last night, waiting to make it visible until we knew it wouldn’t melt our servers. (So far no melt!)
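To make the numbers concrete, here is a small Python sketch of the half-grid estimate (190,000 squared/2 single-byte cells, which works out to roughly 18 billion bytes) alongside one possible shape for a short-term cache that drops pairs with fewer than five shared books. The class `SharedCountCache`, its one-hour expiry, and its method names are hypothetical; this shows the general idea, not the caching scheme the site actually uses.

```python
import time

USERS = 190_000  # figure quoted in the post


def dense_grid_bytes(users, bytes_per_cell=1):
    """Bytes needed for half of a user-by-user grid at one byte per pair."""
    return users * users // 2 * bytes_per_cell


class SharedCountCache:
    """Hypothetical short-term cache keeping only pairs with enough shared books."""

    def __init__(self, ttl_seconds=3600, min_shared=5, clock=time.time):
        self.ttl = ttl_seconds        # assumed expiry; the post gives no figure
        self.min_shared = min_shared  # five, per the post
        self.clock = clock
        self._rows = {}               # user -> (expires_at, {other_user: shared_count})

    def put(self, user, counts):
        # Drop every connection below the threshold before storing the row.
        kept = {other: n for other, n in counts.items() if n >= self.min_shared}
        self._rows[user] = (self.clock() + self.ttl, kept)

    def get(self, user):
        entry = self._rows.get(user)
        if entry is None:
            return None
        expires_at, kept = entry
        if self.clock() >= expires_at:
            del self._rows[user]      # stale row: force a recalculation
            return None
        return kept


print(dense_grid_bytes(USERS))  # 18050000000
```

Storing one sparse row per active member, and letting it expire, means the full grid never has to exist at once.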

You’ll notice the numbers aren’t there when you first hit the page. They come in a second or two later. This is “Ajax” at work, and was done to prevent the new feature from slowing Talk down.

The real benefits will come when the feature is distributed across the site. I’m particularly interested in seeing affinity percentiles on reviews, and sorting by them. Ultimately, I don’t care what 300 people think about The Da Vinci Code. I want to know what Tim-ish people think of it.

Why? The crux of the idea is to highlight what makes LibraryThing’s social system work, so-called “social cataloging.” Vanilla social networking is structured around “friends.” That’s a powerful idea, but it has limits. It can be too “binary,” and the dynamics of “friending” a stranger are lost on many of us. At its best, social cataloging gets at something more nuanced. If I share 50 books about ancient history with you, there’s a degree, a nuance and a semantics to the connection that opens up a world of possibilities. Some are social and some aren’t. I might want to chat with you about the books we’ve read, or I might not. Either way, I benefit. The rest of your library is probably interesting to me. And your opinions have a claim on my attention that no anonymous guy on Amazon gets.

This post also introduces Altay Guvench (username: Altay), who did the JavaScript work behind affinity percentiles. This was actually a toss-off, but Altay was the force behind the much more amazing JavaScript in LibraryThing for Libraries. That stuff is a work of art: JavaScript inserting JavaScript. It might actually be self-aware! Altay will be working on the site generally, with a tilt toward things that JavaScript can improve, like the widgets.

Altay in a nutshell: Portland native. Harvard undergrad. Bassist for the alt-country band Great Unknowns (toured with the Indigo Girls! Reviewed ecstatically. Listen to a free song!). Co-founder of Y-Combinator-funded startup AudioBeta. One of only three members on LibraryThing with Optical holography : principles, techniques, and applications. Scheme hacker. Nerd, but a nerd who rocks out.

Labels: affinity percentiles, altay, features, soul cages