relevance, classification and folksonomy

I recently attended a Long Now seminar by Clay Shirky on classification. He started with the difficulty of digital preservation and then went on to explain why the prevalent categorization schemes don’t work in the long run. The recording of the seminar will probably be available at the Long Now site if you missed it. He posts his writings on his website as well.

The gist of his talk was that hierarchical classification as prevalently used today, in libraries, directories and so on, is the result of a core group of people modelling the usage behaviour of a much larger group of people, which in the long run goes wrong and miscategorizes things. The key to doing better classification is to have the larger group do the modelling itself, using degenerate linking between things, sort of like the tags on Flickr. In fact, he proposed tags / folksonomy and free-text search engines as providing better value. You will have to view or listen to his talk to understand the pitfalls of classification and the benefits of folksonomy. I’d probably do a bad job of reporting it here.

I agree with all he says. However, I’d like to add that the primary purpose behind classification, whether it’s hierarchical categories or folksonomy, is to be able to find and recall things easily later on. There are other benefits you can get, but those are beyond the scope of what I’m proposing. The desire or need to find things is driven by one key concept among many: the context you’re coming from, and the relevancy of what you’re trying to find to that context. So the question is, what do we know about relevancy? Have studies been done to come up with theories of what relevancy is, how we understand it, how it works, and how we can use it?

Google started on the right track with its PageRank algorithm, which ranks a page by the links pointing to it. That’s one kind of relevancy. Expanding on that, the relationships between things/resources are more important to relevancy than the resources themselves. What about the source of the resource, or the time it was created, or the context in which it was created? Context is harder to track, though. That’s another science in itself. In fact, if we start tracking the value and composition of relevancy, it will feed into understanding context. And vice versa.
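To make the "relationships matter more than the resources" point concrete, here is a rough sketch of the idea behind PageRank, written as a simple power iteration in Python. It is a simplified textbook version, not Google's actual implementation; the function name `pagerank`, the damping value of 0.85 and the tiny four-page link graph are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by power iteration (illustrative sketch only).

    `links` maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: a, b and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
most_relevant = max(ranks, key=ranks.get)
```

Nothing about the content of page c is used here. It comes out on top purely because of how the other pages relate to it, and a outranks b and d only because c points to it.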

This article begs to be updated in the near future.