
May 16, 2009



Elliot Turner

An interesting approach! Amazon Turk seems to be increasingly leveraged as a source of machine learning training data, linkages, etc.

Another worthwhile variant of this technique would be to combine Turk-based human annotators with an automated suggestion tool, along the lines of ODDLinker (used to interlink the LinkedMDB project with DBpedia).

Tools such as ODDLinker that leverage tuple data to generate potential linkages can alleviate much of the human legwork for "obvious linkages", leaving manual disambiguation/lookups for the more ambiguous entries. Combining these sorts of tools with workflow systems such as Amazon Turk has the potential to bring the Crunchbase annotation/linking costs down significantly.
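
To make the split between "obvious" and ambiguous entries concrete, here is a minimal sketch of that kind of triage. It is not how ODDLinker works: plain string similarity from Python's standard library stands in for a real matcher, and the names `similarity`, `triage`, and `auto_threshold` are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (a stand-in for a real matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def triage(names, candidates, auto_threshold=0.9):
    """Auto-link names with a near-exact candidate; queue the rest for human annotators.

    `auto_threshold` is an assumed cutoff, not a recommended value.
    """
    auto_links = {}
    needs_human = []
    for name in names:
        # Pick the candidate title that looks most like this name.
        best = max(candidates, key=lambda c: similarity(name, c))
        if similarity(name, best) >= auto_threshold:
            auto_links[name] = best      # "obvious linkage": no human needed
        else:
            needs_human.append(name)     # ambiguous: send to a human annotator
    return auto_links, needs_human


auto, human = triage(
    ["Twitter", "Apple"],
    ["Twitter", "Apple Inc.", "Apple (fruit)"],
)
```

In this toy run "Twitter" is linked automatically, while "Apple" has no near-exact candidate and is left for a human, so only the second entry would cost an annotation task.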

Maria Grineva

Thanks Elliot! It is a reasonable suggestion to handle the more obvious cases automatically in order to bring costs down.

Can you give me a link to the ODDLinker tool?

Enrique Gomez

I wonder if your idea of using human annotation could be applied to Twitter data through Twitter itself. Consider that a human could add content to a Twitter post, especially when referring to a linked article, through the use of a new kind of Twitter tag. The content added would raise or lower the relevance according to the user. You could then 'sift' through the Twitter data and extract a metric of relevance based on the new tag. The Turk humans are rated for effectiveness and are therefore more reliable, but in the Twitter scenario you would leverage the larger mass of annotators. Hopefully some good data would bubble out of the noise.

I'm still thinking about what the tags would look like and a simple syntax for them. Something like the RT tag, say ReleVance. One could RV + or RV - if you find the whole post relevant or not. Maybe RV +term if you want to add a term you think applies to the post, RV -term to remove, etc.
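
As a rough sketch of how such tags might be read out of a post, assuming exactly the syntax floated above (`RV +`, `RV -`, `RV +term`, `RV -term`) and nothing more; the function name `parse_rv_tags` and the regular expression are made up for illustration:

```python
import re
from collections import Counter

# Hypothetical syntax: "RV", whitespace, a sign, then an optional term
# attached directly to the sign (no space between sign and term).
RV_TAG = re.compile(r"\bRV\s+([+-])(\w*)")


def parse_rv_tags(post):
    """Return (whole_post_votes, term_votes) for the RV tags found in one post."""
    post_votes = 0
    term_votes = Counter()
    for sign, term in RV_TAG.findall(post):
        delta = 1 if sign == "+" else -1
        if term:
            term_votes[term.lower()] += delta   # RV +term / RV -term
        else:
            post_votes += delta                 # bare RV + / RV -
    return post_votes, term_votes


votes, terms = parse_rv_tags(
    "Nice read http://example.com RV + RV +wikipedia RV -spam"
)
```

Summing `votes` and `terms` over many posts that point at the same link would give the kind of crowd relevance metric described above.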

Maybe I just need more coffee...

Maria Grineva

I didn't understand what you mean by "relevant". Do you mean that if I post something to Twitter, I would rate how relevant the posted link is to me?



Bookkeeper Caloundra

Wikipedia is one of the most important websites. Here, you will find a wealth of information. Wikipedia has contributed so much to the information- and knowledge-based world.

