Entity grouping for accessing social streams via word clouds

Martin Leginus*, Leon Derczynski, Peter Dolog

*Corresponding author

Publication: Contribution to book/anthology/report/conference proceeding; Conference article in proceeding; Research; Peer-reviewed

Abstract

Word clouds have proven to be an effective tool for information access in different domains. As social media is a main driver of the large increase in available user-generated content, means for accessing information in such content are needed. We study word clouds as a means for information access in social media. Currently used clouds generated from social media data include redundant and misranked entries, harming their utility. We propose a method for generating improved word clouds over social streams. In this method, named entities are detected, disambiguated and aggregated into clusters, which in turn inform cloud construction. We show that word clouds using named entity clusters attain broader coverage and decreased content duplication. Further, an extrinsic evaluation shows improved access to data, with word clouds having grouped named entities being rated more relevant and diverse. Additionally, we find that word clouds with higher Mean Average Precision (MAP) tend to be more relevant to underlying concepts. Critically, this supports MAP as a tool for predicting cloud quality without needing a human.

Original language: English
Title: Web Information Systems and Technologies : 11th International Conference, WEBIST 2015, Lisbon, Portugal, May 20–22, 2015, Revised Selected Papers
Number of pages: 22
Publisher: Springer
Publication date: 2016
Pages: 3-24
ISBN (Print): 978-3-319-30995-8
ISBN (Electronic): 978-3-319-30996-5
Status: Published - 2016
Event: 11th International Conference on Web Information Systems and Technologies, WEBIST 2015 - Lisbon, Portugal
Duration: 20 May 2015 to 22 May 2015

Conference

Conference: 11th International Conference on Web Information Systems and Technologies, WEBIST 2015
Country/Territory: Portugal
City: Lisbon
Period: 20/05/2015 to 22/05/2015
Sponsor: European Research Center for Information Systems (ERCIS)
Series name: Lecture Notes in Business Information Processing
Volume: 246
ISSN: 1865-1348
