TY - GEN
T1 - Entity grouping for accessing social streams via word clouds
AU - Leginus, Martin
AU - Derczynski, Leon
AU - Dolog, Peter
PY - 2016
Y1 - 2016
N2 - Word clouds have been proven as an effective tool for information access in different domains. As social media is a main driver of large increase in available user generated content, means for accessing information in such content are needed. We study word clouds as a means for information access in social media. Currently-used clouds that are generated from social media data include redundant and misranked entries, harming their utility. We propose a method for generating improved word clouds over social streams. In this method, named entities are detected, disambiguated and aggregated into clusters, which in turn inform cloud construction. We show that word clouds using named entity clusters attain broader coverage and decreased content duplication. Further, an extrinsic evaluation shows improved access to data, with word clouds having grouped named entities being rated more relevant and diverse. Additionally we find word clouds with higher Mean Average Precision (MAP) tend to be more relevant to underlying concepts. Critically, this supports MAP as a tool for predicting cloud quality without needing a human.
AB - Word clouds have been proven as an effective tool for information access in different domains. As social media is a main driver of large increase in available user generated content, means for accessing information in such content are needed. We study word clouds as a means for information access in social media. Currently-used clouds that are generated from social media data include redundant and misranked entries, harming their utility. We propose a method for generating improved word clouds over social streams. In this method, named entities are detected, disambiguated and aggregated into clusters, which in turn inform cloud construction. We show that word clouds using named entity clusters attain broader coverage and decreased content duplication. Further, an extrinsic evaluation shows improved access to data, with word clouds having grouped named entities being rated more relevant and diverse. Additionally we find word clouds with higher Mean Average Precision (MAP) tend to be more relevant to underlying concepts. Critically, this supports MAP as a tool for predicting cloud quality without needing a human.
KW - Recognized named entities
KW - Social media
KW - Social stream access
KW - User evaluation
KW - Word clouds
UR - http://www.scopus.com/inward/record.url?scp=84961762803&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-30996-5_1
DO - 10.1007/978-3-319-30996-5_1
M3 - Article in proceeding
AN - SCOPUS:84961762803
SN - 978-3-319-30995-8
T3 - Lecture Notes in Business Information Processing
SP - 3
EP - 24
BT - Web Information Systems and Technologies
PB - Springer
T2 - 11th International Conference on Web Information Systems and Technologies, WEBIST 2015
Y2 - 20 May 2015 through 22 May 2015
ER -