Entity grouping for accessing social streams via word clouds

Martin Leginus*, Leon Derczynski, Peter Dolog

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceeding; Article in proceeding; Research; peer-review

Abstract

Word clouds have proven to be an effective tool for information access in different domains. As social media is a main driver of the large increase in available user-generated content, means for accessing information in such content are needed. We study word clouds as a means for information access in social media. Currently used clouds generated from social media data include redundant and misranked entries, harming their utility. We propose a method for generating improved word clouds over social streams. In this method, named entities are detected, disambiguated and aggregated into clusters, which in turn inform cloud construction. We show that word clouds using named entity clusters attain broader coverage and decreased content duplication. Further, an extrinsic evaluation shows improved access to data, with word clouds that group named entities being rated more relevant and diverse. Additionally, we find that word clouds with higher Mean Average Precision (MAP) tend to be more relevant to underlying concepts. Critically, this supports MAP as a tool for predicting cloud quality without needing human judgment.

Original language: English
Title of host publication: Web Information Systems and Technologies: 11th International Conference, WEBIST 2015, Lisbon, Portugal, May 20–22, 2015, Revised Selected Papers
Number of pages: 22
Publisher: Springer
Publication date: 2016
Pages: 3-24
ISBN (Print): 978-3-319-30995-8
ISBN (Electronic): 978-3-319-30996-5
Publication status: Published - 2016
Event: 11th International Conference on Web Information Systems and Technologies, WEBIST 2015 - Lisbon, Portugal
Duration: 20 May 2015 to 22 May 2015

Conference

Conference: 11th International Conference on Web Information Systems and Technologies, WEBIST 2015
Country/Territory: Portugal
City: Lisbon
Period: 20/05/2015 to 22/05/2015
Sponsor: European Research Center for Information System (ERCIS)
Series: Lecture Notes in Business Information Processing
Volume: 246
ISSN: 1865-1348

Keywords

  • Recognized named entities
  • Social media
  • Social stream access
  • User evaluation
  • Word clouds
