Location Inference for Non-geotagged Tweets in User Timelines

Pengfei Li, Hua Lu, Nattiya Kanhabua, Sha Zhao, Gang Pan

Research output: Contribution to journal › Journal article › Research › peer-review

27 Citations (Scopus)
605 Downloads (Pure)

Abstract

Social media platforms such as Twitter have become globally popular over the past decade. Thanks to the high penetration of smartphones, social media users are increasingly mobile. This trend has fostered various location-based services deployed on social media, whose success depends heavily on the availability and accuracy of users' location information. However, only a very small fraction of tweets on Twitter are geotagged. It is therefore necessary to infer locations for tweets in order for those location-based services to serve their purpose. In this paper, we tackle this problem by scrutinizing Twitter user timelines in a novel fashion. First, we split each user's tweet timeline temporally into a number of clusters, each tending to imply a distinct location. Subsequently, we adapt two machine learning models to our setting and design classifiers that assign each tweet cluster to one of the predefined city-level location classes. The Bayes-based model focuses on the information gain of words with location implications in user-generated content. The convolutional LSTM model treats user-generated content and its associated locations as sequences and employs a bidirectional LSTM and convolution operations to make location inferences. The two models are evaluated on a large set of real Twitter data. The experimental results suggest that our models are effective at inferring locations for non-geotagged tweets and that they significantly outperform state-of-the-art and alternative approaches in terms of inference accuracy.
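
The abstract does not spell out how timelines are split or how word-level information gain is computed, so the following is only an illustrative sketch of the two ideas, not the authors' implementation. It assumes a simple rule in which a new cluster starts whenever the time gap between consecutive tweets exceeds a threshold, and it uses the textbook definition of information gain of a word with respect to city labels. The function names (`split_timeline`, `information_gain`), the 6-hour default gap, and the `(set_of_words, city)` data shape are all hypothetical.

```python
from collections import Counter
from datetime import timedelta
from math import log2


def split_timeline(timestamps, max_gap=timedelta(hours=6)):
    """Group tweet timestamps into temporal clusters.

    Assumed rule (not taken from the paper): a new cluster starts
    whenever the gap to the previous tweet exceeds max_gap.
    """
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)
        else:
            clusters.append([ts])
    return clusters


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(word, docs):
    """Information gain of `word` about the city label.

    docs is a list of (set_of_words, city) pairs, one per tweet cluster.
    """
    with_word = [city for words, city in docs if word in words]
    without_word = [city for words, city in docs if word not in words]
    all_labels = [city for _, city in docs]
    # Weighted entropy of the labels after splitting on word presence.
    remainder = sum(
        len(part) / len(docs) * entropy(part)
        for part in (with_word, without_word)
        if part
    )
    return entropy(all_labels) - remainder
```

In a pipeline along these lines, `split_timeline` would be applied per user, and words could then be ranked by `information_gain` over the geotagged clusters so that a Bayes-style classifier concentrates on location-indicative vocabulary.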

Original language: English
Article number: 8403245
Journal: IEEE Transactions on Knowledge & Data Engineering
Volume: 31
Issue number: 6
Pages (from-to): 1150-1165
Number of pages: 16
ISSN: 1041-4347
Publication status: Published - 2019

Keywords

  • Adaptation models
  • Bayes
  • Feature extraction
  • Hidden Markov models
  • LSTM
  • Location Inference
  • Location awareness
  • Twitter
  • Urban areas
