TY - JOUR
T1 - HisRect
T2 - Features from Historical Visits and Recent Tweet for Co-Location Judgement
AU - Li, Pengfei
AU - Lu, Hua
AU - Zheng, Qian
AU - Li, Shijian
AU - Pan, Gang
PY - 2021
Y1 - 2021
N2 - Enabled by smartphones, social media users are increasingly going mobile. This trend fosters various location-based services on social media platforms (e.g., Twitter). Many services, such as friend notification and community detection, benefit from co-location judgement, i.e., deciding whether two Twitter users are co-located at some point of interest (POI). This problem is challenging because tweets carry limited information and lack explicit geo-tags that can be used as labeled data. Our approach to this problem is based on a novel concept of HisRect features extracted from users' historical visits and recent tweets: the former influence where a user visits in general, whereas the latter give more hints about where a user currently is. In practice, labeled data is scarce. Therefore, we design a semi-supervised learning (SSL) framework that leverages unlabeled data to extract HisRect features. Moreover, we employ an embedding neural network layer to process the HisRect features of two users, which decides co-location based on the embedding difference between the two features. Our model is extensively evaluated on two large sets of real Twitter data from more than one million users. The experimental results demonstrate that our HisRect features and SSL framework are highly effective at deciding co-locations. In terms of multiple metrics, our approach clearly outperforms alternative approaches using state-of-the-art techniques.
AB - Enabled by smartphones, social media users are increasingly going mobile. This trend fosters various location-based services on social media platforms (e.g., Twitter). Many services, such as friend notification and community detection, benefit from co-location judgement, i.e., deciding whether two Twitter users are co-located at some point of interest (POI). This problem is challenging because tweets carry limited information and lack explicit geo-tags that can be used as labeled data. Our approach to this problem is based on a novel concept of HisRect features extracted from users' historical visits and recent tweets: the former influence where a user visits in general, whereas the latter give more hints about where a user currently is. In practice, labeled data is scarce. Therefore, we design a semi-supervised learning (SSL) framework that leverages unlabeled data to extract HisRect features. Moreover, we employ an embedding neural network layer to process the HisRect features of two users, which decides co-location based on the embedding difference between the two features. Our model is extensively evaluated on two large sets of real Twitter data from more than one million users. The experimental results demonstrate that our HisRect features and SSL framework are highly effective at deciding co-locations. In terms of multiple metrics, our approach clearly outperforms alternative approaches using state-of-the-art techniques.
KW - POI
KW - Twitter
KW - co-location judgement
KW - semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85100586883&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2019.2934686
DO - 10.1109/TKDE.2019.2934686
M3 - Journal article
SN - 1041-4347
VL - 33
SP - 1005
EP - 1018
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 3
M1 - 8798877
ER -