Efficient and Incremental Clustering Algorithms on Star-Schema Heterogeneous Graphs

Lu Chen, Yunjun Gao, Yuanliang Zhang, Christian S. Jensen, Bolong Zheng

Publication: Contribution to book/anthology/report/conference proceeding > Conference article in proceeding > Research > Peer-reviewed

1 Citation (Scopus)

Abstract

Many datasets, including social media data and bibliographic data, can be modeled as graphs. Clustering such graphs can provide useful insights into the structure of the data. To improve the quality of clustering, node attributes can be taken into account, resulting in attributed graphs. Existing attributed graph clustering methods generally consider attribute similarity and structural similarity separately. In this paper, we represent attributed graphs as star-schema heterogeneous graphs, where attributes are modeled as different types of graph nodes. This enables the use of personalized PageRank (PPR) as a unified distance measure that captures both structural and attribute similarity. We employ DBSCAN for clustering, and we update edge weights iteratively to balance the importance of different attributes. To improve the efficiency of the clustering, we develop two incremental approaches that aim to enable efficient PPR score computation when edge weights are updated. To boost the effectiveness of the clustering, we propose a simple yet effective edge weight update strategy based on entropy. In addition, we present a game-theory-based method that enables trading efficiency for result quality. Extensive experiments on real-life datasets offer insight into the effectiveness and efficiency of our proposals, compared with existing methods.
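
The central idea of the abstract, turning attributes into graph nodes so that one PPR computation reflects both structure and attributes, can be illustrated with a small sketch. This is not the authors' implementation: the function names `build_star_schema` and `personalized_pagerank`, the toy data, the restart probability of 0.15, and the plain power-iteration solver are all assumptions chosen for illustration, and the paper's incremental PPR methods, entropy-based weight update, and DBSCAN step are not reproduced here.

```python
from collections import defaultdict

def build_star_schema(edges, attributes, attr_weights=None):
    """Build a weighted adjacency map with one extra node per (attribute, value).

    attr_weights (hypothetical parameter) maps an attribute name to the weight
    of every edge between a structure node and a value node of that attribute.
    """
    attr_weights = attr_weights or {}
    adj = defaultdict(dict)
    for u, v in edges:  # structural edges, undirected, weight 1
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    for node, attrs in attributes.items():
        for name, value in attrs.items():
            attr_node = ("attr", name, value)  # attribute value as a graph node
            w = attr_weights.get(name, 1.0)
            adj[node][attr_node] = w
            adj[attr_node][node] = w
    return adj

def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    """Approximate PPR from `source` by power iteration (assumed solver)."""
    scores = {n: 0.0 for n in adj}
    scores[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        nxt[source] = alpha  # restart mass returns to the source
        for u, nbrs in adj.items():
            total = sum(nbrs.values())
            for v, w in nbrs.items():
                # the walk follows an edge with probability proportional to its weight
                nxt[v] += (1.0 - alpha) * scores[u] * w / total
        scores = nxt
    return scores

# Toy data (made up): "a" and "d" share no structural path,
# but both have topic "db", so they meet through the attribute node.
EDGES = [("a", "b"), ("c", "d")]
ATTRS = {
    "a": {"topic": "db"},
    "b": {"topic": "db"},
    "c": {"topic": "ml"},
    "d": {"topic": "db"},
}

if __name__ == "__main__":
    ppr = personalized_pagerank(build_star_schema(EDGES, ATTRS), "a")
    for node in ("b", "c", "d"):
        print(node, round(ppr[node], 4))
```

In this toy graph, node `d` receives a nonzero score from `a` purely through the shared attribute node, and raising the weight for `topic` (for example `attr_weights={"topic": 5.0}`) shifts more of the walk through attribute nodes. A clustering step such as DBSCAN could then consume a distance derived from these scores, though how the paper defines that distance is not stated in the abstract.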
Original language: English
Title: Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
Number of pages: 12
Publisher: IEEE
Publication date: 2019
Pages: 256-267
Article number: 8731611
ISBN (Print): 978-1-5386-7475-8
ISBN (Electronic): 978-1-5386-7474-1
DOI: https://doi.org/10.1109/ICDE.2019.00031
Status: Published - 2019
Event: The 35th IEEE International Conference on Data Engineering (ICDE) - Macau, Macau, China
Duration: 8 Apr 2019 to 12 Apr 2019

Conference

Conference: The 35th IEEE International Conference on Data Engineering (ICDE)
Location: Macau
Country: China
City: Macau
Period: 08/04/2019 to 12/04/2019
Name: Proceedings of the International Conference on Data Engineering
ISSN: 1063-6382


  • Citation formats

    Chen, L., Gao, Y., Zhang, Y., Jensen, C. S., & Zheng, B. (2019). Efficient and Incremental Clustering Algorithms on Star-Schema Heterogeneous Graphs. In Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019 (pp. 256-267). [8731611] IEEE. Proceedings of the International Conference on Data Engineering https://doi.org/10.1109/ICDE.2019.00031