ADGNN: Towards Scalable GNN Training with Aggregation-Difference Aware Sampling

Zhen Song, Yu Gu, Tianyi Li, Qing Sun, Yanfeng Zhang, Christian S. Jensen, Ge Yu

Research output: Contribution to journal › Conference article in journal › Research › peer-review

Abstract

Distributed computing is a promising approach to enabling large-scale graph neural network (GNN) model training. However, care is needed to avoid excessive computational and communication overheads. Sampling is a promising means of achieving scalability, and a variety of sampling techniques have been proposed to reduce training costs. However, online sampling introduces large overheads, and while offline sampling, which is performed only once, eliminates such overheads, it instead introduces information loss and accuracy degradation. Thus, existing sampling techniques are unable to simultaneously improve both efficiency and accuracy, particularly at low sampling rates. We develop a distributed system, ADGNN, for full-batch-based GNN training that adopts a hybrid sampling architecture to enable a trade-off between efficiency and accuracy. Specifically, ADGNN employs sampling result reuse techniques to reduce the cost associated with sampling and thus improve training efficiency. To alleviate accuracy degradation, we introduce a new metric, Aggregation Difference (AD), that quantifies the gap between aggregation over a sampled neighbor set and aggregation over the full neighbor set. We present so-called AD-Sampling, which aims to minimize the Aggregation Difference by means of an adaptive sampling frequency tuner. Finally, ADGNN employs an AD-importance-based sampling technique for remote neighbors to further reduce communication costs. Experiments on five real datasets show that ADGNN outperforms the state-of-the-art by up to nearly 9 times in terms of efficiency, while achieving accuracy comparable to that of non-sampling methods.
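
The abstract does not give the formal definition of the Aggregation Difference metric, so the following Python sketch illustrates one plausible reading only: it assumes mean aggregation over neighbor features and the L2 norm as the gap measure, and the function name aggregation_difference and all inputs are hypothetical, not taken from the paper.

import numpy as np

def aggregation_difference(features, full_neighbors, sampled_neighbors):
    # Hypothetical sketch: the paper's formal AD definition is not stated
    # in the abstract. Here we assume mean aggregation and measure the gap
    # as the L2 distance between the full and sampled aggregations.
    full_agg = features[full_neighbors].mean(axis=0)
    sampled_agg = features[sampled_neighbors].mean(axis=0)
    return np.linalg.norm(full_agg - sampled_agg)

# Toy usage: a node with four neighbors, two of which are sampled.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))  # 5 nodes, 8-dimensional features
full = [1, 2, 3, 4]
sampled = [1, 3]
print(aggregation_difference(features, full, sampled))

Under this reading, a smaller AD means the sampled neighbor set preserves more of the information that full-neighbor aggregation would provide, which is consistent with the abstract's statement that AD-Sampling aims to keep this gap small.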
Original language: English
Article number: 229
Journal: Proceedings of the ACM on Management of Data
Volume: 1
Issue number: 4
Pages (from-to): 1-26
ISSN: 2836-6573
Publication status: Published - 2023
