TY - GEN
T1 - BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule
T2 - IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
AU - Zhang, Miao
AU - Pan, Shirui
AU - Chang, Xiaojun
AU - Su, Steven
AU - Hu, Jilin
AU - Haffari, Gholamreza
AU - Yang, Bin
PY - 2022
Y1 - 2022
N2 - Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation. However, more recent works find that existing differentiable NAS techniques struggle to outperform naive baselines, yielding deteriorating architectures as the search proceeds. Rather than directly optimizing the architecture parameters, this paper formulates neural architecture search as a distribution learning problem by relaxing the architecture weights into Gaussian distributions. By leveraging natural-gradient variational inference (NGVI), the architecture distribution can be easily optimized based on existing codebases without incurring additional memory or computational cost. We demonstrate how differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability. The experimental results on NAS benchmark datasets confirm the significant improvements the proposed framework can make. In addition, instead of simply applying the argmax on the learned parameters, we further leverage the recently proposed training-free proxies in NAS to select the optimal architecture from a group of architectures drawn from the optimized distribution, where we achieve state-of-the-art results on the NAS-Bench-201 and NAS-Bench-1shot1 benchmarks. Our best architecture in the DARTS search space also obtains competitive test errors of 2.37%, 15.72%, and 24.2% on CIFAR-10, CIFAR-100, and ImageNet, respectively.
KW - Deep learning architectures and techniques
KW - Optimization methods
UR - http://www.scopus.com/inward/record.url?scp=85140201312&partnerID=8YFLogxK
U2 - 10.1109/CVPR52688.2022.01157
DO - 10.1109/CVPR52688.2022.01157
M3 - Article in proceedings
SN - 978-1-6654-6947-0
T3 - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
SP - 11861
EP - 11870
BT - Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
PB - IEEE
Y2 - 18 June 2022 through 24 June 2022
ER -