@article{5d8328c754a3435fae5544278142c6e2,
title = "Codabench: Flexible, easy-to-use, and reproducible meta-benchmark platform",
abstract = "Obtaining a standardized benchmark of computational methods is a major issue in data-science communities. Dedicated frameworks enabling fair benchmarking in a unified environment are yet to be developed. Here, we introduce Codabench, a meta-benchmark platform that is open sourced and community driven for benchmarking algorithms or software agents versus datasets or tasks. A public instance of Codabench is open to everyone free of charge and allows benchmark organizers to fairly compare submissions under the same setting (software, hardware, data, algorithms), with custom protocols and data formats. Codabench has unique features facilitating easy organization of flexible and reproducible benchmarks, such as the possibility of reusing templates of benchmarks and supplying compute resources on demand. Codabench has been used internally and externally on various applications, receiving more than 130 users and 2,500 submissions. As illustrative use cases, we introduce four diverse benchmarks covering graph machine learning, cancer heterogeneity, clinical diagnosis, and reinforcement learning.",
keywords = "benchmark platform, competitions, data science, DSML3: Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems, machine learning, reproducibility",
author = "Zhen Xu and Sergio Escalera and Adrien Pav{\~a}o and Magali Richard and Tu, {Wei Wei} and Quanming Yao and Huan Zhao and Isabelle Guyon",
note = "Funding Information: The Codabench project shares the same community governance as CodaLab Competitions. The openness of Codabench is total: an Apache 2.0 license is used, the source code is on GitHub, and the development framework and all the used components are open sourced. Codabench has received important contributions from many people who did not co-author this paper, and we would like to thank their efforts in making Codabench what it is today, including early CodaLab Competitions developers and contributors (listed alphabetically): Pujun Bhatnagar, Justin Carden, Richard Caruana, Francis Cleary, Xiawei Guo, Ivan Judson, Lori Ada Kilty, Shaunak Kishore, Stephen Koo, Percy Liang, Zhengying Liu, Pragnya Maduskar, Simon Mercer, Arthur Pesah, Christophe Poulain, Lukasz Romaszko, Laurent Senta, Lisheng Sun, Sebastien Treguer, Cedric Vachaudez, Evelyne Viegas, Paul Viola, Erick Watson, Tony Yang, Flavio Zhingri, and Michael Zyskowski. We would like to particularly thank the people who contributed to the design, development, and testing of Codabench including (listed alphabetically) Alexis Arnaud, Xavier Bar{\'o}, Feng Bin, Yuna Blum, Eric Carmichael, Laurent Darr{\'e}, Hugo Jair Escalante, Sergio Escalera, Eric Frichot, Yuxuan He, James Keith, Anne-Catherine Letournel, Shouxiang Liu, Zhenwu Liu, Adrien Pavao, Magali Richard, Tyler Thomas, Nic Threfts, Bailey Trefts, Catherine Wallez, and Lanning Wei. The Universit{\'e} Paris-Saclay is hosting the main instance of Codabench. 
Funding and support have been received via several research grants, including Big Data Chair of Excellence FDS Paris-Saclay, Paris R{\'e}gion Ile-de-France, EU EIT projects HADACA and COMETH, United Health Foundation INCITE project, ANR Chair of Artificial Intelligence HUMANIA ANR-19-CHIA-0022, the Spanish project PID2019-105093GB-I00, ICREA under the ICREA Academia program, INSERM Cancer project ACACIA 232717, MIAI @Grenoble Alpes (ANR-19-P3IA-0003), 4Paradigm, ChaLearn, Microsoft, and Google. We also appreciate the following people and institutes for their open-source datasets that are used in our use cases: Andrew McCallum, C. Lee Giles, Ken Lang, Tom Mitchell, William L. Hamilton, Maximilian Mumme, Oleksandr Shchur, David D. Lewis, William Hersh, Just Research and Carnegie Mellon University, NEC Research Institute, Carnegie Mellon University, Stanford University, Technical University of Munich, AT&T Labs, and Oregon Health Sciences University. We are also very grateful to Joaquin Vanschoren for fruitful discussions. Conceptualization, Z.X. S.E. A.P. and I.G.; methodology, Z.X. and I.G.; validation and investigation, all authors; resources and data curation, Z.X. M.R. W.-W.T. and I.G.; writing – original draft, all authors; writing – review & editing, Z.X. Q.Y. M.R. and I.G.; visualization, Z.X. Q.Y. and I.G.; supervision and project administration, I.G.; funding acquisition, W.-W.T. and I.G. Z.X. W.-W.T. and H.Z. are employed by 4Paradigm, China. I.G. is president of ChaLearn, a not-for-profit organization dedicated to running challenges in machine learning. ChaLearn is a tax-exempt not-for-profit organization under section 501(c)(3) of the US IRS code of the United States. It derived no profit from sponsoring this research.
Publisher Copyright: {\textcopyright} 2022 The Authors.",
year = "2022",
month = jul,
day = "8",
doi = "10.1016/j.patter.2022.100543",
language = "English",
volume = "3",
journal = "Patterns",
issn = "2666-3899",
publisher = "Cell Press",
number = "7",
}