Beyond AUROC & co. for evaluating out-of-distribution detection performance

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

Abstract

While there has been growing research interest in developing out-of-distribution (OOD) detection methods, there has been comparatively little discussion of how these methods should be evaluated. Given their relevance for safe(r) AI, it is important to examine whether the basis for comparing OOD detection methods is consistent with practical needs. In this work, we take a closer look at the go-to metrics for evaluating OOD detection, and question the approach of exclusively reducing OOD detection to a binary classification task with little consideration for the detection threshold. We illustrate the limitations of current metrics (AUROC & its friends) and propose a new metric, the Area Under the Threshold Curve (AUTC), which explicitly penalizes poor separation between in-distribution (ID) and OOD samples. Scripts and data are available at https://github.com/glhr/beyond-auroc
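
The point about threshold-insensitivity can be illustrated with a minimal sketch. AUROC depends only on how ID and OOD scores are ranked relative to each other, so two detectors can reach the same AUROC while differing greatly in how far apart their scores lie. The example below rests on assumptions that are not taken from the paper: scores are normalized to [0, 1], higher scores mean "more likely OOD", and the toy score lists are invented. The helper `mean_error_over_thresholds` is a hypothetical threshold-sweep summary used only for illustration; it is not claimed to be the paper's AUTC definition, for which the linked repository is the reference.

```python
def auroc(id_scores, ood_scores):
    """AUROC as P(OOD score > ID score), with ties counted as 0.5.
    Assumption: a higher score means 'more likely OOD'."""
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def error_rates(id_scores, ood_scores, threshold):
    """FPR and FNR when samples scoring >= threshold are flagged as OOD."""
    fpr = sum(s >= threshold for s in id_scores) / len(id_scores)   # ID flagged as OOD
    fnr = sum(s < threshold for s in ood_scores) / len(ood_scores)  # OOD not flagged
    return fpr, fnr


def mean_error_over_thresholds(id_scores, ood_scores, steps=1000):
    """Hypothetical summary (illustration only): average of (FPR + FNR) / 2
    over evenly spaced thresholds in [0, 1]."""
    total = 0.0
    for k in range(steps + 1):
        fpr, fnr = error_rates(id_scores, ood_scores, k / steps)
        total += (fpr + fnr) / 2
    return total / (steps + 1)


# Two invented detectors: same ranking quality, very different separation.
wide_id, wide_ood = [0.05, 0.10, 0.15], [0.85, 0.90, 0.95]
narrow_id, narrow_ood = [0.48, 0.49, 0.50], [0.51, 0.52, 0.53]

auroc_wide = auroc(wide_id, wide_ood)
auroc_narrow = auroc(narrow_id, narrow_ood)
err_wide = mean_error_over_thresholds(wide_id, wide_ood)
err_narrow = mean_error_over_thresholds(narrow_id, narrow_ood)

print("AUROC:", auroc_wide, auroc_narrow)
print("Mean error over thresholds:", err_wide, err_narrow)
```

Both toy detectors rank every OOD sample above every ID sample, so AUROC cannot tell them apart. The threshold sweep does, because the narrowly separated detector is wrong at most threshold choices.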
Original language: English
Title of host publication: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Number of pages: 10
Publisher: IEEE (Institute of Electrical and Electronics Engineers)
Publication date: Aug 2023
Pages: 3881-3890
ISBN (Electronic): 979-8-3503-0249-3
Publication status: Published - Aug 2023
Event: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023 - Vancouver, Canada
Duration: 18 Jun 2023 to 22 Jun 2023

Conference

Conference: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023
Country/Territory: Canada
City: Vancouver
Period: 18/06/2023 to 22/06/2023
Series: IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
ISSN: 2160-7516
