Self-supervised masked convolutional transformer block for anomaly detection

Neelu Madan; Nicolae Catalin Ristea; Radu Tudor Ionescu; Kamal Nasrollahi; Fahad Shahbaz Khan; Thomas B. Moeslund; Mubarak Shah

doi:10.1109/TPAMI.2023.3322604

Publication: Contribution to journal › Journal article › Research › peer-review

3 Citations (Scopus)

Abstract

Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial production lines and impending event detection in video surveillance to finding lesions in medical scans. Regardless of the domain, anomaly detection is typically framed as a one-class classification task, where the learning is conducted on normal examples only. An entire family of successful anomaly detection methods is based on learning to reconstruct masked normal inputs (e.g. patches, future frames, etc.) and exerting the magnitude of the reconstruction error as an indicator for the abnormality level. Unlike other reconstruction-based methods, we present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level. The proposed self-supervised block is extremely flexible, enabling information masking at any layer of a neural network and being compatible with a wide range of neural architectures. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss. Furthermore, we show that our block is applicable to a wider variety of tasks, adding anomaly detection in medical images and thermal videos to the previously considered tasks based on RGB images and surveillance videos. We exhibit the generality and flexibility of SSMCTB by integrating it into multiple state-of-the-art neural models for anomaly detection, bringing forth empirical results that confirm considerable performance improvements on five benchmarks: MVTec AD, BRATS, Avenue, ShanghaiTech, and Thermal Rare Event.

Original language	English
Journal	IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume	46
Issue number	1
Pages (from-to)	525-542
Number of pages	18
ISSN	0162-8828
DOI	https://doi.org/10.1109/TPAMI.2023.3322604
Publication status	Published - 1 Jan 2024

Access to document

10.1109/TPAMI.2023.3322604

https://arxiv.org/pdf/2209.12148.pdf

Citation formats

@article{d7a0fca3f8304c9aafc99f8f85c1376f,

title = "Self-supervised masked convolutional transformer block for anomaly detection",

abstract = "Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial production lines and impending event detection in video surveillance to finding lesions in medical scans. Regardless of the domain, anomaly detection is typically framed as a one-class classification task, where the learning is conducted on normal examples only. An entire family of successful anomaly detection methods is based on learning to reconstruct masked normal inputs (e.g. patches, future frames, etc.) and exerting the magnitude of the reconstruction error as an indicator for the abnormality level. Unlike other reconstruction-based methods, we present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level. The proposed self-supervised block is extremely flexible, enabling information masking at any layer of a neural network and being compatible with a wide range of neural architectures. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss. Furthermore, we show that our block is applicable to a wider variety of tasks, adding anomaly detection in medical images and thermal videos to the previously considered tasks based on RGB images and surveillance videos. We exhibit the generality and flexibility of SSMCTB by integrating it into multiple state-of-the-art neural models for anomaly detection, bringing forth empirical results that confirm considerable performance improvements on five benchmarks: MVTec AD, BRATS, Avenue, ShanghaiTech, and Thermal Rare Event.",

keywords = "Anomaly detection, Benchmark testing, Convolution, Image reconstruction, Task analysis, Three-dimensional displays, Transformers, abnormal event detection, anomaly detection, attention mechanism, masked convolution, self-attention, self-supervised learning, transformer, Abnormal event detection",

author = "Neelu Madan and Ristea, {Nicolae Catalin} and Ionescu, {Radu Tudor} and Kamal Nasrollahi and Khan, {Fahad Shahbaz} and Moeslund, {Thomas B.} and Mubarak Shah",

year = "2024",

month = jan,

day = "1",

doi = "10.1109/TPAMI.2023.3322604",

language = "English",

volume = "46",

pages = "525--542",

journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",

issn = "0162-8828",

publisher = "IEEE",

number = "1",

}

TY - JOUR

T1 - Self-supervised masked convolutional transformer block for anomaly detection

AU - Madan, Neelu

AU - Ristea, Nicolae Catalin

AU - Ionescu, Radu Tudor

AU - Nasrollahi, Kamal

AU - Khan, Fahad Shahbaz

AU - Moeslund, Thomas B.

AU - Shah, Mubarak

PY - 2024/1/1

Y1 - 2024/1/1

N2 - Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial production lines and impending event detection in video surveillance to finding lesions in medical scans. Regardless of the domain, anomaly detection is typically framed as a one-class classification task, where the learning is conducted on normal examples only. An entire family of successful anomaly detection methods is based on learning to reconstruct masked normal inputs (e.g. patches, future frames, etc.) and exerting the magnitude of the reconstruction error as an indicator for the abnormality level. Unlike other reconstruction-based methods, we present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level. The proposed self-supervised block is extremely flexible, enabling information masking at any layer of a neural network and being compatible with a wide range of neural architectures. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss. Furthermore, we show that our block is applicable to a wider variety of tasks, adding anomaly detection in medical images and thermal videos to the previously considered tasks based on RGB images and surveillance videos. We exhibit the generality and flexibility of SSMCTB by integrating it into multiple state-of-the-art neural models for anomaly detection, bringing forth empirical results that confirm considerable performance improvements on five benchmarks: MVTec AD, BRATS, Avenue, ShanghaiTech, and Thermal Rare Event.

AB - Anomaly detection has recently gained increasing attention in the field of computer vision, likely due to its broad set of applications ranging from product fault detection on industrial production lines and impending event detection in video surveillance to finding lesions in medical scans. Regardless of the domain, anomaly detection is typically framed as a one-class classification task, where the learning is conducted on normal examples only. An entire family of successful anomaly detection methods is based on learning to reconstruct masked normal inputs (e.g. patches, future frames, etc.) and exerting the magnitude of the reconstruction error as an indicator for the abnormality level. Unlike other reconstruction-based methods, we present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level. The proposed self-supervised block is extremely flexible, enabling information masking at any layer of a neural network and being compatible with a wide range of neural architectures. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss. Furthermore, we show that our block is applicable to a wider variety of tasks, adding anomaly detection in medical images and thermal videos to the previously considered tasks based on RGB images and surveillance videos. We exhibit the generality and flexibility of SSMCTB by integrating it into multiple state-of-the-art neural models for anomaly detection, bringing forth empirical results that confirm considerable performance improvements on five benchmarks: MVTec AD, BRATS, Avenue, ShanghaiTech, and Thermal Rare Event.

KW - Anomaly detection

KW - Benchmark testing

KW - Convolution

KW - Image reconstruction

KW - Task analysis

KW - Three-dimensional displays

KW - Transformers

KW - abnormal event detection

KW - anomaly detection

KW - attention mechanism

KW - masked convolution

KW - self-attention

KW - self-supervised learning

KW - transformer

KW - Abnormal event detection

UR - http://www.scopus.com/inward/record.url?scp=85174839196&partnerID=8YFLogxK

U2 - 10.1109/TPAMI.2023.3322604

DO - 10.1109/TPAMI.2023.3322604

M3 - Journal article

SN - 0162-8828

VL - 46

SP - 525

EP - 542

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

IS - 1

ER -
