Understanding Behaviors in Videos through Behavior-Specific Dictionaries

Huamin Ren, Weifeng Liu, Søren Ingvor Olsen, Sergio Escalera, Thomas B. Moeslund

Research output: Contribution to journal › Review article › Research › peer-review

Abstract

Understanding behaviors is the core of video content analysis and is closely related to two important applications: abnormal event detection and action recognition. Dictionary learning, a mid-level representation, is an important step in processing a video. It has achieved state-of-the-art performance in both applications and has attracted growing attention as a result. Despite this progress, a dictionary built for one task is hard to apply to the other, which both limits the applicability of the algorithm and fails to meet scalability needs. A dictionary built for abnormality detection may misdetect normal behavior that rarely appears in training datasets, even though it may be very common in daily life. Conversely, a dictionary built for action recognition may misclassify a new action category as an existing action. We therefore construct Behavior-Specific Dictionaries (BSDs) to address both applications through a unified framework. To the best of our knowledge, this is the first generalized dictionary algorithm that successfully handles both action recognition and abnormality detection. For abnormality detection, our BSD algorithm outperforms state-of-the-art methods on the UCSD dataset under strict pixel-level evaluation, improving AUC by 10%. On the AAU Anomaly Stairs Dataset, we achieve a state-of-the-art average true positive rate while obtaining the lowest average false positive rate. For action recognition, our BSD algorithm outperforms the state of the art on the UCF YouTube Action dataset. Furthermore, we achieve competitive results on the UCF50 dataset.
Original language: English
Journal: Computer Vision and Image Understanding
Pages (from-to): 1-58
Number of pages: 58
ISSN: 1077-3142
Publication status: Accepted/In press - 2019

Fingerprint

Glossaries
Stairs
Scalability
Pixels

Cite this

Ren, Huamin ; Liu, Weifeng ; Olsen, Søren Ingvor ; Escalera, Sergio ; Moeslund, Thomas B. / Understanding Behaviors in Videos through Behavior-Specific Dictionaries. In: Computer Vision and Image Understanding. 2019 ; pp. 1-58.
@article{eb62c6cf42f74734b599818d185c4fea,
title = "Understanding Behaviors in Videos through Behavior-Specific Dictionaries",
abstract = "Understanding behaviors is the core of video content analysis and is closely related to two important applications: abnormal event detection and action recognition. Dictionary learning, a mid-level representation, is an important step in processing a video. It has achieved state-of-the-art performance in both applications and has attracted growing attention as a result. Despite this progress, a dictionary built for one task is hard to apply to the other, which both limits the applicability of the algorithm and fails to meet scalability needs. A dictionary built for abnormality detection may misdetect normal behavior that rarely appears in training datasets, even though it may be very common in daily life. Conversely, a dictionary built for action recognition may misclassify a new action category as an existing action. We therefore construct Behavior-Specific Dictionaries (BSDs) to address both applications through a unified framework. To the best of our knowledge, this is the first generalized dictionary algorithm that successfully handles both action recognition and abnormality detection. For abnormality detection, our BSD algorithm outperforms state-of-the-art methods on the UCSD dataset under strict pixel-level evaluation, improving AUC by 10{\%}. On the AAU Anomaly Stairs Dataset, we achieve a state-of-the-art average true positive rate while obtaining the lowest average false positive rate. For action recognition, our BSD algorithm outperforms the state of the art on the UCF YouTube Action dataset. Furthermore, we achieve competitive results on the UCF50 dataset.",
author = "Huamin Ren and Weifeng Liu and Olsen, {S{\o}ren Ingvor} and Sergio Escalera and Moeslund, {Thomas B.}",
year = "2019",
language = "English",
pages = "1--58",
journal = "Computer Vision and Image Understanding",
issn = "1077-3142",
publisher = "Academic Press",

}

Understanding Behaviors in Videos through Behavior-Specific Dictionaries. / Ren, Huamin; Liu, Weifeng; Olsen, Søren Ingvor; Escalera, Sergio; Moeslund, Thomas B.

In: Computer Vision and Image Understanding, 2019, p. 1-58.

TY - JOUR
T1 - Understanding Behaviors in Videos through Behavior-Specific Dictionaries
AU - Ren, Huamin
AU - Liu, Weifeng
AU - Olsen, Søren Ingvor
AU - Escalera, Sergio
AU - Moeslund, Thomas B.
PY - 2019
Y1 - 2019
AB - Understanding behaviors is the core of video content analysis and is closely related to two important applications: abnormal event detection and action recognition. Dictionary learning, a mid-level representation, is an important step in processing a video. It has achieved state-of-the-art performance in both applications and has attracted growing attention as a result. Despite this progress, a dictionary built for one task is hard to apply to the other, which both limits the applicability of the algorithm and fails to meet scalability needs. A dictionary built for abnormality detection may misdetect normal behavior that rarely appears in training datasets, even though it may be very common in daily life. Conversely, a dictionary built for action recognition may misclassify a new action category as an existing action. We therefore construct Behavior-Specific Dictionaries (BSDs) to address both applications through a unified framework. To the best of our knowledge, this is the first generalized dictionary algorithm that successfully handles both action recognition and abnormality detection. For abnormality detection, our BSD algorithm outperforms state-of-the-art methods on the UCSD dataset under strict pixel-level evaluation, improving AUC by 10%. On the AAU Anomaly Stairs Dataset, we achieve a state-of-the-art average true positive rate while obtaining the lowest average false positive rate. For action recognition, our BSD algorithm outperforms the state of the art on the UCF YouTube Action dataset. Furthermore, we achieve competitive results on the UCF50 dataset.
M3 - Review article
SP - 1
EP - 58
JO - Computer Vision and Image Understanding
JF - Computer Vision and Image Understanding
SN - 1077-3142
ER -