Understanding Behaviors in Videos through Behavior-Specific Dictionaries

Huamin Ren, Weifeng Liu, Søren Ingvor Olsen, Sergio Escalera, Thomas B. Moeslund

Research output: Working paper/Preprint (Working paper, Research)


Understanding behaviors is at the core of video content analysis and is closely tied to two important applications: abnormal event detection and action recognition. Dictionary learning, a mid-level representation, is an important step in video processing. It has achieved state-of-the-art performance in both applications and has attracted growing attention as a result. Despite this progress, a dictionary built for one task is hard to apply to the other, which limits the applicability of the algorithm and fails to meet scalability needs. A dictionary built for abnormality detection may misdetect normal behavior that rarely appears in training datasets, even though it may be very common in daily life. Conversely, a dictionary built for action recognition may misclassify a new action category as an existing one. We therefore construct Behavior-Specific Dictionaries (BSDs) to address both applications within a unified framework. To the best of our knowledge, this is the first generalized dictionary algorithm that successfully handles both action recognition and abnormality detection. For abnormality detection, our BSD algorithm outperforms state-of-the-art methods on the UCSD dataset under strict pixel-level evaluation, with a 10% improvement in AUC. On the AAU Anomaly Stairs Dataset, we achieve a state-of-the-art average true positive rate while obtaining the lowest average false positive rate. For action recognition, our BSD algorithm outperforms the state of the art on the UCF YouTube Action dataset and achieves competitive results on the UCF50 dataset.
Original language: English
Number of pages: 58
Publication status: Unpublished - 2019

