Human Pose Estimation and Activity Recognition from Multi-View Videos: Comparative Explorations of Recent Developments

Michael Boelstoft Holte, Cuong Tran, Mohan Trivedi, Thomas B. Moeslund

Research output: Contribution to journal › Journal article › Research › peer-review

75 Citations (Scopus)

Abstract

This paper presents a review and comparative study of recent multi-view approaches for human 3D pose estimation and activity recognition. We discuss the application domain of human pose estimation and activity recognition and the associated requirements, covering: advanced human–computer interaction (HCI), assisted living, gesture-based interactive games, intelligent driver assistance systems, movies, 3D TV and animation, physical therapy, autonomous mental development, smart environments, sport motion analysis, video surveillance, and video annotation. Next, we review and categorize recent approaches which have been proposed to comply with these requirements. We report a comparison of the most promising methods for multi-view human action recognition using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) Multi-View Human Action Dataset, and the i3DPost Multi-View Human Action and Interaction Dataset. To compare the proposed methods, we give a qualitative assessment of methods which cannot be compared quantitatively, and analyze some prominent 3D pose estimation techniques for application, where not only the performed action needs to be identified but a more detailed description of the body pose and joint configuration. Finally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D body pose estimation and human action recognition.
Original language: English
Journal: IEEE Journal of Selected Topics in Signal Processing
Volume: 6
Issue number: 5
Pages (from-to): 538-552
Number of pages: 15
ISSN: 1932-4553
DOI: 10.1109/JSTSP.2012.2196975
Publication status: Published - 2012

Fingerprint

Physical therapy
Sports
Animation
Cameras
Motion analysis
Assisted living

Cite this

@article{31a47b17e76a4fe0802202adfbb406c2,
title = "Human Pose Estimation and Activity Recognition from Multi-View Videos: Comparative Explorations of Recent Developments",
abstract = "This paper presents a review and comparative study of recent multi-view approaches for human 3D pose estimation and activity recognition. We discuss the application domain of human pose estimation and activity recognition and the associated requirements, covering: advanced human–computer interaction (HCI), assisted living, gesture-based interactive games, intelligent driver assistance systems, movies, 3D TV and animation, physical therapy, autonomous mental development, smart environments, sport motion analysis, video surveillance, and video annotation. Next, we review and categorize recent approaches which have been proposed to comply with these requirements. We report a comparison of the most promising methods for multi-view human action recognition using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) Multi-View Human Action Dataset, and the i3DPost Multi-View Human Action and Interaction Dataset. To compare the proposed methods, we give a qualitative assessment of methods which cannot be compared quantitatively, and analyze some prominent 3D pose estimation techniques for application, where not only the performed action needs to be identified but a more detailed description of the body pose and joint configuration. Finally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D body pose estimation and human action recognition.",
author = "Holte, {Michael Boelstoft} and Cuong Tran and Mohan Trivedi and Moeslund, {Thomas B.}",
year = "2012",
doi = "10.1109/JSTSP.2012.2196975",
language = "English",
volume = "6",
pages = "538--552",
journal = "IEEE Journal of Selected Topics in Signal Processing",
issn = "1932-4553",
publisher = "IEEE",
number = "5",
}

Human Pose Estimation and Activity Recognition from Multi-View Videos: Comparative Explorations of Recent Developments. / Holte, Michael Boelstoft; Tran, Cuong; Trivedi, Mohan; Moeslund, Thomas B.

In: IEEE Journal of Selected Topics in Signal Processing, Vol. 6, No. 5, 2012, p. 538-552.

TY - JOUR

T1 - Human Pose Estimation and Activity Recognition from Multi-View Videos

T2 - Comparative Explorations of Recent Developments

AU - Holte, Michael Boelstoft

AU - Tran, Cuong

AU - Trivedi, Mohan

AU - Moeslund, Thomas B.

PY - 2012

Y1 - 2012

AB - This paper presents a review and comparative study of recent multi-view approaches for human 3D pose estimation and activity recognition. We discuss the application domain of human pose estimation and activity recognition and the associated requirements, covering: advanced human–computer interaction (HCI), assisted living, gesture-based interactive games, intelligent driver assistance systems, movies, 3D TV and animation, physical therapy, autonomous mental development, smart environments, sport motion analysis, video surveillance, and video annotation. Next, we review and categorize recent approaches which have been proposed to comply with these requirements. We report a comparison of the most promising methods for multi-view human action recognition using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) Multi-View Human Action Dataset, and the i3DPost Multi-View Human Action and Interaction Dataset. To compare the proposed methods, we give a qualitative assessment of methods which cannot be compared quantitatively, and analyze some prominent 3D pose estimation techniques for application, where not only the performed action needs to be identified but a more detailed description of the body pose and joint configuration. Finally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D body pose estimation and human action recognition.

UR - http://www.scopus.com/inward/record.url?scp=84865384447&partnerID=8YFLogxK

U2 - 10.1109/JSTSP.2012.2196975

DO - 10.1109/JSTSP.2012.2196975

M3 - Journal article

VL - 6

SP - 538

EP - 552

JO - IEEE Journal of Selected Topics in Signal Processing

JF - IEEE Journal of Selected Topics in Signal Processing

SN - 1932-4553

IS - 5

ER -