Human Pose Estimation and Activity Recognition from Multi-View Videos: Comparative Explorations of Recent Developments

Michael Boelstoft Holte, Cuong Tran, Mohan Trivedi, Thomas B. Moeslund

Publication: Contribution to journal > Journal article > Research > peer-reviewed

75 Citations (Scopus)

Abstract

This paper presents a review and comparative study of recent multi-view approaches for human 3D pose estimation and activity recognition. We discuss the application domain of human pose estimation and activity recognition and the associated requirements, covering: advanced human–computer interaction (HCI), assisted living, gesture-based interactive games, intelligent driver assistance systems, movies, 3D TV and animation, physical therapy, autonomous mental development, smart environments, sport motion analysis, video surveillance, and video annotation. Next, we review and categorize recent approaches which have been proposed to comply with these requirements. We report a comparison of the most promising methods for multi-view human action recognition using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) Multi-View Human Action Dataset, and the i3DPost Multi-View Human Action and Interaction Dataset. To compare the proposed methods, we give a qualitative assessment of methods which cannot be compared quantitatively, and analyze some prominent 3D pose estimation techniques for application, where not only the performed action needs to be identified but a more detailed description of the body pose and joint configuration. Finally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D body pose estimation and human action recognition.
Original language: English
Journal: IEEE Journal of Selected Topics in Signal Processing
Volume: 6
Issue number: 5
Pages (from-to): 538-552
Number of pages: 15
ISSN: 1932-4553
DOI: 10.1109/JSTSP.2012.2196975
Status: Published - 2012

Fingerprint

Physical therapy
Sports
Animation
Cameras
Motion analysis
Assisted living

Cite this

@article{31a47b17e76a4fe0802202adfbb406c2,
title = "Human Pose Estimation and Activity Recognition from Multi-View Videos: Comparative Explorations of Recent Developments",
abstract = "This paper presents a review and comparative study of recent multi-view approaches for human 3D pose estimation and activity recognition. We discuss the application domain of human pose estimation and activity recognition and the associated requirements, covering: advanced human–computer interaction (HCI), assisted living, gesture-based interactive games, intelligent driver assistance systems, movies, 3D TV and animation, physical therapy, autonomous mental development, smart environments, sport motion analysis, video surveillance, and video annotation. Next, we review and categorize recent approaches which have been proposed to comply with these requirements. We report a comparison of the most promising methods for multi-view human action recognition using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) Multi-View Human Action Dataset, and the i3DPost Multi-View Human Action and Interaction Dataset. To compare the proposed methods, we give a qualitative assessment of methods which cannot be compared quantitatively, and analyze some prominent 3D pose estimation techniques for application, where not only the performed action needs to be identified but a more detailed description of the body pose and joint configuration. Finally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D body pose estimation and human action recognition.",
author = "Holte, {Michael Boelstoft} and Cuong Tran and Mohan Trivedi and Moeslund, {Thomas B.}",
year = "2012",
doi = "10.1109/JSTSP.2012.2196975",
language = "English",
volume = "6",
pages = "538--552",
journal = "IEEE Journal of Selected Topics in Signal Processing",
issn = "1932-4553",
publisher = "IEEE",
number = "5",

}

Human Pose Estimation and Activity Recognition from Multi-View Videos: Comparative Explorations of Recent Developments. / Holte, Michael Boelstoft; Tran, Cuong; Trivedi, Mohan; Moeslund, Thomas B.

In: IEEE Journal of Selected Topics in Signal Processing, Vol. 6, No. 5, 2012, p. 538-552.

TY - JOUR

T1 - Human Pose Estimation and Activity Recognition from Multi-View Videos

T2 - Comparative Explorations of Recent Developments

AU - Holte, Michael Boelstoft

AU - Tran, Cuong

AU - Trivedi, Mohan

AU - Moeslund, Thomas B.

PY - 2012

Y1 - 2012

N2 - This paper presents a review and comparative study of recent multi-view approaches for human 3D pose estimation and activity recognition. We discuss the application domain of human pose estimation and activity recognition and the associated requirements, covering: advanced human–computer interaction (HCI), assisted living, gesture-based interactive games, intelligent driver assistance systems, movies, 3D TV and animation, physical therapy, autonomous mental development, smart environments, sport motion analysis, video surveillance, and video annotation. Next, we review and categorize recent approaches which have been proposed to comply with these requirements. We report a comparison of the most promising methods for multi-view human action recognition using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) Multi-View Human Action Dataset, and the i3DPost Multi-View Human Action and Interaction Dataset. To compare the proposed methods, we give a qualitative assessment of methods which cannot be compared quantitatively, and analyze some prominent 3D pose estimation techniques for application, where not only the performed action needs to be identified but a more detailed description of the body pose and joint configuration. Finally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D body pose estimation and human action recognition.

AB - This paper presents a review and comparative study of recent multi-view approaches for human 3D pose estimation and activity recognition. We discuss the application domain of human pose estimation and activity recognition and the associated requirements, covering: advanced human–computer interaction (HCI), assisted living, gesture-based interactive games, intelligent driver assistance systems, movies, 3D TV and animation, physical therapy, autonomous mental development, smart environments, sport motion analysis, video surveillance, and video annotation. Next, we review and categorize recent approaches which have been proposed to comply with these requirements. We report a comparison of the most promising methods for multi-view human action recognition using two publicly available datasets: the INRIA Xmas Motion Acquisition Sequences (IXMAS) Multi-View Human Action Dataset, and the i3DPost Multi-View Human Action and Interaction Dataset. To compare the proposed methods, we give a qualitative assessment of methods which cannot be compared quantitatively, and analyze some prominent 3D pose estimation techniques for application, where not only the performed action needs to be identified but a more detailed description of the body pose and joint configuration. Finally, we discuss some of the shortcomings of multi-view camera setups and outline our thoughts on future directions of 3D body pose estimation and human action recognition.

UR - http://www.scopus.com/inward/record.url?scp=84865384447&partnerID=8YFLogxK

U2 - 10.1109/JSTSP.2012.2196975

DO - 10.1109/JSTSP.2012.2196975

M3 - Journal article

VL - 6

SP - 538

EP - 552

JO - IEEE Journal of Selected Topics in Signal Processing

JF - IEEE Journal of Selected Topics in Signal Processing

SN - 1932-4553

IS - 5

ER -