Automatic Video-based Analysis of Human Motion

Publication: Book/anthology/thesis/report > Ph.D. thesis > Research

Abstract

Human motion carries valuable information in many situations, and people frequently analyze the motion of others unconsciously to understand their actions, intentions, and state of mind. Automatic analysis of human motion would enable many applications and has therefore received great interest from both industry and the research community.

This thesis focuses on video-based analysis of human motion and presents work on three overall topics: foreground segmentation, action recognition, and human pose estimation.

Foreground segmentation is often the first important step in the analysis of human motion. Separating foreground from background allows the subsequent analysis to be focused and efficient. This thesis presents a robust background subtraction method that can be initialized with foreground objects in the scene and is capable of handling foreground camouflage, shadows, and moving backgrounds. The method continuously updates the background model to maintain high-quality segmentation over long periods of time.
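To make the idea of a continuously updated background model concrete, the following is a minimal sketch of a generic running-average background subtractor. It is not the method presented in the thesis: the function name `update_background`, the learning rate `alpha`, and the fixed `threshold` are assumptions chosen for illustration, and the sketch does not handle camouflage, shadows, moving backgrounds, or initialization with foreground objects in the scene.

```python
def update_background(background, frame, alpha=0.05, threshold=30.0):
    """Return (new_background, foreground_mask) for one grayscale frame.

    background and frame are equally sized 2D lists of pixel intensities.
    alpha and threshold are illustrative values, not taken from the thesis.
    """
    new_background = []
    mask = []
    for bg_row, frame_row in zip(background, frame):
        new_row = []
        mask_row = []
        for bg, px in zip(bg_row, frame_row):
            # A pixel is foreground when it differs strongly from the model.
            is_foreground = abs(px - bg) > threshold
            mask_row.append(is_foreground)
            # Blend only background pixels into the model, so that
            # foreground objects are not absorbed into the background.
            new_row.append(bg if is_foreground else (1 - alpha) * bg + alpha * px)
        new_background.append(new_row)
        mask.append(mask_row)
    return new_background, mask
```

Calling the function once per frame and feeding the returned model into the next call lets the model follow slow illumination changes while bright or dark foreground objects stay out of it.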

Within action recognition, the thesis presents work on the recognition of both arm gestures and gait types. A key-frame-based approach is presented for recognizing arm gestures. The method extracts a set of characteristic poses and describes them by their local motion, resulting in motion primitives. A probabilistic edit distance is used to classify a sequence of motion primitives as a gesture. This 2D recognition process is extended to view-invariant recognition of arm gestures by using a range camera that generates 3D data and allows for a 3D equivalent of motion primitives. The recognition of gait types takes a different approach: it extracts silhouettes that are matched against a database. A gait continuum is introduced to better describe the whole range of gait and to deal with the inherent ambiguity of gait types.
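As a rough illustration of classifying a sequence of motion primitives by edit distance, the sketch below uses the plain (non-probabilistic) Levenshtein distance and picks the nearest gesture template. The probabilistic formulation used in the thesis is not described in this abstract, so this is a stand-in; the primitive labels and the names `edit_distance`, `classify`, and `templates` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of primitive labels."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def classify(sequence, templates):
    """Return the name of the template closest to the observed sequence."""
    return min(templates, key=lambda name: edit_distance(sequence, templates[name]))


# Hypothetical gesture templates expressed as primitive labels.
templates = {"point": ["A", "B", "C"], "wave": ["D", "E", "D", "E"]}
best = classify(["A", "C"], templates)
```

Here the observed sequence is missing one primitive compared with the "point" template, so that template is the nearest one. A probabilistic variant would replace the unit costs with costs derived from how likely each primitive is to be missed, inserted, or confused.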

Human pose estimation does not target a specific action but is considered a good basis for recognizing any action. The pose estimation work presented in this thesis is mainly concerned with the problems of interacting people and the complex occlusions that interactions produce. A pose estimation method based on the pictorial structures framework is presented. Body part detection combines edge and appearance information in a dynamic way. Occluded body parts are detected by pruning the foreground mask into a mask of possible occlusions. A multi-view approach to pose estimation is also presented that integrates low-level information from different cameras to generate better pose estimates during heavy occlusions.

The work presented in this thesis contributes to these different areas of video-based analysis of human motion and, taken together, brings fully automatic analysis and understanding of human motion closer.

Details

Original language: English
Publisher: Aalborg Universitet
Number of pages: 171
ISBN (Print): 978-87-992732-4-9
Status: Published - 15 Oct. 2011
Publication category: Research

ID: 56062779