Automatic Video-based Analysis of Human Motion

Preben Fihl

Publication: Book/anthology/thesis/report › PhD thesis › Research

Abstract

Human motion carries valuable information in many situations, and people frequently analyze the motion of others, often unconsciously, to understand their actions, intentions, and state of mind. Automatic analysis of human motion would enable many applications and has therefore attracted great interest from both industry and the research community.

This thesis focuses on video-based analysis of human motion and presents work within three overall topics: foreground segmentation, action recognition, and human pose estimation.

Foreground segmentation is often the first important step in the analysis of human motion. Separating foreground from background lets the subsequent analysis be focused and efficient. This thesis presents a robust background subtraction method that can be initialized while foreground objects are present in the scene and that handles foreground camouflage, shadows, and moving backgrounds. The method continuously updates the background model to maintain high-quality segmentation over long periods of time.
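
The idea of a continuously updated background model can be illustrated with a minimal sketch. This is a generic running-average background subtraction, not the method developed in the thesis: it does not handle shadows, camouflage, or initialization with foreground objects present. The function name `segment_and_update` and the parameters `alpha` (learning rate) and `threshold` are assumptions made for illustration.

```python
def segment_and_update(background, frame, alpha=0.05, threshold=25.0):
    """Label foreground pixels and update a running-average background model.

    background and frame are equally sized lists of rows of grey values.
    Returns (mask, new_background).
    """
    mask, new_background = [], []
    for bg_row, fr_row in zip(background, frame):
        mask_row, new_row = [], []
        for bg, px in zip(bg_row, fr_row):
            # A pixel far from the model is labelled foreground.
            is_foreground = abs(px - bg) > threshold
            mask_row.append(is_foreground)
            # Only background pixels are blended into the model, so a
            # foreground object is not absorbed into the background.
            new_row.append(bg if is_foreground else (1 - alpha) * bg + alpha * px)
        mask.append(mask_row)
        new_background.append(new_row)
    return mask, new_background


# Toy 4x4 scene (hypothetical values): a slight illumination change
# everywhere plus a bright two-pixel foreground object.
background = [[100.0] * 4 for _ in range(4)]
frame = [[110.0] * 4 for _ in range(4)]
frame[1][1] = frame[1][2] = 220.0
mask, background = segment_and_update(background, frame)
```

Updating only the pixels labelled as background is one simple way to let the model follow gradual scene changes without learning the foreground; it is a design choice of this sketch rather than a claim about the thesis.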

Within action recognition, the thesis presents work on recognizing both arm gestures and gait types. Arm gestures are recognized with a key-frame-based approach: the method extracts a set of characteristic poses and describes them by their local motion, which yields motion primitives. A probabilistic edit distance is then used to classify a sequence of motion primitives as a gesture. This 2D recognition process is extended to view-invariant recognition of arm gestures by using a range camera, which generates 3D data and allows for a 3D equivalent of the motion primitives. The recognition of gait types takes a different approach and extracts silhouettes that are matched against a database. A gait continuum is introduced to better describe the whole range of gait and to deal with the inherent ambiguity of gait types.
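
Classifying a sequence of primitives by edit distance can be sketched as follows. The sketch uses the plain (Levenshtein-style) edit distance with fixed costs, not the probabilistic variant used in the thesis, and the primitive labels and gesture names are hypothetical placeholders.

```python
def edit_distance(observed, template, sub_cost=1.0, indel_cost=1.0):
    """Edit distance between two sequences of primitive labels."""
    rows, cols = len(observed) + 1, len(template) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost
    for j in range(1, cols):
        dist[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            same = observed[i - 1] == template[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,                      # deletion
                dist[i][j - 1] + indel_cost,                      # insertion
                dist[i - 1][j - 1] + (0.0 if same else sub_cost), # match or substitution
            )
    return dist[-1][-1]


def classify(observed, templates):
    """Return the name of the template closest to the observed sequence."""
    return min(templates, key=lambda name: edit_distance(observed, templates[name]))


# Hypothetical gesture templates expressed as sequences of primitive labels.
templates = {
    "point_right": ["A", "B", "C", "D"],
    "raise_arm": ["A", "E", "F", "G"],
}
observed = ["A", "B", "D"]  # one primitive was missed by the detector
label = classify(observed, templates)
```

An edit distance tolerates missed, spurious, or misrecognized primitives, which is why it suits noisy sequences; a probabilistic version would replace the fixed costs with costs derived from detection likelihoods.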

Human pose estimation does not target a specific action but is considered a good basis for recognizing any action. The pose estimation work in this thesis is mainly concerned with interacting people and the complex occlusions that interactions produce. A pose estimation method based on the pictorial structures framework is presented. Body part detection combines edge and appearance information dynamically. Occluded body parts are detected by pruning the foreground mask into a mask of possible occlusions. A multi-view approach to pose estimation is also presented; it integrates low-level information from different cameras to generate better pose estimates during heavy occlusions.

The work presented in this thesis contributes to these different areas of video-based analysis of human motion and, taken together, brings fully automatic analysis and understanding of human motion closer.

Original language: English
Publisher: Aalborg Universitet
Number of pages: 171
ISBN (Print): 978-87-992732-4-9
Status: Published - 15 Oct 2011

Fingerprint

Masks
Cameras
Camouflage
Industry

Keywords

Human motion, Computer vision, Foreground segmentation, Video surveillance, Human pose estimation, Action recognition

Cite this

Fihl, P. (2011). Automatic Video-based Analysis of Human Motion. Aalborg Universitet.