TY - JOUR
T1 - Segmentation of RGB-D indoor scenes by stacking random forests and conditional random fields
AU - Thøgersen, Mikkel
AU - Guerrero, Sergio Escalera
AU - Gonzàlez, Jordi
AU - Moeslund, Thomas B.
PY - 2016
Y1 - 2016
N2 - Depth images have granted new possibilities to computer vision researchers across the field. A prominent task is scene understanding and segmentation, with which the present work is concerned. In this paper, we present a procedure combining well-known methods in a unified learning framework based on stacked classifiers; the benefits are twofold: on the one hand, the system scales well to consider different types of complex features and, on the other hand, the use of stacked classifiers makes the proposed technique more accurate. The proposed method consists of a random forest using random offset features in combination with a conditional random field (CRF) acting on a simple linear iterative clustering (SLIC) superpixel segmentation. The predictions of the CRF are filtered spatially by a multi-scale decomposition before being merged with the original feature set and passed to a stacked random forest, which gives the final predictions. The model is tested on the renowned NYU-v2 dataset and the recently available SUNRGBD dataset. The approach shows that simple multimodal features, combined with multi-class multi-scale stacked sequential learners (MMSSL), can achieve slightly better performance than state-of-the-art methods on the same dataset. The results show an improvement of 2.3% over the base model when using MMSSL and demonstrate that the method is effective in this problem domain.
AB - Depth images have granted new possibilities to computer vision researchers across the field. A prominent task is scene understanding and segmentation, with which the present work is concerned. In this paper, we present a procedure combining well-known methods in a unified learning framework based on stacked classifiers; the benefits are twofold: on the one hand, the system scales well to consider different types of complex features and, on the other hand, the use of stacked classifiers makes the proposed technique more accurate. The proposed method consists of a random forest using random offset features in combination with a conditional random field (CRF) acting on a simple linear iterative clustering (SLIC) superpixel segmentation. The predictions of the CRF are filtered spatially by a multi-scale decomposition before being merged with the original feature set and passed to a stacked random forest, which gives the final predictions. The model is tested on the renowned NYU-v2 dataset and the recently available SUNRGBD dataset. The approach shows that simple multimodal features, combined with multi-class multi-scale stacked sequential learners (MMSSL), can achieve slightly better performance than state-of-the-art methods on the same dataset. The results show an improvement of 2.3% over the base model when using MMSSL and demonstrate that the method is effective in this problem domain.
KW - RGB-D semantic segmentation
KW - Stacked sequential learning
KW - Conditional random fields
KW - Random forests using random offset features
UR - http://www.sciencedirect.com/science/article/pii/S016786551630157X
U2 - 10.1016/j.patrec.2016.06.024
DO - 10.1016/j.patrec.2016.06.024
M3 - Journal article
SN - 0167-8655
VL - 80
SP - 208
EP - 215
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -