Depth Value Pre-Processing for Accurate Transfer Learning Based RGB-D Object Recognition

Andreas Aakerberg; Kamal Nasrollahi; Christoffer Bøgelund Rasmussen; Thomas B. Moeslund

doi:10.5220/0006511501210128

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

Abstract

Object recognition is one of the important tasks in computer vision which has found enormous applications. Depth modality is proven to provide supplementary information to the common RGB modality for object recognition. In this paper, we propose methods to improve the recognition performance of an existing deep learning based RGB-D object recognition model, namely the FusionNet proposed by Eitel et al. First, we show that encoding the depth values as colorized surface normals is beneficial, when the model is initialized with weights learned from training on ImageNet data. Additionally, we show that the RGB stream of the FusionNet model can benefit from using deeper network architectures, namely the 16-layered VGGNet, in exchange for the 8-layered CaffeNet. In combination, these changes improve the recognition performance by 2.2% in comparison to the original FusionNet, when evaluating on the Washington RGB-D Object Dataset.
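
The abstract does not spell out how depth values are turned into colorized surface normals, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes the depth image is a plain 2D list of numbers, estimates gradients with finite differences, and maps each unit-normal component from [-1, 1] to an 8-bit color channel. The function name `depth_to_colorized_normals` is hypothetical, and a real pipeline would likely use an array library and account for camera intrinsics.

```python
import math


def depth_to_colorized_normals(depth):
    """Convert a 2D list of depth values into a 2D list of (R, G, B) tuples.

    Illustrative only: gradients are taken in pixel units, ignoring camera
    intrinsics and sensor noise.
    """
    rows, cols = len(depth), len(depth[0])

    def diff(r0, c0, r1, c1):
        # Finite difference between two pixels, scaled by their index distance.
        span = max(abs(r1 - r0), abs(c1 - c0))
        return (depth[r1][c1] - depth[r0][c0]) / span if span else 0.0

    image = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Central differences inside the image, one-sided at the borders.
            dz_dx = diff(r, max(c - 1, 0), r, min(c + 1, cols - 1))
            dz_dy = diff(max(r - 1, 0), c, min(r + 1, rows - 1), c)
            # For a surface z = f(x, y), an unnormalized normal is (-dz/dx, -dz/dy, 1).
            nx, ny, nz = -dz_dx, -dz_dy, 1.0
            norm = math.sqrt(nx * nx + ny * ny + nz * nz)
            # Map each unit-normal component from [-1, 1] to an 8-bit channel.
            row.append(tuple(int(round((v / norm + 1.0) * 127.5)) for v in (nx, ny, nz)))
        image.append(row)
    return image
```

The resulting three-channel image has the same layout as an RGB image, which is what makes it possible to feed it to a network initialized with ImageNet weights.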

Original language	English
Title of host publication	International Joint Conference on Computational Intelligence
Publisher	SCITEPRESS Digital Library
Publication date	2017
Pages	121-128
ISBN (Print)	978-989-758-274-5
DOIs	https://doi.org/10.5220/0006511501210128
Publication status	Published - 2017
Event	International Joint Conference on Computational Intelligence, Funchal, Portugal. Duration: 1 Nov 2017 → 3 Nov 2017. Conference number: 9. http://www.ijcci.org/

Conference

Conference	International Joint Conference on Computational Intelligence
Number	9
Country/Territory	Portugal
City	Funchal
Period	01/11/2017 → 03/11/2017
Internet address	http://www.ijcci.org/

Keywords

Deep Learning
Surface Normals
Computer Vision
Artificial Vision
RGB-D
Convolutional Neural Networks
Transfer Learning

Cite this

@inproceedings{0400b36cce8b4e3cb1a0d911b1678321,
  title = "Depth Value Pre-Processing for Accurate Transfer Learning Based RGB-D Object Recognition",
  abstract = "Object recognition is one of the important tasks in computer vision which has found enormous applications. Depth modality is proven to provide supplementary information to the common RGB modality for object recognition. In this paper, we propose methods to improve the recognition performance of an existing deep learning based RGB-D object recognition model, namely the FusionNet proposed by Eitel et al. First, we show that encoding the depth values as colorized surface normals is beneficial, when the model is initialized with weights learned from training on ImageNet data. Additionally, we show that the RGB stream of the FusionNet model can benefit from using deeper network architectures, namely the 16-layered VGGNet, in exchange for the 8-layered CaffeNet. In combination, these changes improve the recognition performance by 2.2\% in comparison to the original FusionNet, when evaluating on the Washington RGB-D Object Dataset.",
  keywords = "Deep Learning, Surface Normals, Computer Vision, Artificial Vision, RGB-D, Convolutional Neural Networks, Transfer Learning",
  author = "Aakerberg, Andreas and Nasrollahi, Kamal and Rasmussen, {Christoffer B{\o}gelund} and Moeslund, {Thomas B.}",
  year = "2017",
  doi = "10.5220/0006511501210128",
  language = "English",
  isbn = "978-989-758-274-5",
  pages = "121--128",
  booktitle = "International Joint Conference on Computational Intelligence",
  publisher = "SCITEPRESS Digital Library",
  note = "International Joint Conference on Computational Intelligence, IJCCI; Conference date: 01-11-2017 through 03-11-2017",
  url = "http://www.ijcci.org/",
}

Aakerberg, A, Nasrollahi, K, Rasmussen, CB & Moeslund, TB 2017, Depth Value Pre-Processing for Accurate Transfer Learning Based RGB-D Object Recognition. In International Joint Conference on Computational Intelligence. SCITEPRESS Digital Library, pp. 121-128, International Joint Conference on Computational Intelligence, Funchal, Portugal, 01/11/2017. https://doi.org/10.5220/0006511501210128

Depth Value Pre-Processing for Accurate Transfer Learning Based RGB-D Object Recognition. / Aakerberg, Andreas; Nasrollahi, Kamal; Rasmussen, Christoffer Bøgelund et al. International Joint Conference on Computational Intelligence. SCITEPRESS Digital Library, 2017. p. 121-128.

TY - GEN
T1 - Depth Value Pre-Processing for Accurate Transfer Learning Based RGB-D Object Recognition
AU - Aakerberg, Andreas
AU - Nasrollahi, Kamal
AU - Rasmussen, Christoffer Bøgelund
AU - Moeslund, Thomas B.
N1 - Conference code: 9
PY - 2017
Y1 - 2017
N2 - Object recognition is one of the important tasks in computer vision which has found enormous applications. Depth modality is proven to provide supplementary information to the common RGB modality for object recognition. In this paper, we propose methods to improve the recognition performance of an existing deep learning based RGB-D object recognition model, namely the FusionNet proposed by Eitel et al. First, we show that encoding the depth values as colorized surface normals is beneficial, when the model is initialized with weights learned from training on ImageNet data. Additionally, we show that the RGB stream of the FusionNet model can benefit from using deeper network architectures, namely the 16-layered VGGNet, in exchange for the 8-layered CaffeNet. In combination, these changes improve the recognition performance by 2.2% in comparison to the original FusionNet, when evaluating on the Washington RGB-D Object Dataset.
AB - Object recognition is one of the important tasks in computer vision which has found enormous applications. Depth modality is proven to provide supplementary information to the common RGB modality for object recognition. In this paper, we propose methods to improve the recognition performance of an existing deep learning based RGB-D object recognition model, namely the FusionNet proposed by Eitel et al. First, we show that encoding the depth values as colorized surface normals is beneficial, when the model is initialized with weights learned from training on ImageNet data. Additionally, we show that the RGB stream of the FusionNet model can benefit from using deeper network architectures, namely the 16-layered VGGNet, in exchange for the 8-layered CaffeNet. In combination, these changes improve the recognition performance by 2.2% in comparison to the original FusionNet, when evaluating on the Washington RGB-D Object Dataset.
KW - Deep Learning
KW - Surface Normals
KW - Computer Vision
KW - Artificial Vision
KW - RGB-D
KW - Convolutional Neural Networks
KW - Transfer Learning
U2 - 10.5220/0006511501210128
DO - 10.5220/0006511501210128
M3 - Article in proceeding
SN - 978-989-758-274-5
SP - 121
EP - 128
BT - International Joint Conference on Computational Intelligence
PB - SCITEPRESS Digital Library
T2 - International Joint Conference on Computational Intelligence
Y2 - 1 November 2017 through 3 November 2017
ER -
