A Fast Monocular 6D Pose Estimation Method for Textureless Objects Based on Perceptual Hashing and Template Matching

Jose Moises Araya-Martinez, Vinicius Soares Matthiesen, Simon Bøgh, Jens Lambrecht, Rui Miguel Horta Pimentel de Figueiredo

Publication: Contribution to journal, journal article, research, peer-reviewed

Abstract

Object pose estimation is essential for computer vision applications such as quality inspection, robotic bin picking, and warehouse logistics. However, this task often requires expensive equipment such as 3D cameras or Lidar sensors, as well as significant computational resources.

Many state-of-the-art methods for 6D pose estimation depend on deep neural networks, which are computationally demanding and require GPUs for real-time performance. Moreover, they usually involve the collection and labeling of large training datasets, which is costly and time-consuming.

We propose a template-based matching algorithm that utilizes a novel perceptual hashing method for binary images, enabling fast and robust pose estimation. This approach allows the automatic preselection of a subset of templates, significantly reducing inference time while maintaining similar accuracy. Our solution runs efficiently on multiple devices without GPU support, offering reduced runtime and high accuracy on cost-effective hardware.

We benchmarked our proposed approach on a body-in-white automotive part, relevant to the automotive industry, and on a widely used, publicly available dataset. Our experiments on a synthetically generated dataset reveal a superior trade-off between accuracy and computation time compared to a previous work evaluated on the same automotive-production use case. Additionally, our algorithm efficiently utilizes all CPU cores and includes adjustable parameters for balancing computation time and accuracy, making it suitable for a wide range of applications where hardware cost and power efficiency are critical. For instance, with a rotation step of 10° in the template database, we achieve an average rotation error of 10°, matching the template quantization level, and an average translation error of 14% of the object's size, with an average processing time of 0.3 s per image on a small form-factor Nvidia AGX Orin device.

We also evaluate robustness under partial occlusions (up to 10% occlusion) and noisy inputs (SNRs up to 10 dB), with only minor losses in accuracy. Additionally, we compare our method to state-of-the-art deep learning models on a public dataset. While our algorithm does not outperform them in absolute accuracy, it provides a more favorable trade-off between accuracy and processing time, which is especially relevant to applications employing resource-constrained devices.
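
As a rough illustration of how hash-based template preselection can work in general, the sketch below hashes binary masks and keeps the templates closest to a query in Hamming distance. It is not the hashing method proposed in the paper, whose details the abstract does not give: it assumes a generic block-mean hash, and the names `block_mean_hash`, `preselect_templates`, the 8x8 grid, and the toy masks are all hypothetical.

```python
import numpy as np


def block_mean_hash(mask: np.ndarray, grid: int = 8) -> np.ndarray:
    """Hash a binary mask into grid*grid bits (generic block-mean hash, an assumption)."""
    mask = np.asarray(mask, dtype=np.float64)
    h, w = mask.shape
    # Crop so that both sides divide evenly into grid cells.
    h_crop, w_crop = h - h % grid, w - w % grid
    if h_crop == 0 or w_crop == 0:
        raise ValueError("mask is smaller than the hash grid")
    cells = mask[:h_crop, :w_crop].reshape(grid, h_crop // grid, grid, w_crop // grid)
    occupancy = cells.mean(axis=(1, 3))  # foreground fraction per cell
    # A bit is set where a cell is fuller than the average cell.
    return (occupancy > occupancy.mean()).ravel()


def preselect_templates(query_hash: np.ndarray, template_hashes: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k templates with the smallest Hamming distance."""
    distances = np.count_nonzero(template_hashes != query_hash, axis=1)
    return np.argsort(distances, kind="stable")[:k]


def make_mask(rows: slice, cols: slice, size: int = 64) -> np.ndarray:
    """Build a toy binary silhouette: a filled rectangle on an empty background."""
    mask = np.zeros((size, size), dtype=np.uint8)
    mask[rows, cols] = 1
    return mask


# Toy template database: horizontal bar, vertical bar, centered square.
templates = [
    make_mask(slice(24, 40), slice(8, 56)),
    make_mask(slice(8, 56), slice(24, 40)),
    make_mask(slice(16, 48), slice(16, 48)),
]
template_hashes = np.stack([block_mean_hash(t) for t in templates])

# Query: the horizontal bar with one end occluded.
query = templates[0].copy()
query[24:40, 48:56] = 0

candidates = preselect_templates(block_mean_hash(query), template_hashes, k=2)
print(candidates)
```

Only the preselected candidates would then go through the more expensive matching stage, which is where the reduction in inference time described above would come from. The value of `k` here stands in for the kind of adjustable parameter that trades computation time against accuracy.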
Original language: English
Journal: Frontiers in Robotics and AI
Volume: 11
Status: Published - 8 Jan. 2025
