Deep Reinforcement Learning for Robot Batching Optimization and Flow Control

Max Hildebrand, Rasmus Skovgaard Andersen, Simon Bøgh

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

9 Citations (Scopus)
59 Downloads (Pure)

Abstract

Robot batching is an optimization problem found in many industrial applications. Current state-of-the-art approaches combine heuristic-based parameters with statistical analysis. This requires many tunable parameters, which in turn poses challenges when delivering systems to new customers. We challenge the current state of the art in statistical approaches by presenting a novel application of a policy gradient method for a Deep Reinforcement Learning (DRL/RL) agent. We have developed a Unity simulation framework of an existing robot-batching cell, on which an RL agent is able to train successfully and obtain a policy for performing robot batching, using a tabula rasa approach. The trained agent is capable of packaging 47.86% of 1218 total batches within the prescribed tolerances, with a positive give-away of 8.76%. To the authors' knowledge, this application of DRL to robot batching is the first of its kind.
Original language: English
Journal: Procedia Manufacturing
Volume: 51
Pages (from-to): 1462-1468
Number of pages: 7
ISSN: 2351-9789
Publication status: Published - Nov 2020
Event: 30th International Conference on Flexible Automation and Intelligent Manufacturing - Athens, Greece
Duration: 15 Jun 2021 → 18 Jun 2021
https://www.faimconference.org/

Conference

Conference: 30th International Conference on Flexible Automation and Intelligent Manufacturing
Country/Territory: Greece
City: Athens
Period: 15/06/2021 → 18/06/2021

Keywords

  • Reinforcement Learning
  • Deep Reinforcement Learning
  • Artificial Intelligence
  • Robotics
  • Smart Manufacturing
  • Proximal Policy Optimization
  • Deep Learning
