Reinforcement learning is an approach from artificial intelligence (AI) that mathematically replicates natural learning. A chocolate factory demonstrator shows how application-specific optimized controls can be developed much faster in this way.

AI and the Chocolate Factory

Article from Siemens

Several conveyor belts transport chocolate bars. They are part of a demonstrator machine that shows how artificial intelligence can be used for motion control. In a real factory, the remaining step would be to pack the chocolate bars, automatically of course. In this Intelligent Infeed Demonstrator from Siemens Digital Industries, the chocolate bars must be placed in evenly spaced slots on the outfeed belt. “The bars are placed on the infeed belt at random intervals,” says Martin Bischoff, Expert in Virtual Mechatronics at Technology, the research division at Siemens. “The system controller positions them by altering the speeds of the conveyor belts: a line of three conveyor belts can be accelerated or slowed down to ensure that each bar is positioned correctly on the outfeed belt. Developing an optimized control algorithm for this application is a tricky programming task. If you don’t believe it, just try it yourself. Using reinforcement learning, we have trained an AI controller to perform this task.”


Reinforcement Learning: Failure and Success

Reinforcement learning is an artificial intelligence method that works in much the same way as most people learn to ride a bicycle: by trial and error, without any knowledge of the underlying physics. Novice cyclists find out directly while riding whether their technique works, and so they gradually get better. “This is exactly how reinforcement learning works,” explains Michel Tokic, a fellow expert at Technology and a lecturer in applied reinforcement learning at Munich’s Ludwig Maximilian University. “The AI is given a target specification, such as 'the chocolate bars may only be placed in the target slots, and the system should work as quickly as possible in the process'. The AI then makes control attempts on the simulation model, completely at random to begin with, and receives feedback, triggered by light barrier signals, on how good each attempt was. With this feedback, a goal-directed control algorithm emerges after many automated training cycles.”
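The trial-and-error loop described above can be sketched in a few lines of Python. This is a minimal illustration using tabular Q-learning on a toy model, not the method or the simulation used by Siemens: the single-bar "offset" state, the three speed commands, the reward values and all function names are assumptions made for this example.

```python
import random

ACTIONS = (-1, 0, 1)   # hypothetical belt commands: slow down, hold, speed up
MAX_OFFSET = 5         # bar position relative to its slot, arbitrary units

def step(offset, action):
    """Toy simulation model: returns (new_offset, reward, done)."""
    new_offset = max(-MAX_OFFSET, min(MAX_OFFSET, offset + action))
    if new_offset == 0:
        return new_offset, 1.0, True    # feedback: bar sits in its slot
    return new_offset, -0.1, False      # small time penalty favors speed

def train(episodes=2000, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    states = range(-MAX_OFFSET, MAX_OFFSET + 1)
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    for episode in range(episodes):
        # Exploration starts completely random and decays to 5 %.
        epsilon = max(0.05, 1.0 - episode / (episodes / 2))
        offset = rng.choice([s for s in states if s != 0])
        for _ in range(50):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                      # random attempt
            else:
                action = max(ACTIONS, key=lambda a: q[(offset, a)])
            new_offset, reward, done = step(offset, action)
            best_next = 0.0 if done else max(q[(new_offset, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward the observed feedback.
            q[(offset, action)] += alpha * (reward + gamma * best_next - q[(offset, action)])
            offset = new_offset
            if done:
                break
    return q

def greedy_action(q, offset):
    """Control rule that emerges from training: pick the best-rated command."""
    return max(ACTIONS, key=lambda a: q[(offset, a)])
```

After training, calling greedy_action(q, offset) for each offset gives a rule that steers the bar toward its slot, even though no such rule was ever programmed explicitly. A real plant has a far larger state space, so a lookup table like this one would typically be replaced by a learned function approximator.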


Training on the Digital Twin

Errors in a plant control system can have expensive or dangerous consequences. For this reason, control systems are routinely developed and tested without risk on digital twins of the plants (Siemens Virtual Commissioning). The same digital twin can also be used to train the AI.

"After about 72 hours of training with the digital twin (on a standard computer; about 24 hours on computer clusters in the cloud), the AI is ready to control the real machine. That's definitely much faster than humans developing these control algorithms," Bischoff says. Using reinforcement learning, the AI has developed a solution strategy in which all the chocolate bars on the front conveyor belts are transported onward as quickly as possible and the exact speed is controlled only on the last conveyor belt. Interestingly, this strategy is quite different from that of a conventional control system.
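To make the shape of that strategy concrete, here is a hand-written sketch in Python. It is not the trained controller, which is a learned model rather than a formula; the speed limits, the function name and the inputs (distance to the hand-off point and time until the next slot arrives) are hypothetical values chosen only for illustration.

```python
V_MAX = 1.0   # hypothetical maximum belt speed in m/s
V_MIN = 0.1   # hypothetical minimum belt speed in m/s

def belt_speeds(distance_to_handoff_m, time_until_slot_s):
    """Return (belt 1, belt 2, belt 3) speeds for the bar on the last belt.

    The two front belts always run at full speed; only the last belt is
    adjusted so that the bar reaches the hand-off point when the slot does.
    """
    if time_until_slot_s <= 0:
        raise ValueError("time_until_slot_s must be positive")
    required = distance_to_handoff_m / time_until_slot_s
    last = max(V_MIN, min(V_MAX, required))   # respect the belt's speed limits
    return V_MAX, V_MAX, last
```

In this simplified picture, the front belts act purely as fast transport and all the precision work is concentrated on the final belt.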

Martin Bischoff (r), Michel Tokic (l) and their team have succeeded in applying AI to control tasks: the AI controllers are trained autonomously on simulation models known as digital twins and then loaded onto Siemens SIMATIC controllers.


From the Lab to Industrial Deployment: The AI Motion Framework

The researchers led by Martin Bischoff were able to make their approach even more practical by compressing and compiling the trained control models so that they run cycle-synchronously in real time on Siemens SIMATIC controllers. Thomas Menzel, who is responsible for the Digital Machines and Innovation department within the Production Machines business segment, sees great potential in letting AI learn complex control tasks independently on the digital twin: "Under the name AI Motion Trainer, this method is now helping several co-creation partners to develop application-specific optimized controls in a much shorter time. Production machines are no longer limited to tasks for which a PLC control program has already been developed, but can perform any task that an AI can learn. The integration with our SIMATIC portfolio makes this technology particularly well suited to industrial use."


The content & opinions in this article are the author’s and do not necessarily represent the views of RoboticsTomorrow
