For the human brain, interpreting a simple request or a short sentence, making the connection and initiating an appropriate reaction is easy. For a machine, this is much more complicated: controlling technical devices by speech requires many individual steps.

Talking to Machines - New Operating Concepts With Artificial Intelligence

Article from FESTO

 

Detecting and interpreting speech

"Give me a pen!" – this may be a very simple command, but it makes the computer work hard in the background. First, the spoken sentence is turned into text. To identify the words by their frequency patterns, the speech recognition software must overcome many challenges: unclear pronunciation, similar-sounding words with different meanings, and different intonations or dialects. The software works out what the words are by comparing them with extensive databases that store countless examples of words and their frequency patterns.

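The comparison against stored examples can be illustrated with a toy sketch. It assumes each spoken word has already been reduced to a short list of numbers (its frequency pattern) and picks the stored word whose examples are most similar. The names `REFERENCE_PATTERNS`, `cosine_similarity` and `recognise`, and all the numbers, are invented for illustration; real recognisers typically rely on statistical or neural models rather than a direct lookup like this.

```python
import math

# Hypothetical reference database: word -> example frequency patterns.
# The vectors are made-up toy values, not real acoustic features.
REFERENCE_PATTERNS = {
    "pen": [[0.9, 0.1, 0.3], [0.8, 0.2, 0.4]],
    "pan": [[0.2, 0.9, 0.3], [0.3, 0.8, 0.2]],
}


def cosine_similarity(a, b):
    """Similarity of two patterns: 1.0 means they point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognise(pattern):
    """Return the stored word whose best-matching example is closest to pattern."""
    def best_match(word):
        return max(cosine_similarity(pattern, ex) for ex in REFERENCE_PATTERNS[word])

    return max(REFERENCE_PATTERNS, key=best_match)
```

Storing several examples per word is what lets a lookup of this kind tolerate different speakers or dialects: a new utterance only needs to be close to one of them.
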
The next step is working out the meaning of the sentence. To do this, the software sends the text to a language interface that checks it for certain keywords. Beforehand, the programmer must determine all the necessary terms and commands – called intents – as well as their synonyms, and define which action lies behind each of them. For example, ‘give’ is identified as the request to transport an object to a particular place, whilst the word ‘me’ is understood to be a person or the target of the action.
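
The keyword check described above might look roughly like the following minimal sketch. The intent name `transport_object`, the synonym lists and the function `parse_command` are assumptions made for illustration; the article does not name a specific language interface or its API.

```python
# Hypothetical tables a programmer would define beforehand.
INTENTS = {
    # intent name -> trigger words and their synonyms
    "transport_object": {"give", "hand", "bring", "pass"},
}
RECIPIENTS = {"me": "speaker"}     # words that name the target of the action
OBJECTS = {"pen", "pencil", "cup"}  # objects the device knows about


def parse_command(text):
    """Scan the recognised text for known keywords and return what was found."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    intent = next(
        (name for name, synonyms in INTENTS.items()
         if any(w in synonyms for w in words)),
        None,
    )
    obj = next((w for w in words if w in OBJECTS), None)
    recipient = next((RECIPIENTS[w] for w in words if w in RECIPIENTS), None)
    return {"intent": intent, "object": obj, "recipient": recipient}
```

Calling `parse_command("Give me a pen!")` yields the intent `transport_object`, the object `pen` and the recipient `speaker`. A sentence with no known keyword returns `None` for the intent, which the caller would have to handle, for example by asking the user to repeat the request.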

 

Artificial intelligence finds the optimal solution

Once the interface has identified the meaning of the sentence, it supplies a context object: a piece of software code that the device control system can work with. To turn this into a clear instruction for the machine, the artificial intelligence now gets to work in a separate piece of software. This software evaluates the content of the context object and at the same time collects information from various sensors about the position of the device and its surroundings. It contains modules for different solutions, each assigned to certain actions.
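
As a rough sketch of this step, the context object can be pictured as a small data structure and the solution modules as functions looked up by action. The `Context` fields, the layout of the `sensors` dictionary and the names `plan_transport`, `MODULES` and `build_command` are all hypothetical; the article does not describe the actual data format.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical context object produced by the language interface."""
    intent: str
    obj: str
    recipient: str


def plan_transport(ctx, sensors):
    """Solution module: move an object from where it was seen to the recipient."""
    return {
        "action": "pick_and_place",
        "pick_at": sensors["objects"][ctx.obj],
        "place_at": sensors["people"][ctx.recipient],
    }


# Each action (intent) is assigned a solution module.
MODULES = {"transport_object": plan_transport}


def build_command(ctx, sensors):
    """Combine the context object with sensor data into a device command."""
    module = MODULES.get(ctx.intent)
    if module is None:
        raise ValueError(f"no solution module for intent {ctx.intent!r}")
    return module(ctx, sensors)
```

Keeping the modules in a lookup table means a new action can be supported by adding one function and one table entry, without touching the dispatch logic.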

The program uses all this information to construct a command, for example how and where a gripper arm should move, and sends it to the device controller. In the example, the sensors detect where the pen is on the desk, and the software determines which path the machine must take to pick it up and hand it to the person. The software gradually learns which solution is best for each action and applies this knowledge to the next action.
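
The article does not say which learning method is used. As one possible illustration only, the sketch below keeps a running average of how well each solution worked for each action and prefers the best one next time. The class `SolutionSelector`, the solution names and the reward values are invented for this example.

```python
class SolutionSelector:
    """Tracks the average outcome of each solution per action (illustrative only)."""

    def __init__(self):
        # (action, solution) -> [number of attempts, mean reward]
        self.stats = {}

    def record(self, action, solution, reward):
        """Update the running mean reward after a solution has been tried."""
        entry = self.stats.setdefault((action, solution), [0, 0.0])
        entry[0] += 1
        entry[1] += (reward - entry[1]) / entry[0]

    def best(self, action, candidates):
        """Pick the candidate with the highest mean reward so far.

        Untried candidates count as 0.0; ties go to the earliest candidate.
        """
        return max(
            candidates,
            key=lambda s: self.stats.get((action, s), [0, 0.0])[1],
        )
```

For instance, if `top_grasp` succeeded once and failed once for a transport action while `side_grasp` succeeded on its only attempt, `best` would return `side_grasp`. A practical system would also need to keep trying less-used solutions occasionally, which this sketch leaves out.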

All these complex sequences must be completed in fractions of a second, because the person expects a prompt and, above all, correct reaction from the machine. Although voice recognition works relatively well after 30 years of application, there is still plenty of research and development going on behind the voice control of machines – until, at some point, we will be able to talk to a machine as naturally as to our neighbour.

 

 

The content & opinions in this article are the author’s and do not necessarily represent the views of RoboticsTomorrow
