For the first time, an AI robot has demonstrated a physical skill superior to that of humans. And while the skill (navigating a marble through a labyrinth) is a simple one, the ability hints at broader implications for the blending of AI and physical prowess. And if every labyrinth has its minotaur, the one here is that an AI robot will “cheat” if the operating rules are vague or imprecise.
The breakthrough comes from researchers at the Institute for Dynamic Systems and Control at ETH Zurich in Switzerland, who created an AI robot called CyberRunner and tasked it with learning to play a popular labyrinth marble game. The goal is to steer the marble to the end point by tilting the board, without letting it fall into any of the holes along the way. CyberRunner controls the movement of the marble with two motorized knobs that change the orientation of the board, just as a player’s hands would. While simple, the game requires fine motor skills and spatial reasoning.
“This is much less about memorization and solving a maze than being able to navigate the labyrinth without falling into the holes,” explains lead researcher professor Raffaello D’Andrea of ETH Zurich. D’Andrea is a founder of Kiva Systems (now operating as Amazon Robotics), the robotics and AI fund ROBO Global, and autonomous indoor flight leader Verity. “It’s about fine motor skills.”
CyberRunner didn’t master the labyrinth immediately. Like humans, it required training and learned from experience. While playing the game, CyberRunner captured observations through the “eyes” of a camera aimed down at the labyrinth and received rewards based on its performance. It kept a memory of the collected experience and used it to train a model-based reinforcement learning algorithm that develops strategies and behaviors. The algorithm runs concurrently with the robot playing the game, so CyberRunner gets better run after run.
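The article doesn’t include the researchers’ code, but the pipeline described above (camera observations, a memory of collected experience, and a model-based reinforcement learning algorithm trained concurrently with play) follows a standard pattern. Below is a minimal, purely illustrative Python sketch of that pattern; every class and method name is a hypothetical placeholder, not anything from the actual CyberRunner project.

```python
# Illustrative sketch of a learn-while-playing, model-based RL loop.
# All names (LabyrinthEnv, Agent, etc.) are hypothetical placeholders.

import random
import threading
import time
from collections import deque

class LabyrinthEnv:
    """Stand-in for the physical board: camera observation in, knob tilts out."""
    def reset(self):
        return [0.0, 0.0]                              # marble (x, y) estimated from the camera
    def step(self, action):
        obs = [random.random(), random.random()]       # next camera-based state estimate
        reward = random.random()                       # progress toward the end point
        done = random.random() < 0.05                  # marble fell into a hole or reached the goal
        return obs, reward, done

class Agent:
    """Stand-in for a model-based RL agent: fits a model of the board dynamics
    from replayed experience and derives a control policy from it."""
    def act(self, obs):
        return [random.uniform(-1, 1), random.uniform(-1, 1)]  # commands for the two knobs
    def train_step(self, batch):
        pass                                           # model fitting + policy improvement (omitted)

replay = deque(maxlen=100_000)                         # memory of collected experience
agent, env = Agent(), LabyrinthEnv()

def learner():
    # Runs concurrently with play, so the policy improves run after run.
    while True:
        if len(replay) >= 64:
            agent.train_step(random.sample(replay, 64))
        time.sleep(0.01)

threading.Thread(target=learner, daemon=True).start()

for episode in range(3):                               # a few illustrative runs
    obs, done = env.reset(), False
    while not done:
        action = agent.act(obs)                        # tilt the board via the two motorized knobs
        next_obs, reward, done = env.step(action)      # observe the outcome through the camera
        replay.append((obs, action, reward, next_obs, done))
        obs = next_obs
```

Running the learner in parallel with data collection is what allows the policy to improve run after run, rather than pausing for a separate training phase between games.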
The bottom line: After 6.06 hours of training, CyberRunner beat the best time ever recorded by a very skilled human player by 6%.
The additional bottom line: An AI robot will cheat if it can get away with it, displaying a very human characteristic. During the learning process, CyberRunner discovered shortcuts that allowed it to skip parts of the maze. Although the intended path through the labyrinth was marked as a thick line, CyberRunner followed its basic commandment: navigate the maze as quickly as possible. It was never instructed not to cheat; the researchers added extra rules that penalized the shortcuts only after the fact. It’s unclear whether an AI robot can “remember” that it cheated and decide to cheat again if the results warranted it.
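The researchers’ exact anti-cheating rules aren’t detailed here, but a common way to discourage such shortcuts is reward shaping: subtract a penalty whenever the marble strays from the marked route. The snippet below is a generic sketch of that idea, with the waypoint list, distance threshold, and penalty weight all assumed for illustration rather than taken from the actual project.

```python
# Illustrative only: one way to penalize shortcut-taking, assuming the marked
# path is known as an ordered list of (x, y) waypoints.

def shaped_reward(progress_gain, marble_xy, path_waypoints, off_path_penalty=1.0):
    """Reward forward progress, but subtract a penalty when the marble strays
    from the marked route (i.e., when it tries to skip parts of the maze)."""
    dist_to_path = min(
        ((marble_xy[0] - wx) ** 2 + (marble_xy[1] - wy) ** 2) ** 0.5
        for wx, wy in path_waypoints
    )
    penalty = off_path_penalty if dist_to_path > 0.02 else 0.0   # tolerance value assumed
    return progress_gain - penalty
```

If the penalty outweighs the time saved by skipping part of the maze, a learner maximizing total reward no longer has an incentive to take the shortcut.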
One important question going forward is whether the AI robot’s skills on one labyrinth will apply to another one, an area of AI research called transfer learning. “We have not tried it, but most probably a lot,” D’Andrea told Techstrong.ai. “We know this because the information CyberRunner learns on one part of the labyrinth helps it in other parts.”
Another “very deep question” that is a very active area of research is whether the skills learned by an AI robot like CyberRunner are applicable to a different task that is similar in nature but not identical. “Low level tasks are usually transferable and, in fact, this is one of the reasons why deep learning has been so successful,” says D’Andrea. “The different levels encode different hierarchical levels. But we have not tried it on another activity. This is really a very dedicated robot. Not a general purpose AI, by any means!”
D’Andrea and fellow lead researcher, PhD candidate Thomas Bi, are making CyberRunner an open source project, available at www.CyberRunner.com. The pair hopes it will inspire “Citizen Science”-style AI research.
“Now, for less than $200, anyone can engage in cutting-edge AI research,” says D’Andrea. “Furthermore, once thousands of CyberRunners are out in the real-world, it will be possible to engage in large-scale experiments, where learning happens in parallel, on a global scale. The ultimate in Citizen Science!”
There’s going to be a run on marbles if D’Andrea has his way.