Last year, DeepMind researchers wrote that future AI developers may spend less time programming algorithms and more time generating rich virtual worlds in which to train them.
A new paper released this week on the preprint server arXiv suggests they're taking the latter part of that prediction very seriously.
The paper’s authors said they’ve created an endlessly challenging virtual playground for AI. The world, called XLand, is a vibrant video game managed by an AI overlord and populated by algorithms that must learn the skills to navigate it.
The game-managing AI keeps an eye on what the game-playing algorithms are learning and automatically generates new worlds, games, and tasks to continuously confront them with new experiences.
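This adaptive loop can be sketched in miniature. The toy code below is purely illustrative and hypothetical: the task generator, the stand-in "agent," and the difficulty-adjustment rule are all inventions for this sketch, not DeepMind's actual method, which involves deep reinforcement learning agents and procedurally generated 3D worlds.

```python
import random

# Hypothetical sketch of a "game-manager" curriculum loop in the spirit
# of XLand: watch how the agent performs, then ratchet task difficulty
# up or down in response. Every name and rule here is made up.

def make_task(difficulty):
    """Generate a toy task: hit a target number within a growing range."""
    return {"target": random.randint(0, difficulty), "range": difficulty}

def play(task, skill):
    """Stand-in 'agent': succeeds more often as skill catches up to the task."""
    guess = random.randint(0, task["range"])
    return abs(guess - task["target"]) <= skill

def curriculum(rounds=1000, seed=0):
    random.seed(seed)
    skill, difficulty, wins = 1, 2, 0
    for _ in range(rounds):
        task = make_task(difficulty)
        if play(task, skill):
            wins += 1
            difficulty += 1                       # agent is coping: harder tasks
            skill += 1                            # toy proxy for learning
        else:
            difficulty = max(2, difficulty - 1)   # too hard: ease off
    return wins, difficulty

wins, final_difficulty = curriculum()
print(wins, final_difficulty)
```

The point of the loop is the feedback: task difficulty is never fixed in advance but tracks the learner's performance, so the agent is always confronted with challenges near the edge of its ability.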
The team said some veteran algorithms faced 3.4 million unique tasks while playing around 700,000 games in 4,000 XLand worlds. But most notably, they developed a general skillset not related to any one game, but useful in all of them.
These skills included experimentation, simple tool use, and cooperation with other players. General skills in hand, the algorithms performed well when confronted with new games, including more complex ones, such as capture the flag, hide and seek, and tag.
This, the authors say, is a step towards solving a major challenge in deep learning. Most algorithms trained to accomplish a specific task—like, in DeepMind's case, to win at games such as Go or StarCraft—are savants. They're superhuman at the one task they know and useless at the rest. They can defeat world champions at Go or chess, but have to be retrained from scratch to do anything else.
By presenting deep reinforcement learning algorithms with an open-ended, always-shifting world to learn from, DeepMind says its algorithms are beginning to demonstrate "zero-shot" learning on never-before-seen tasks. That is, they can perform novel tasks at a decent level without any retraining.