Even before they speak their first words, human babies develop mental models about objects and people. This is one of the key capabilities that allows us humans to learn to live socially and cooperate (or compete) with each other. But for artificial intelligence, even the most basic behavioral reasoning tasks remain a challenge.
Advanced deep learning models can perform complicated tasks such as detecting people and objects in images, sometimes even better than humans. But they struggle to move beyond the visual features of images and make inferences about what other agents are doing or wish to accomplish.
To help fill this gap, scientists at IBM, the Massachusetts Institute of Technology, and Harvard University have developed a series of tests to evaluate the capacity of AI models to reason like children do, by observing and making sense of the world.
“Like human infants, it is critical for machine agents to develop an adequate capacity of understanding human minds, in order to successfully engage in social interactions,” the AI researchers write in a new paper that introduces the dataset, called AGENT.