MIT teaching machines to reason about what they see
Deep learning systems interpret the world by picking out statistical patterns in data. This form of machine learning is now everywhere, automatically tagging friends on Facebook, narrating Alexa’s latest weather forecast and delivering fun facts via Google search. But statistical learning has its limits. It requires tons of data, has trouble explaining its decisions and is terrible at applying past knowledge to new situations. For example, a child who has never seen a pink elephant can still describe one; a computer cannot, because it has no way to make sense of an elephant that’s pink instead of grey.
“The computer learns from data,” said Jiajun Wu, a PhD student at MIT. “The ability to generalise and recognise something you’ve never seen before — a pink elephant — is very hard for machines.”
To give computers the ability to reason more like us, artificial intelligence (AI) researchers are returning to abstract, or symbolic, programming. Popular in the 1950s and 1960s, symbolic AI wires in the rules and logic that allow machines to make comparisons and interpret how objects and entities relate. Symbolic AI uses less data, records the chain of steps it takes to reach a decision and, when combined with the brute processing power of statistical neural networks, can even beat humans in a complicated image comprehension test.
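As a rough illustration of that idea, the sketch below (hypothetical code, not from the study) shows how a symbolic layer can compose independently learned concepts, which is how such a system could describe a pink elephant without ever having seen one:

```python
# Hypothetical sketch: a symbolic layer composing learned concepts.
# Each concept (a colour, a kind of animal) is represented explicitly,
# so novel combinations never seen in training remain expressible.

KNOWN_COLOURS = {"grey", "pink", "blue"}
KNOWN_ANIMALS = {"elephant", "giraffe", "zebra"}

def describe(colour: str, animal: str) -> str:
    """Compose two known concepts into a description, even if this
    particular combination never appeared in any training data."""
    if colour not in KNOWN_COLOURS:
        raise ValueError(f"unknown colour: {colour}")
    if animal not in KNOWN_ANIMALS:
        raise ValueError(f"unknown animal: {animal}")
    return f"a {colour} {animal}"

# A purely statistical classifier trained only on grey elephants has
# no class for this, but the symbolic composition handles it directly.
print(describe("pink", "elephant"))  # -> "a pink elephant"
```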
A new study by a team of researchers at MIT, MIT-IBM Watson AI Lab and DeepMind shows the promise of merging statistical and symbolic AI. Led by Wu and Joshua Tenenbaum, a professor in MIT’s Department of Brain and Cognitive Sciences and the Computer Science and Artificial Intelligence Laboratory, the team shows that its hybrid model can learn object-related concepts like colour and shape, and leverage that knowledge to interpret complex object relationships in a scene. With minimal training data and no explicit programming, their model could transfer concepts to larger scenes and answer increasingly tricky questions as well as or better than its state-of-the-art peers.
“One way children learn concepts is by connecting words with images,” said the study’s lead author Jiayuan Mao, an undergraduate at Tsinghua University who worked on the project as a visiting fellow at MIT. “A machine that can learn the same way needs much less data, and is better able to transfer its knowledge to new scenarios.”
“The trick, it turns out, is to add more symbolic structure, and to feed the neural networks a representation of the world that’s divided into objects and properties rather than feeding them raw images,” said Jacob Andreas, another researcher. “This work gives us insight into what machines need to understand before language learning is possible.”
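To make that concrete, here is a minimal sketch (an illustrative assumption, not the team’s published model) of the kind of pipeline the quote describes: a perception network, faked here with hand-written output, emits an object-and-property representation of a scene, and a symbolic program answers a relational question over it:

```python
# Hypothetical sketch of a neuro-symbolic pipeline: a perception
# module (faked here) outputs objects with properties; a symbolic
# executor answers questions by filtering and relating those objects.

# The kind of output a perception network might produce from a raw image.
scene = [
    {"id": 0, "shape": "cube",   "colour": "red",  "x": 1.0},
    {"id": 1, "shape": "sphere", "colour": "blue", "x": 3.0},
    {"id": 2, "shape": "cube",   "colour": "blue", "x": 5.0},
]

def filter_objects(objs, **properties):
    """Symbolic 'filter' operator: keep objects matching all properties."""
    return [o for o in objs if all(o.get(k) == v for k, v in properties.items())]

def left_of(objs, anchor):
    """Symbolic relation: objects positioned left of the anchor object."""
    return [o for o in objs if o["x"] < anchor["x"]]

# "What colour is the cube to the left of the sphere?"
sphere = filter_objects(scene, shape="sphere")[0]
cubes_left = filter_objects(left_of(scene, sphere), shape="cube")
print(cubes_left[0]["colour"])  # -> "red"
```

Because the answer is produced by an explicit sequence of filter and relate operations, the chain of reasoning can be inspected step by step, which is the transparency advantage symbolic AI offers over a purely statistical model.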
The MIT-IBM team is now working to improve the model’s performance on real-world photos and to extend it to video understanding and robotic manipulation.