AI and machine learning that understands language

U.S. military researchers want to make English text and speech readily understandable to computers by creating an artificial intelligence (AI) prototype that can learn language in much the same way as a young child does -- from visual and auditory cues.

The idea is to help computers make efficient use of written and spoken language so that military commanders and intelligence analysts can make quick decisions based on the resources they have at hand.

Officials of the U.S. Defense Advanced Research Projects Agency (DARPA) in Arlington, Va., issued a solicitation last week (DARPA-PA-18-02-06) for the Grounded Artificial Intelligence Language Acquisition (GAILA) project.

The goal is to develop a model for grounded language acquisition and an automated language-acquisition prototype that learns to understand English text and speech, making that information more usable by automated analytics.

Children acquire language based on their perceptions of sounds and images, researchers explain; they link moving images with corresponding sounds, using only a tiny fraction of the examples that AI and machine learning systems require.

Sequencing information, variations in word forms, and other cues help children make ever-finer classifications of what they learn. If AI computers could learn language the way children do, they could improve vastly at machine translation, information retrieval, named entity detection, event detection, and knowledge base creation.

The accuracy of AI and machine learning computers today suffers from their need for large amounts of annotated data for training. Machine learning technology, moreover, is incapable of dealing with new data sources, topics, media, and vocabularies.

Instead, DARPA researchers want industry to develop a computer prototype that starts with no language skills, and learns to associate text and speech with live scenes, images, or video. It will use logic, heuristics, and inference to describe previously unseen objects, relations, and events.

For example, after seeing a black table, a white table, and a black chair, the prototype should be able to learn enough about the meanings of the words “black,” “white,” “table,” and “chair” to recognize and describe a white chair. As the computer evolves, it should be able to describe increasingly complex events and relations.
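As a rough illustration of that kind of compositional generalization, here is a minimal toy sketch in Python. The symbolic visual features, the co-occurrence heuristic, and names like describe and score are assumptions made for this example only; they are not part of GAILA's actual design.

```python
from collections import defaultdict

# Training scenes from the article's example: a black table, a white table, and a black chair.
# Each scene is a dict of symbolic visual features paired with its English description.
observations = [
    ({"color": "black", "shape": "flat_top_four_legs"}, "black table"),
    ({"color": "white", "shape": "flat_top_four_legs"}, "white table"),
    ({"color": "black", "shape": "seat_with_back"}, "black chair"),
]

# Count how often each word co-occurs with each visual feature value.
word_counts = defaultdict(lambda: defaultdict(int))
for scene, description in observations:
    for word in description.split():
        for value in scene.values():
            word_counts[word][value] += 1

def describe(scene):
    """Describe a (possibly unseen) scene by choosing, for each feature value,
    the word most strongly associated with that value during training."""
    words = []
    for value in scene.values():
        def score(word):
            total = sum(word_counts[word].values())
            return word_counts[word][value] / total
        words.append(max(word_counts, key=score))
    return " ".join(words)

# A white chair was never seen in training, but its description composes
# from the separately learned meanings of "white" and "chair".
print(describe({"color": "white", "shape": "seat_with_back"}))  # prints "white chair"
```

The point of the toy is that the meanings of "white" and "chair," learned from separate examples, compose to describe an object never seen in training; GAILA aims for the same effect at far greater scale, using logic, heuristics, and inference rather than a single co-occurrence table.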

GAILA is fundamentally different from previous semantics and language-learning efforts because it will use visual cues to describe what it experiences before, during, and after an event.

In the learning stage, the prototype must be able to accept text and speech and build internal language representations. GAILA software must be able to accept images, video, or virtual visual scenes depicting previously unseen things, and produce English descriptions that capture their salient elements.

Interested companies should upload eight-page proposals no later than 26 April 2019 to the DARPA BAA website at https://baa.darpa.mil.