Artificial Intelligence Models Predict How the Brain Processes Language


These artificial intelligence models can not only predict the word that comes next in a text, but also perform tasks that seem to require some degree of genuine understanding, such as question answering, document summarization, and story completion.

Computer models that perform well on other types of language tasks do not show the same resemblance to human brain activity, offering evidence that the human brain may use next-word prediction to drive language processing.


“The better the model is at predicting the next word, the more closely it fits the human brain,” says Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience and a member of MIT’s McGovern Institute for Brain Research.

The new, high-performing next-word prediction models belong to a class of models called deep neural networks. These networks contain computational “nodes” that form connections of varying strength, and layers that pass information between each other in prescribed ways.
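To make the idea concrete, here is a minimal sketch of such a network in Python. It is illustrative only: the "nodes" are entries of each layer's output vector, and the "connections of varying strength" are weight matrices linking one layer to the next. The models in the study are far larger transformer networks.

```python
import numpy as np

# Toy deep network, for illustration only: each layer's output vector
# holds the activity of its "nodes", and the weight matrix holds the
# strengths of the connections from the previous layer.

rng = np.random.default_rng(0)

def layer(x, n_out):
    """One fully connected layer: weighted connections plus a nonlinearity."""
    w = rng.normal(scale=0.1, size=(x.shape[-1], n_out))  # connection strengths
    return np.maximum(x @ w, 0.0)                         # each node's activity

x = rng.normal(size=(1, 8))     # an 8-dimensional input
h1 = layer(x, 16)               # first hidden layer: 16 nodes
h2 = layer(h1, 16)              # second hidden layer
out = layer(h2, 1)              # single output node
print(out.shape)                # (1, 1)
```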

In the new study, the MIT team compared language-processing centers in the human brain directly with artificial language-processing models.

The researchers analyzed 43 different language models, including several that are optimized for next-word prediction. These include a model called GPT-3 (Generative Pre-trained Transformer 3), which, given a prompt, can generate text similar to what a human would produce.

Other models were designed to perform different language tasks, such as filling in a blank in a sentence.
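The difference between the two task types can be illustrated with openly available stand-ins. GPT-3 itself is not publicly downloadable, so this hedged sketch uses GPT-2 for next-word prediction and BERT for fill-in-the-blank prediction via the Hugging Face Transformers library; these stand-ins are assumptions, not necessarily the exact models analyzed in the study.

```python
from transformers import pipeline

# Next-word prediction: continue a prompt strictly left to right.
# GPT-2 stands in here for models like GPT-3.
generator = pipeline("text-generation", model="gpt2")
print(generator("The cat sat on the", max_new_tokens=5)[0]["generated_text"])

# Fill-in-the-blank: predict a masked word using context on both sides.
# BERT stands in for this class of models.
filler = pipeline("fill-mask", model="bert-base-uncased")
print(filler("The cat sat on the [MASK].")[0]["token_str"])
```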

As each model was presented with a string of words, the researchers measured the activity of the nodes that make up the network.
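In practice, "measuring the activity of the nodes" amounts to recording each layer's hidden states while the model reads the words. Below is a rough sketch of that readout, using GPT-2 as an openly available stand-in; the study's exact extraction pipeline may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Present a string of words and record every layer's hidden states.
# GPT-2 is an assumption here, chosen because its weights are public.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

inputs = tokenizer("After a long day, she finally sat down to", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One activation tensor per layer: (batch, tokens, hidden units).
for i, h in enumerate(outputs.hidden_states):
    print(f"layer {i}: {tuple(h.shape)}")
```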

They then compared these patterns to activity in the human brain, measured in subjects performing three language tasks: listening to stories, reading sentences one at a time, and reading sentences in which one word is revealed at a time.

These human datasets included functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic measurements taken in people undergoing brain surgery for epilepsy.

They found that the best-performing next-word prediction models had activity patterns that very closely resembled those seen in the human brain.
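One common way to quantify such resemblance, sketched below with synthetic data, is to fit a regularized linear map from model activations to recorded brain responses and then correlate predictions with held-out data. This is a simplified sketch of the general "brain score" idea, not the study's exact scoring procedure.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for model activations and brain recordings;
# the study used real fMRI/ECoG responses to the same stimuli.
rng = np.random.default_rng(0)
n_stimuli, n_units, n_voxels = 200, 64, 10

model_acts = rng.normal(size=(n_stimuli, n_units))
true_map = rng.normal(size=(n_units, n_voxels))
brain_resp = model_acts @ true_map + rng.normal(size=(n_stimuli, n_voxels))

# Fit a regularized linear map on training stimuli, test on held-out ones.
X_tr, X_te, y_tr, y_te = train_test_split(model_acts, brain_resp, random_state=0)
pred = Ridge(alpha=1.0).fit(X_tr, y_tr).predict(X_te)

# Mean correlation between predicted and actual held-out responses.
score = np.mean([np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(n_voxels)])
print(f"mean held-out correlation: {score:.2f}")
```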

Activity in those same models was also highly correlated with human behavioral measures, such as how quickly people were able to read the text.
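One established way to relate model predictions to reading speed is through surprisal: words a model finds improbable tend to be read more slowly. Below is a hedged sketch of that computation with GPT-2 as a stand-in; it is an assumption, not the study's exact behavioral analysis.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# A word's surprisal is -log p(word | preceding words); higher surprisal
# tends to go with slower reading. GPT-2 is used because it is public.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
lm.eval()

ids = tok("The children went outside to play", return_tensors="pt").input_ids
with torch.no_grad():
    logits = lm(ids).logits                      # (1, tokens, vocabulary)

log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
targets = ids[0, 1:]                             # each token given its context
surprisal = -log_probs[torch.arange(targets.numel()), targets]

for token, s in zip(tok.convert_ids_to_tokens(targets.tolist()), surprisal):
    print(f"{token:>10s}  {s.item():5.2f}")
```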

One of the key computational features of predictive models such as GPT-3 is an element known as a forward one-way predictive transformer.

This kind of transformer makes predictions about what is going to come next based only on the preceding sequence.
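The "one-way" restriction is typically implemented as a causal mask in the attention computation, so that each position can draw only on earlier positions. Here is a minimal NumPy sketch of that masking; it illustrates the general technique, not GPT-3's internals.

```python
import numpy as np

def causal_attention(q, k, v):
    """Scaled dot-product attention restricted to previous positions."""
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    mask = np.triu(np.ones((t, t), dtype=bool), 1)  # True above the diagonal
    scores[mask] = -np.inf                          # hide all future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the past
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))        # 5 tokens, 8-dimensional each
out = causal_attention(x, x, x)    # each position attends only backward
print(out.shape)                   # (5, 8)
```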

Scientists have not yet identified any brain circuits or learning mechanisms that correspond to this type of processing. However, the new findings are consistent with previously proposed hypotheses that prediction is one of the key functions of language processing.

The researchers now plan to build variants of these language-processing models to see how small changes in their architecture affect their performance and their ability to fit human neural data.

They also plan to combine these high-performing language models with previously developed computer models that can perform other kinds of tasks, such as constructing perceptual representations of the physical world.

Remarkably, the models fit so well that the results indirectly suggest the human language system itself may be predicting what is going to happen next.

Source: Medindia


