Interview with Massimo Stella: the Cognitive Biases of Artificial Intelligences

Massimo Stella, a professor at DiPSCo, studies the effects (both negative and positive) of GPT language models on our minds.

In recent months, the term "GPT" has been on almost everyone's lips. But how many of us know what happens behind the scenes? What can be the consequences of using these artificial intelligences? Can they in some way influence our society and our culture? We discussed this with Prof. Massimo Stella, a professor at DiPSCo.

Prof. Stella, you recently published a letter in PNAS that discusses the biases of GPT linguistic models. Can you tell us about the idea behind this letter?

The basic concept of the letter is very simple: the theories of cognitive science developed over the last 60 years are all based on experiments conducted on human beings; with the arrival of language models like GPT-4, however, all our knowledge in this area is being upended.

The cognitive architecture of these systems is, in fact, completely different from our own. Language models are essentially neural networks trained on enormous amounts of text and further refined through reinforcement learning.

With GPT language models increasingly used as a source of information and as a means of communication, it is therefore essential to start focusing not only on human biases but also on non-human ones, that is, the biases characteristic of these models.

What are the main biases of GPT language models?

Currently, we are aware of two main biases in GPT language models: myopic overconfidence and hallucinations.

Myopic overconfidence is the excessive confidence these models place in their own statements.

Hallucinations, on the other hand, occur when language models are asked to produce content on topics they know little about. In these cases they interpolate from the scant information they have, with results that resemble the dystopian novels of Philip K. Dick. Very often, for example, a hallucinating GPT model produces paragraphs in which the same set of words is repeated over and over.

Another problem with these systems is their inability to filter out false information. Unlike humans, GPT language models cannot assess the source of a piece of information or how reliable that source is. This means that if low-quality information is fed into the system, reinforcement learning causes that false information to be over-sampled, and the model starts reproducing it as if it were true.

With your laboratory, CogNosco, you won a grant to study these biases. What does this entail?

My colleague Prof. Giuseppe Alessandro Veltri and I won the University of Trento's internal grant "Call for Research 2023". With our project, we will investigate how knowledge is structured within GPT language models, to understand where and how the information they produce may be distorted and then potentially transmitted to humans.

It will be particularly interesting to collect and analyze these data because GPT models were niche systems until just a few years ago; therefore, there are currently no studies regarding the transmission of ideas from GPT to human beings.

What consequences could the potentially distorted information produced by these artificial intelligences have?

Stereotypes, hate speech, and biases can be learned, reproduced, and amplified with great ease by these systems. These language models reach millions of users around the world who, unaware of how GPT models work and of their possible biases, could absorb these negative perceptions without even realizing it.

Can you give us an example of what you mean by "constructing distorted knowledge"?

Take the example of mathematics: it is a discipline like any other, neither positive nor negative in itself. Yet there is a very common stereotype, widespread among students as well, that casts mathematics as something negative.

In research we conducted with Giulio Rossetti, Salvatore Citraro, Luigi Lombardi, and Katherine Abramski, a doctoral student in the "AI for Society" program in Pisa, we found that even GPT-3, GPT-3.5, and GPT-4 produce negative, stereotyped associations with mathematics, much as high school students do.

How can the data resulting from mapping the biases of these artificial intelligences be used?

This is precisely the second objective of our project. Once we have mapped the biases of these GPT systems, we want to understand the influence those biases can have on human minds.

For example, it is not a given that language models have only negative effects: they could also have a beneficial influence on people. Unfortunately, at the moment we are dealing with rather inconsistent models that readily adapt to what people want to hear.

We therefore want to understand which groups of people are most exposed to the biases of these language models, both negatively and positively: who is most at risk of being contaminated by distorted knowledge? And who is best placed to benefit from conversations with GPT systems?

Further Information

The CogNosco Laboratory is part of the Department of Psychology and Cognitive Science at the University of Trento and is directed by Professors Massimo Stella and Luigi Lombardi. For more information: CogNosco Lab