Language models like ChatGPT are not neutral. Without our realising it, they can absorb all kinds of bias – for example around gender and ethnicity – which then become increasingly embedded in the model. According to AI researcher Oskar van der Wal, we need different kinds of measurements to detect these biases so that they can be removed from the models. In his doctoral thesis, he shows how this can be done. On 29 April, he will defend his thesis at the University of Amsterdam.

Language models are often seen as neutral tools, but in practice they can both reflect and amplify bias.

‘Users often don’t realise that a model makes certain assumptions, for example by introducing subtle differences in how men and women are described,’ says Van der Wal. Precisely because bias is so hidden, it can spread unnoticed and colour the way we see the world.

Bias is hard to measure

An important problem is that bias is difficult to measure. ‘Many existing measurement methods are fairly abstract and don’t take practice into account. They might look for overt stereotypes in what the model says, such as “The Dutch are stingy.” But in practice, bias isn’t something that’s directly visible. It depends on the context in which you use the model.’

Van der Wal cites the use of AI in healthcare as an example. ‘AI learns from existing data. If those data contain outdated or incorrect assumptions – for instance, the contested idea that certain diseases are linked to the outdated concept of “race” – the model may keep reproducing them. In healthcare, that can lead to incorrect diagnoses or treatments.’

Another example is when medical data largely derives from research involving men. ‘AI may then interpret women’s symptoms differently or less seriously, or make different risk assessments.’

Realistic scenarios

To discover whether realistic scenarios reveal different errors than simple tests, Van der Wal presented language models with a range of medical cases and asked them to provide diagnoses, risk assessments or advice. ‘We repeatedly changed the patient’s ethnicity. That way we could identify whether and how the model responded differently.’

Subtle but consistent differences appeared in the outcomes, differences that remained invisible in standard tests. ‘Precisely because our scenarios were close to practice, it became clear how bias can influence medical decision-making.’
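The counterfactual setup described above can be sketched in a few lines: the same medical vignette is presented repeatedly, varying only the patient's demographic attribute, and the answers are compared across variants. This is an illustrative sketch, not code from the thesis; `query_model`, the template, and the group labels are all hypothetical stand-ins.

```python
from itertools import combinations

# Hypothetical vignette template; only the {group} slot changes per query.
TEMPLATE = (
    "A 45-year-old {group} patient reports chest pain and shortness of "
    "breath. What is the most likely diagnosis?"
)

GROUPS = ["Dutch", "Surinamese", "Moroccan", "Turkish"]  # example labels


def query_model(prompt: str) -> str:
    """Placeholder for a real language-model call (e.g. an API request)."""
    return "angina pectoris"  # dummy answer so the sketch runs end to end


def counterfactual_answers(template: str, groups: list[str]) -> dict[str, str]:
    """Collect the model's answer for each demographic variant."""
    return {g: query_model(template.format(group=g)) for g in groups}


def divergent_pairs(answers: dict[str, str]) -> list[tuple[str, str]]:
    """Pairs of groups for which the model gave different answers."""
    return [
        (a, b) for a, b in combinations(answers, 2)
        if answers[a] != answers[b]
    ]


answers = counterfactual_answers(TEMPLATE, GROUPS)
print(divergent_pairs(answers))  # empty list = no divergence observed
```

In a real study the comparison would be statistical rather than exact string matching, since model outputs vary in wording; the point here is only the perturb-one-attribute, hold-everything-else-fixed design.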

Model reinforces patterns in the data

Van der Wal also investigated what happens inside a language model during training. He followed, step by step, how the model learns to store information. ‘During training, the model learns which words and ideas frequently occur together. If “doctor” often appears together with “he” and “nurse” with “she” in the training data, the model will pick up on those associations.’

Over time, the model appeared to store this information in increasingly specific places, thereby reinforcing gender bias. ‘Bias doesn’t arise only from the data that AI is trained on, but also from the way the model structures that information.’
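One simple way to make the co-occurrence effect above concrete is to count how often a profession word appears in the same sentence as a gendered pronoun. This toy corpus and scoring function are invented for illustration; they are not the measurement used in the thesis.

```python
# Invented four-sentence corpus exhibiting the doctor/he, nurse/she pattern.
corpus = [
    "the doctor said he would call",
    "the nurse said she was ready",
    "the doctor explained that he was busy",
    "the nurse noted she had finished",
]


def cooccurrence(word: str, marker: str, sentences: list[str]) -> int:
    """Count sentences containing both `word` and `marker` as tokens."""
    return sum(
        1 for s in sentences
        if word in s.split() and marker in s.split()
    )


def gender_skew(word: str, sentences: list[str]) -> int:
    """Positive: co-occurs more with 'he'; negative: more with 'she'."""
    return (cooccurrence(word, "he", sentences)
            - cooccurrence(word, "she", sentences))


print(gender_skew("doctor", corpus))  # 2: only ever appears with 'he'
print(gender_skew("nurse", corpus))   # -2: only ever appears with 'she'
```

A model trained on data with such skews has a statistical incentive to reproduce them, which is why the raw counts matter even before any modelling choices come into play.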

There are solutions

Unfortunately, you can’t fix bias in language models with a single trick. But, according to Van der Wal, targeted interventions can help. ‘If you know where in the model the bias is located, you can address those areas. This already seems to work in specific cases, but more research is needed to extend the approach to more complex forms of bias.’

Van der Wal tested this targeted approach by comparing a model before and after an adjustment in which the model was trained not to adopt identified gender-related biases. He wanted to see if the model responded less differently to men and women after the change, and how well it still performed ordinary tasks, such as generating text.

The bias decreased, while the quality of the model largely remained intact.
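The before/after comparison has a simple shape: evaluate both model versions on a bias metric (lower is better) and a task-quality metric (higher is better), then look at the change in each. The numbers below are placeholders purely to illustrate that shape; they are not results from the thesis.

```python
def metric_changes(before: dict[str, float],
                   after: dict[str, float]) -> dict[str, float]:
    """Change in each evaluation metric after the intervention."""
    return {m: after[m] - before[m] for m in before}


# Hypothetical scores for illustration only.
before = {"bias_score": 0.42, "task_quality": 0.91}
after = {"bias_score": 0.17, "task_quality": 0.89}

delta = metric_changes(before, after)
print(delta)  # large drop in bias_score, small change in task_quality
```

The desired pattern, as in Van der Wal's finding, is a substantial reduction on the bias metric alongside a near-zero change on ordinary task performance.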

Careful and deliberate

The impact of AI is not restricted to the technical realm but now has broader societal relevance. ‘We are becoming increasingly dependent on systems that can influence how we think,’ says Van der Wal. ‘That’s precisely why it’s important to develop AI carefully. Responsible AI development requires interventions at multiple levels at once: in the data, during training, targeted within the model itself, and also in its deployment and use.’

Defence details

Oskar van der Wal, 2026, 'Taking a Step Back: Measuring and Mitigating Bias in Language Models'. Supervisors: Dr K. Schulz and Dr W.H. Zuidema.

Time and location

Wednesday 29 April, 16:00-17:30, Agnietenkapel, Amsterdam