AI can analyse medical data quickly and accurately, potentially improving diagnoses and treatment plans. But bias is a persistent problem, meaning medical AI solutions do not yet work well for every population. UvA researchers at the Informatics Institute are unpacking the bias problem and making suggestions for improvement.
Shutterstock Tex vector

The application of AI in healthcare is expected to be unavoidable in the coming years. Due to the ageing population, among other things, healthcare costs are becoming unmanageable. Moreover, not enough medical personnel can be found to cope with the increased demand for care. AI can provide a solution, as it can automate and speed up many processes. But there is one big problem: bias. For example, the automatic detection of skin cancer using AI has been found to work less well for people with darker skin colours. Similarly, women with heart disease are more likely to be misdiagnosed when AI is used to make the diagnosis.

‘I would like to contribute to more robust medical AI solutions, by emphasizing the importance of dataset documentation’
Maria Galanty

Annotation practices

‘Bias refers to a systematic error in machine learning models, affecting their ability to correctly classify subgroups of patients’, says Maria Galanty, PhD student in the Quantitative Healthcare Analysis (qurAI) group at the UvA, working on AI and healthcare. This bias may emerge from using datasets that do not represent the target population well. To detect and mitigate bias, datasets need to be well documented, Galanty states. ‘If crucial information is missing, it is impossible for data scientists to be aware of potential biases.’ Apart from over- and under-representation of a particular group, bias can also emerge from data annotation practices. How doctors label MRI images, for example, can vary enormously, depending on their individual style and medical background. This limits the reusability of data.
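
The kind of systematic error Galanty describes can be made visible by comparing how well a model performs on different patient subgroups. The sketch below illustrates this idea in Python; it is not the researchers' code, and the subgroup labels and toy predictions are purely illustrative.

```python
# Minimal sketch: compare a model's accuracy across patient subgroups to
# surface systematic error (bias). Subgroup names and data are illustrative.
from collections import defaultdict

def subgroup_accuracy(y_true, y_pred, groups):
    """Return accuracy per subgroup label (e.g. skin tone, sex)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    return {g: hits[g] / totals[g] for g in totals}

# Toy example: skin-cancer classifier predictions, split by skin tone.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
groups = ["lighter", "lighter", "lighter", "darker",
          "darker", "darker", "lighter", "darker"]

print(subgroup_accuracy(y_true, y_pred, groups))
# {'lighter': 1.0, 'darker': 0.5} -> a large gap signals potential bias
```

Such a per-subgroup comparison is only possible, of course, if the dataset documentation records the relevant patient characteristics in the first place, which is exactly the point the researchers make.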

Living abroad

Galanty obtained her double bachelor's degree in mathematics and cognitive sciences at the University of Warsaw (Poland), but decided to move to the Netherlands. ‘I wanted to experience what it would be like to work and live abroad, and was attracted to the Netherlands because of the high level of education and the work-life balance.’ She completed a master's in artificial intelligence at Utrecht University and became interested in the intersection of AI and healthcare. She therefore decided to apply for a PhD on that topic, ending up at the UvA. Last year, she started research on bias in medical data, together with fellow PhD student Dieuwertje Luitse. As a humanities scholar, Luitse studies the ethical and political aspects of AI development in healthcare. They work together as part of the Research Priority Area (RPA) Artificial Intelligence for Health Decision-making, an interdisciplinary hub that combines experts from fields such as computer science, medicine, law and ethics, with the aim of developing ethical, high-quality AI solutions that help patients.

Checklist

Galanty and Luitse conducted a study of the documentation of publicly available medical datasets. These public datasets are often reused by machine learning engineers who were not involved in creating them and therefore have no additional knowledge about them.

The researchers focused on three so-called modalities: Magnetic Resonance Imaging (MRI), Color Fundus Photography (CFP) of the eye and electrocardiograms (ECGs), which measure heart rhythm. Their first step was to create an evaluation tool, a kind of checklist to assess the completeness of dataset documentation. Galanty: ‘We looked into various sections of dataset documentation, including patient demographics, inclusion criteria and the data annotation process.’
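
To give an impression of what such a completeness check could look like in practice, here is a minimal sketch in Python. It is not the researchers' actual evaluation tool: the section names follow the categories Galanty mentions, while the function and the example dataset entry are hypothetical.

```python
# Minimal sketch of a documentation-completeness checklist (illustrative only).
CHECKLIST = [
    "patient demographics",
    "inclusion criteria",
    "data annotation process",
]

def assess_documentation(doc_sections):
    """Report which checklist items are covered or missing in a dataset's docs."""
    present = {item: item in doc_sections for item in CHECKLIST}
    missing = [item for item, ok in present.items() if not ok]
    return present, missing

# Hypothetical MRI dataset whose documentation omits the annotation process.
docs = {
    "patient demographics": "age, sex, skin tone",
    "inclusion criteria": "adults aged 18 and over",
}
present, missing = assess_documentation(docs)
print("Missing sections:", missing)  # -> ['data annotation process']
```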

Variety

Galanty and Luitse conclude that there is a lot of variety in dataset documentation. ‘For example, some documentation only states that the annotations were performed by a medical professional, while other documentation describes the process in detail. We believe it would be good for dataset creators to always follow guidelines on how to prepare good documentation, so that all relevant information is included.’ After all, good documentation allows data users to be aware of potential biases that may arise. That awareness is a first step towards reducing bias.

‘I would like to contribute to more robust medical AI solutions, by emphasizing the importance of dataset documentation,’ says Galanty. ‘The step from developing machine learning tools at university to applying them in hospitals is still quite big. If we want a tool to actually be used for patients, it needs to be very well tested and to meet all ethical and legal conditions. With more robust solutions, that step will hopefully become a bit smaller.’