Children often think AI is much smarter than they are. When AI does their homework, it has convincing answers to everything. But when it comes to reasoning with new information, children turn out to have the upper hand. In a reasoning challenge between children, adults, and AI, the children easily defeated the AI models, according to research from the University of Amsterdam and the Santa Fe Institute.

Solving analogy puzzles

The researchers compared the performance of children aged 7 to 9, adults, and four of today's leading AI models, including ChatGPT, on a series of analogy puzzles. An analogy puzzle is a reasoning task in which you search for connections between different situations. Such a puzzle isn't about what things are, but about how they relate to each other. For example: body is to feet as tree is to … (roots). Or: horse is to stable as chicken is to … (chicken coop).

Three alphabets

The study used text-based puzzles. ‘Language models still have a lot of trouble understanding visual puzzles,’ explains lead researcher Claire Stevenson of the University of Amsterdam. ‘But the puzzles also couldn't contain difficult words that children don't understand.’ The researchers therefore chose letter sequences. ‘You hardly need any specialised knowledge for that,’ says Stevenson. ‘And that gives you a level playing field for discovering how people and AI solve analogies.’

The children, adults, and AI models had to predict how letter sequences change according to one or more rules. For example, if 'ab' changes to 'ac', what should happen to 'gh'? They then had to apply the same logic to other alphabets: the Greek alphabet and an alphabet made of arbitrary symbols.
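To make the task concrete, here is a minimal sketch in Python (ours, not the researchers' code; the alphabets and helper names are illustrative) of how such a puzzle can be solved by treating any alphabet as an ordered sequence: infer the rule in the familiar Latin alphabet, then apply it unchanged to Greek letters or arbitrary symbols.

    # A minimal sketch, not the study's code: solve "ab -> ac, so gh -> ?"
    # by abstracting the rule as "advance the last item one position in
    # the alphabet" and applying it to any ordered symbol sequence.

    LATIN = list("abcdefghijklmnopqrstuvwxyz")
    GREEK = list("αβγδεζηθικλμνξοπρστυφχψω")
    SYMBOLS = list("!@#$%^&*")  # an arbitrary but fixed ordering

    def infer_shift(source, target, alphabet):
        """How far did the last item move? 'ab' -> 'ac' gives +1."""
        return alphabet.index(target[-1]) - alphabet.index(source[-1])

    def apply_rule(item, shift, alphabet):
        """Shift the last item of the sequence by the inferred amount."""
        new_last = alphabet[alphabet.index(item[-1]) + shift]
        return item[:-1] + new_last

    # Learn the rule in the familiar Latin alphabet ...
    shift = infer_shift("ab", "ac", LATIN)
    print(apply_rule("gh", shift, LATIN))    # -> "gi"

    # ... then transfer it to unfamiliar alphabets, as the children did.
    print(apply_rule("αβ", shift, GREEK))    # -> "αγ"
    print(apply_rule("!@", shift, SYMBOLS))  # -> "!#"

Once the rule is stated in terms of positions rather than specific letters, the transfer step is trivial; this abstraction is exactly what the study tests.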

Children score 67%, AI only 34%

The results were clear: children and adults applied their knowledge easily to the unfamiliar domains (the Greek and symbol alphabets), whereas the AI models struggled. This was especially true for the symbol alphabet: children solved an average of 67% of the problems correctly, and even performed better in this new, unfamiliar alphabet than in the familiar one, while the models sometimes dropped below 20%.

Fundamental differences between humans and AI

According to the researchers, this demonstrates a fundamental difference between human and artificial reasoning. ‘Even young children intuitively understand that an alphabet is an ordered sequence,’ explains Stevenson. ‘AI models lack this abstract insight: they primarily recognise patterns in situations they are already familiar with. As soon as the context changes, they appear unable to apply the underlying structure.’

It all starts with psychology

Flexibly applying knowledge in new situations remains a hallmark of human intelligence, at least for now. It remains to be seen whether artificial intelligence will ever be able to do this in a similar way.

‘In AI development, we are increasingly looking at how people act and think,’ concludes Stevenson. ‘How do babies develop, for example, and does that provide guidance on how AI can best learn? So, it all really starts with psychology!’

Article details

Claire E. Stevenson, Alexandra Pafford, Han L. J. van der Maas and Melanie Mitchell (2025). ‘Can large language models generalize analogy solving like children can?’ The article will appear in the upcoming issue of Transactions of the Association for Computational Linguistics (January 2026), but can already be viewed online.

This research was co-funded by the Dutch Research Council (NWO) within the project ‘Learning to solve analogies: Why do children excel where AI models fail?’

Dr. C.E. (Claire) Stevenson

Faculty of Social and Behavioural Sciences

Programme group: Psychological Methods