'Basically, it's a skewed distribution, a particular mean, a roundoff error and a specific algorithm, all coming together at a very unfortunate intersection,' says Daan van den Berg. The traveling salesman problem involves finding the shortest tour along cities on a map. 'But the problem is, it gets harder as the number of cities increases. A lot harder. Exponentially harder, to be precise.'
However, the influential study 'Where the Really Hard Problems Are' (1400+ citations) shows that not all maps (or distance matrices, technically speaking) are equally hard. If the distance matrix has a large standard deviation, problem instances are easy. But that turns out to be incorrect, concludes a replication study by UvA's AI students Joeri Sleegers, Richard Olij and Gijs van Horn.
'What started as a seemingly straightforward replication task, turned much more interesting when we discovered some shortcomings. Most notably, a hard-to-spot flaw, which turned out to have major consequences. It took quite a while to get everything on paper, but it was definitely worth it,'says Richard Olij. 'It has been great to be a part of a replication study that contests findings of a classic and well established paper like this, and it has reasserted the value of always challenging assumptions and methods of those who came before. This is a lesson I take to my engineering career as well,' says Gijs van Horn.
Replication studies are not yet fully mainstream the computational sciences, but maybe they should be. As it turns out, the original study used a branch and bound algorithm that prefers short distances, anywhere in the matrix, to be included in the tour first. Because those matrices were generated with a lognormal probability distribution round off to integers, high standard deviations lead to many zeroes in the matrix and hence, to rapid solving of problem instances. They made new matrices and subjected them to the same algorithm, both before and after the roundoff. In the round off matrices, the effect was identical to the original study, in the same matrices before round off, it completely disappeared.
'But the real cement of the story came with some transfer mathematics,' says Van den Berg. 'We could actually predict the chance of existence of a zero-cost tour on forehand. That all came much later, but it did make theory and practice stick together nicely. We probably haven't heard the last of it yet. But most importantly: the worth of replication studies is getting recognized. We're convinced the original authors never accounted for this phenomenon. It's typical for science: we pedestrianally take two steps forward, and one step back. Replication studies weed out the overlooked mistakes from the past, and provide us with great projects for our students. For Joeri, this is his fourth paper during his study at UvA. For Richard and Gijs, it is their second. A notable achievement.'