The preliminary evaluation phase for MIDOG 2025, this year's edition of our MICCAI challenge, is coming up in less than four days. Time to talk about what it is there for, and what it is not:
– Docker submission is not easy, and we know it. That is why we prepared templates to fill out, so that everything runs smoothly. Still, there are a thousand reasons why it might not. This is what the evaluation phase is for: participants can check whether their solution actually runs or not (a minimal local smoke test is sketched after this list).
– What this is explicitly not meant for is optimizing algorithms on it. We know it's tempting to try to one-up your own score on the public leaderboard of the preliminary evaluation set, but, as previous challenges have shown, this typically won't help you in the final evaluation: it leads to overfitting on that set. The final test set has different statistical properties: it uses different tumor types and, in the case of track 1, even different selection criteria. The preliminary test set consists entirely of hotspot ROIs, while the final test set also contains random ROIs and challenging ROIs.
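On the first point: a quick way to catch many of those thousand failure modes before submitting is to build and run your container locally. Here is a minimal sketch using the docker-py SDK; the image tag and the assumption that your submission directory contains the template's Dockerfile are hypothetical examples, not official MIDOG 2025 names.

```python
import docker

client = docker.from_env()

# Build the submission image from the Dockerfile in the current directory.
# The tag is a hypothetical example, not an official challenge name.
image, build_logs = client.images.build(path=".", tag="midog-submission:test")

# Run the container once, capture its output, and clean it up afterwards.
logs = client.containers.run("midog-submission:test", remove=True)
print(logs.decode())
```

If the container builds and produces output locally, most of the avoidable submission failures are already ruled out before the preliminary evaluation run.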
TL;DR: Smart participants don’t optimize on the preliminary evaluation set, but find a better proxy.
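One such proxy, given that the final test set brings in unseen tumor types: leave-one-domain-out cross-validation on your own training data, holding out one tumor type at a time. Below is a minimal sketch of that idea; the domain names and the training/scoring helper are placeholders, not part of any official tooling.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Hypothetical case table: one entry per training case with its domain label.
rng = np.random.default_rng(0)
case_ids = np.arange(120)
domains = rng.choice(["tumor_a", "tumor_b", "tumor_c"], size=case_ids.size)

def train_and_score(train_ids, val_ids):
    """Placeholder for your own training + F1 scoring pipeline."""
    return rng.uniform(0.5, 0.8)  # dummy score so the sketch runs

scores = []
for train_idx, val_idx in LeaveOneGroupOut().split(case_ids, groups=domains):
    # Train on all domains except one, validate on the held-out domain.
    scores.append(train_and_score(case_ids[train_idx], case_ids[val_idx]))

print(f"leave-one-domain-out F1: {np.mean(scores):.3f} ± {np.std(scores):.3f}")
```

A validation score averaged over held-out domains mimics the shift to unseen tumor types far better than repeated submissions against a fixed public leaderboard.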