We use causal reasoning to shed new light on key challenges in medical imaging: 1) data scarcity, which is the limited availability of high-quality annotations, and 2) data mismatch, whereby a trained algorithm may fail to generalize in clinical practice. We argue that causal relationships between images, annotations, and data-collection processes can not only have profound effects on the performance of predictive models, but may even dictate which learning strategies should be considered in the first place. Semi-supervision, for example, may be unsuitable for image segmentation - one of the possibly surprising insights from our causal considerations in medical image analysis. We conclude that it is of utmost importance for the success of machine-learning-based image analysis that researchers are aware of and account for the causal relationships underlying their data.