Analysis and synthesis of medical images using heterogeneous modalities
Updated: Jan 2, 2021
Daniel Gourdeau, a PhD student working with Prof. Simon Duchesne, is presenting his latest work this week at the Centre de recherche en données massives de l'Université Laval.
Magnetic resonance imaging (MRI) is a widely available imaging modality that does not expose the subject to ionizing radiation. MRI is the method of choice for imaging soft tissues, and multiple pulse sequences can be used to obtain different contrasts. However, missing pulse sequences and imaging artifacts are a problem for data analysis pipelines that depend on the presence of specific sequences. Hence, selective synthesis of a desired sequence and automatic completion of these heterogeneous datasets are desirable.
Traditional multi-modal image synthesis methods can create high-quality images but are limited in practice because they cannot handle missing inputs. In this work, we present a hetero-modal image synthesis approach that can synthesize any modality when given only a subset of the available modalities. To do so, each input modality is encoded into a 3D modality-specific multi-resolution representation. Previous works have fused these representations using arithmetic operations such as the mean, maximum, or variance. The downside is that variance-based fusion requires at least two input modalities, because the variance is not meaningful for a single input. We propose instead to fuse these representations using a 3D attention network that learns to optimally combine them to synthesize the desired output. We also incorporate recent advances in generative adversarial networks (GANs), such as the progressive growing of GANs and the Wasserstein adversarial loss.
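To make the difference between the fusion strategies concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions and not the implementation used in this work: the function names, the array layout (modality, channel, depth, height, width), and the use of a softmax over the modality axis are all hypothetical choices. In the actual method the attention scores are produced by a learned 3D attention network; here they are simply passed in as an array.

```python
import numpy as np

def fuse_mean(feats):
    """Average the per-modality feature maps (works with one modality)."""
    return feats.mean(axis=0)

def fuse_max(feats):
    """Voxel-wise maximum over modalities (works with one modality)."""
    return feats.max(axis=0)

def fuse_variance(feats):
    """Voxel-wise sample variance over modalities; needs at least two."""
    if feats.shape[0] < 2:
        raise ValueError("variance fusion needs at least two input modalities")
    return feats.var(axis=0, ddof=1)

def fuse_attention(feats, scores):
    """Weighted sum of feature maps with softmax weights over modalities.

    feats:  (M, C, D, H, W) encoded representations, one per modality.
    scores: (M, 1, D, H, W) unnormalized attention scores. Assumption: in a
            real model these would come from a learned attention network.
    Returns the fused (C, D, H, W) map and the (M, 1, D, H, W) weights.
    """
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    fused = (weights * feats).sum(axis=0)
    return fused, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(3, 2, 4, 4, 4))   # three hypothetical modalities
    scores = rng.normal(size=(3, 1, 4, 4, 4))  # stand-in for network output
    fused, weights = fuse_attention(feats, scores)
    print(fused.shape, weights.shape)
```

With a single input modality, the softmax weights are all equal to one, so the fused map is just that modality's representation, whereas `fuse_variance` in this sketch refuses to run. The returned weights are also what could be visualized to see which locations of which modality contribute most.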
We show that our fusion using attention outperforms the maximum/variance fusion method in all synthesis scenarios while being able to synthesize images using only a single input modality. Additionally, the attention module brings interpretability to the image synthesis task by highlighting the most informative locations in the input modalities.