Performance of a convolutional neural network (CNN) based white-matter lesion segmentation in magnetic resonance (MR) brain images was evaluated under various conditions involving different levels of image preprocessing and augmentation applied and different compositions of the training dataset. On images of sixty multiple sclerosis patients, half acquired on one and half on another scanner of different vendor, we first created a highly accurate multi-rater consensus based lesion segmentations, which were used in several experiments to evaluate the CNN segmentation result. First, the CNN was trained and tested without preprocessing the images and by using various combinations of preprocessing techniques, namely histogram-based intensity standardization, normalization by whitening, and train dataset augmentation by flipping the images across the midsagittal plane. Then, the CNN was trained and tested on images of the same, different or interleaved scanner datasets using a cross-validation approach. The results indicate that image preprocessing has little impact on performance in a same-scanner situation, while between-scanner performance benefits most from intensity standardization and normalization, but also further by incorporating heterogeneous multi-scanner datasets in the training phase. Under such conditions the between-scanner performance of the CNN approaches that of the ideal situation, when the CNN is trained and tested on the same scanner dataset.
|