Deep neural networks extract features automatically; however, the features learned by a classifier are often biased toward the classes seen during training. Analyzing 3D medical images is also challenging because the large number of channels per volume leads to long training times with complex deep models. To address these issues, we propose a two-step approach: (i) an autoencoder is trained to reconstruct the input images from a subset of the channels in each volume, yielding a hidden representation of the images; (ii) shallow models are then trained on this hidden representation to classify the images using an ensemble of features. We validate the proposed method on 3D datasets from the MedMNIST archive. Our results show that the proposed model matches or exceeds the performance of ResNet models while using far fewer parameters (approximately 14,000).
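The two-step pipeline can be illustrated with a small sketch. This is not the paper's implementation: a linear autoencoder (whose optimum coincides with PCA, so the closed-form SVD solution is used in place of iterative training) and a plain logistic-regression classifier stand in for the actual models, and the MedMNIST volumes are replaced by synthetic data with a class-dependent mean shift.

```python
# Minimal sketch of the two-step approach on synthetic data. Assumptions:
# a linear autoencoder (solved in closed form via SVD) replaces the paper's
# autoencoder, and logistic regression replaces the shallow-model ensemble.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for 3D volumes: 200 samples, 4 channels of 8x8x8
# voxels each, flattened to feature vectors.
n, k = 200, 16
y = rng.integers(0, 2, size=n)                       # two classes
X = rng.normal(size=(n, 4, 8, 8, 8))
X += y[:, None, None, None, None] * 0.5              # class-dependent shift
X = X.reshape(n, -1)                                 # shape (200, 2048)

# Step (i): "train" the linear autoencoder. Its optimal encoder/decoder
# span the top principal components, so we take the SVD solution directly.
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
W = Vt[:k].T                                         # encoder weights
H = (X - mu) @ W                                     # hidden representation
X_hat = H @ W.T + mu                                 # reconstruction

# Step (ii): train a shallow classifier (logistic regression by gradient
# descent) on the hidden representation H instead of the raw volumes.
w, b = np.zeros(k), 0.0
for _ in range(500):
    z = np.clip(H @ w + b, -30, 30)                  # clip to avoid overflow
    p = 1 / (1 + np.exp(-z))                         # sigmoid
    g = p - y                                        # gradient of log-loss
    w -= 0.1 * (H.T @ g) / n
    b -= 0.1 * g.mean()

acc = float(((H @ w + b > 0).astype(int) == y).mean())
```

Because the classifier only ever sees the k-dimensional codes, its parameter count is independent of the volume size, which is the property the abstract's parameter comparison rests on.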