Cross modal global local representation learning from radiology reports and x-ray chest images

Nathan Hadjiyski; Ali Vosoughi; Axel Wismüller

doi:10.1117/12.2654520

7 April 2023

Proceedings Volume 12465, Medical Imaging 2023: Computer-Aided Diagnosis; 1246531 (2023) https://doi.org/10.1117/12.2654520
Event: SPIE Medical Imaging, 2023, San Diego, California, United States

Conference Poster

Abstract

Deep learning models can be applied successfully to real-world problems; however, training most of these models requires massive amounts of data. Recent methods combine language and vision, but they often rely on datasets that are not publicly available. Here we pave the way for further research in the multimodal language-vision domain for radiology. In this paper, we train a representation learning method that uses local and global representations of language and vision through an attention mechanism, based on the publicly available Indiana University Radiology Report (IU-RR) dataset. Furthermore, we use the learned representations to diagnose five lung pathologies: atelectasis, cardiomegaly, edema, pleural effusion, and consolidation. Finally, we use both supervised and zero-shot classification to extensively analyze the performance of the representation learning on the IU-RR dataset. The average Area Under the Curve (AUC) is used to evaluate the accuracy of the classifiers on the five lung pathologies. The average AUC on the IU-RR test set ranged from 0.85 to 0.87 across the different training datasets, namely CheXpert and CheXphoto. These results compare favorably to other studies using IU-RR. Extensive experiments confirm consistent results for classifying lung pathologies using the multimodal global-local representations of language and vision information.
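
The abstract reports an average AUC over the five pathologies but does not show how it is computed. As a minimal sketch only, assuming one binary label vector and one score vector per pathology, the metric could be evaluated as below. The function names `binary_auc` and `average_auc` and the toy data are hypothetical illustrations, not the authors' evaluation code.

```python
from statistics import mean

# The five pathologies named in the abstract.
PATHOLOGIES = [
    "atelectasis", "cardiomegaly", "edema", "pleural effusion", "consolidation",
]


def binary_auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_auc(labels_by_class, scores_by_class):
    """Unweighted mean of the per-pathology AUCs."""
    return mean(
        binary_auc(labels_by_class[name], scores_by_class[name])
        for name in labels_by_class
    )


if __name__ == "__main__":
    # Toy data (assumption): the same four studies scored for every pathology.
    toy_labels = {name: [0, 0, 1, 1] for name in PATHOLOGIES}
    toy_scores = {name: [0.1, 0.4, 0.35, 0.8] for name in PATHOLOGIES}
    print(average_auc(toy_labels, toy_scores))
```

The pairwise formulation is quadratic in the number of examples, which is acceptable for a small test set; a rank-based implementation would be preferable for large ones.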

Citation

Nathan Hadjiyski, Ali Vosoughi, and Axel Wismüller "Cross modal global local representation learning from radiology reports and x-ray chest images", Proc. SPIE 12465, Medical Imaging 2023: Computer-Aided Diagnosis, 1246531 (7 April 2023); https://doi.org/10.1117/12.2654520
