Semantic space captioner: generating image captions step by step
Chenhao Zhu, Xia Ye, Qiduo Lu
Abstract

Image captioning is a popular research direction at the intersection of machine vision and natural language processing. Most existing image captioning methods adopt an encoder–decoder structure in which the image is encoded and fed into a decoder that generates a description of the image content. Although existing methods achieve strong results on natural images, there is still much room for improvement in describing fine details. We propose the semantic space captioner model, which introduces the concept of dense captioning into image captioning and uses contrastive language-image pretraining (CLIP) as the encoder for both text and images. Dense captions are generated for image regions and used as an extra semantic space during decoding to enrich the final caption. Experimental results show that our model outperforms existing methods in capturing image details and is able to generate diverse and meaningful captions. It also achieves competitive scores on standard MSCOCO metrics.
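The two-stage pipeline described in the abstract can be sketched as follows. Every component here is a toy stand-in (an assumption for illustration): a real system would use a trained dense-captioning model, a CLIP text/image encoder, and a trained language decoder, none of which are reproduced here.

```python
# Illustrative sketch of the "semantic space captioner" data flow:
# (1) generate dense captions for image regions,
# (2) embed them into an extra semantic space,
# (3) let the decoder consume both the image embedding and that space.
# All functions are hypothetical placeholders, not the authors' code.

def dense_captions(image_regions):
    """Stage 1 stand-in: one short caption per detected region."""
    return [f"a region showing {r}" for r in image_regions]

def encode_text(text, dim=8):
    """Stand-in for a CLIP-style text encoder: maps text to a
    fixed-size vector (a toy character-sum embedding, not learned)."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch) / 1000.0
    return vec

def semantic_space(region_captions, dim=8):
    """Mean-pool region-caption embeddings into one semantic vector."""
    vecs = [encode_text(c, dim) for c in region_captions]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def decode_caption(image_embedding, semantic_vector, region_captions):
    """Stage 2 stand-in: a trained decoder would attend over the image
    embedding and the semantic space; here we simply fuse the dense
    captions into one sentence to show where the detail comes from."""
    details = ", ".join(
        c.removeprefix("a region showing ") for c in region_captions
    )
    return f"an image with {details}"

regions = ["a dog", "a red ball", "green grass"]
caps = dense_captions(regions)
sem = semantic_space(caps)
print(decode_caption([0.0] * 8, sem, caps))
```

The point of the sketch is the data flow: region-level captions are produced first, embedded into a shared semantic space, and only then does the final decoder generate the global caption conditioned on both signals.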

© 2022 SPIE and IS&T
Chenhao Zhu, Xia Ye, and Qiduo Lu "Semantic space captioner: generating image captions step by step," Journal of Electronic Imaging 31(6), 063021 (17 November 2022). https://doi.org/10.1117/1.JEI.31.6.063021
Received: 26 June 2022; Accepted: 27 October 2022; Published: 17 November 2022
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Computer programming, Image processing, Image retrieval, Visual process modeling, Visualization, Data modeling, Lutetium