23 May 2023 Text to multi-object images synthesis based on non-local self-attention
Pengxiong Wang, Wu Yang
Proceedings Volume 12645, International Conference on Computer, Artificial Intelligence, and Control Engineering (CAICE 2023); 126451D (2023) https://doi.org/10.1117/12.2681140
Event: International Conference on Computer, Artificial Intelligence, and Control Engineering (CAICE 2023), 2023, Hangzhou, China
Abstract
Deep convolutional neural networks (CNNs) can make images generated by generative adversarial networks (GANs) more plausible, but because of the local receptive field of CNNs, multi-object images generated from text still show many flaws in realism and semantic consistency. A GAN-based method that incorporates a non-local self-attention mechanism is therefore proposed. With a non-local self-attention structure embedded in it, the network captures both global semantic information and detailed features, and uses this information to perform level-by-level encoding that produces a more plausible final image. The parameter count and computational cost of the whole model are also substantially reduced. The method is validated on the public COCO-Stuff dataset, using several metrics, including Inception Score, Fréchet Inception Distance (FID), and classification accuracy score, to evaluate the realism and diversity of the generated images. Experimental results show that the quality of the generated images is superior to that of previously proposed methods.
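To illustrate how non-local self-attention reaches beyond a local receptive field, the following is a minimal sketch in plain Python. It is not the authors' architecture, which the abstract does not specify. It assumes a feature map already flattened into a list of position vectors, and it uses identity embeddings in place of the learned projections a real network would train. The function name `non_local_attention` and the helper `softmax` are hypothetical, chosen only for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def non_local_attention(features):
    """Apply one non-local self-attention step with a residual connection.

    features: list of N position vectors, each of length C
              (assumed to be a flattened H*W feature map).
    Returns (output, weights), where output[i] = x_i + sum_j a_ij * x_j and
    a_ij = softmax over j of the dot product x_i . x_j.
    Assumption: identity embeddings stand in for learned projections.
    """
    channels = len(features[0])
    output, all_weights = [], []
    for query in features:
        # Similarity of this position to EVERY position, not just neighbours.
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        weights = softmax(scores)
        # Weighted sum of all position vectors: the global context.
        context = [
            sum(w * value[c] for w, value in zip(weights, features))
            for c in range(channels)
        ]
        # Residual connection keeps the original local feature.
        output.append([x + ctx for x, ctx in zip(query, context)])
        all_weights.append(weights)
    return output, all_weights


if __name__ == "__main__":
    toy_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, attn = non_local_attention(toy_map)
    for row in attn:
        print([round(w, 3) for w in row])
```

Each row of the returned weights sums to 1 and is nonzero for every position, so every output vector mixes information from the whole map. A convolution with a small kernel would need many stacked layers to achieve the same reach.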
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Pengxiong Wang and Wu Yang "Text to multi-object images synthesis based on non-local self-attention", Proc. SPIE 12645, International Conference on Computer, Artificial Intelligence, and Control Engineering (CAICE 2023), 126451D (23 May 2023); https://doi.org/10.1117/12.2681140
KEYWORDS
Semantics

Image processing

Design and modelling

Data modeling

Education and training

Process modeling

Image quality
