21 July 2023 Enhancing use of BERT information in neural machine translation with masking-BERT attention
Proceedings Volume 12717, 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023); 1271733 (2023) https://doi.org/10.1117/12.2684653
Event: 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023), 2023, Wuhan, China
Abstract
BERT has shown remarkable performance on several natural language processing tasks, but it does not achieve comparably high performance on cross-lingual tasks, particularly machine translation. To address this issue, we propose a BERT-enhanced neural machine translation (BE-NMT) model that makes better use of the information contained in BERT within an NMT system. The proposed model comprises three components: (1) a MASKING strategy that mitigates the knowledge forgetting caused by fine-tuning BERT on the NMT task; (2) serial and parallel processing of multi-attention models for incorporating BERT into the NMT system; and (3) fusion of multiple BERT hidden-layer outputs to supplement the linguistic information missing from its final hidden-layer output. We conducted experiments on several translation tasks; the proposed model outperforms a strong baseline by 1.93 BLEU points on the United Nations Parallel Corpus English→Chinese task and also achieves remarkable performance on the other translation tasks.
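The abstract does not say how the multiple hidden-layer outputs are fused, so the following is only an illustrative sketch of one common option: a softmax-weighted sum over layers. The function names `softmax` and `fuse_layers`, the per-layer logits, and the toy dimensions are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(hidden_states, layer_logits):
    """Fuse per-layer encoder outputs into one representation.

    hidden_states: nested list shaped [num_layers][seq_len][hidden_size]
    layer_logits:  list of num_layers scores (assumed learnable in practice)

    Returns a [seq_len][hidden_size] list: the softmax-weighted sum of layers.
    This is an assumed fusion rule, not the one confirmed by the paper.
    """
    if len(hidden_states) != len(layer_logits):
        raise ValueError("need exactly one logit per layer")
    weights = softmax(layer_logits)
    seq_len = len(hidden_states[0])
    hidden_size = len(hidden_states[0][0])
    fused = [[0.0] * hidden_size for _ in range(seq_len)]
    for w, layer in zip(weights, hidden_states):
        for t in range(seq_len):
            for h in range(hidden_size):
                fused[t][h] += w * layer[t][h]
    return fused


# Toy input: 2 layers, 2 tokens, hidden size 2.
layers = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]

# Equal logits give equal weights, so the result is the per-position mean.
uniform = fuse_layers(layers, [0.0, 0.0])

# A very large logit on the last layer reduces fusion to final-layer-only use.
last_only = fuse_layers(layers, [0.0, 1000.0])
```

With equal logits the fusion is a plain average of the layers, and with one dominant logit it collapses to a single layer, so using only the final hidden layer is a special case of this scheme.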
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xi Chen, Lin Wu, and Yuanhao Zhang "Enhancing use of BERT information in neural machine translation with masking-BERT attention", Proc. SPIE 12717, 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023), 1271733 (21 July 2023); https://doi.org/10.1117/12.2684653
KEYWORDS
Education and training
Performance modeling
Transformers
Data hiding
Information fusion
Associative arrays
Data modeling