Allreduce algorithm optimization of OpenMPI communication library
8 June 2024
Guangyao Zhang, Wei Wan, Junhong Li
Proceedings Volume 13171, Third International Conference on Algorithms, Microchips, and Network Applications (AMNA 2024); 1317106 (2024) https://doi.org/10.1117/12.3031959
Event: 3rd International Conference on Algorithms, Microchips and Network Applications (AMNA 2024), 2024, Jinan, China
Abstract
MPI (Message Passing Interface) plays a crucial role in parallel computing. The Allreduce implementation in the OpenMPI communication library handles communication poorly when the number of processes is not a power of two. The two existing algorithms, recursive_doubling and reduce_scatter_allgather, deal with this case by excluding some processes so that a power-of-two number of processes remains. However, the choice of excluded processes considers too few factors, which leaves the participating processes unevenly distributed across nodes and greatly reduces communication efficiency. To address this problem, the layout of processes on nodes is taken into account and the range of excluded processes is redefined. Both algorithms receive generic load-balancing optimizations and adaptations for domestic architectures, resulting in improved load balancing. Experimental results show that, at a communication scale of 16 nodes, the recursive_doubling algorithm improves performance by up to 30% and the reduce_scatter_allgather algorithm by up to 21%.
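The imbalance described in the abstract can be illustrated with a small sketch. It assumes the commonly used exclusion rule (among the first 2*rem ranks, the even ranks hand their data to a neighbour and sit out, where rem is the excess over the largest power of two) and block placement of ranks on nodes. The function layout_aware_excluded is a hypothetical alternative written only for illustration; it is not the selection rule from the paper, and it ignores how each excluded rank would be paired with a partner. The 3 processes per node in the usage lines is also an assumed value.

```python
from collections import Counter


def largest_pow2_le(p):
    """Largest power of two that is <= p (requires p >= 1)."""
    return 1 << (p.bit_length() - 1)


def baseline_excluded(p):
    """Assumed common rule: among the first 2*rem ranks, even ranks sit out."""
    rem = p - largest_pow2_le(p)
    return [r for r in range(2 * rem) if r % 2 == 0]


def layout_aware_excluded(p, ppn):
    """Hypothetical rule: pick excluded ranks round-robin across nodes."""
    rem = p - largest_pow2_le(p)
    nodes = (p + ppn - 1) // ppn
    excluded = []
    slot = 0  # local index of the rank to drop on each node in this pass
    while len(excluded) < rem:
        for node in range(nodes):
            r = node * ppn + slot
            if r < p and len(excluded) < rem:
                excluded.append(r)
        slot += 1
    return sorted(excluded)


def participants_per_node(p, ppn, excluded):
    """Ranks left in the power-of-two phase on each node (block placement)."""
    nodes = (p + ppn - 1) // ppn
    gone = set(excluded)
    counts = Counter(r // ppn for r in range(p) if r not in gone)
    return [counts.get(n, 0) for n in range(nodes)]


# Assumed example: 16 nodes with 3 processes each, so 48 ranks and 16 excluded.
P, PPN = 48, 3
print(participants_per_node(P, PPN, baseline_excluded(P)))
# [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 3, 3, 3, 3, 3]
print(participants_per_node(P, PPN, layout_aware_excluded(P, PPN)))
# [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
```

Under these assumptions the baseline rule leaves some nodes with one participant and others with three, while spreading the exclusions across nodes leaves two participants on every node.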
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
KEYWORDS
Dysprosium
Mathematical optimization
Data processing
Parallel computing
Data communications
Telecommunications
Logic