Most existing 3D object detection networks that fuse multi-view camera images and lidar point clouds at the feature level either directly concatenate the multi-sensor features output by the backbones or fuse the BEV features of the two modalities in a unified view. Features obtained this way are degraded both by the modality conversion applied to the raw data and by the quality of the multi-sensor feature fusion itself. To address this problem, a 3D object detection network with channel-attention-based feature fusion is proposed to improve the aggregation ability of BEV feature fusion and thereby the representation power of the fused features. Experiments on the open-source nuScenes dataset show that, compared with the baseline network, the proposed network captures object features more completely, reducing the average orientation error and average velocity error by 4.9% and 4.0%, respectively. In autonomous driving, this improves the vehicle's ability to perceive moving obstacles on the road, which has practical value.
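The abstract does not disclose the exact fusion architecture, but a common realization of channel-attention fusion for two BEV feature maps is a squeeze-and-excitation-style gate over the concatenated channels followed by a 1x1 convolution back to the original width. The sketch below is a hypothetical NumPy illustration of that idea; the function name, the reduction ratio, and the random weights (standing in for learned parameters) are all assumptions, not the paper's method.

```python
import numpy as np

def channel_attention_fuse(bev_cam, bev_lidar, reduction=4, rng=None):
    """Fuse two BEV feature maps of shape (C, H, W) with SE-style channel attention.

    Hypothetical sketch: random weights stand in for learned parameters.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.concatenate([bev_cam, bev_lidar], axis=0)       # (2C, H, W)
    c = x.shape[0]
    # Squeeze: global average pool over the spatial dimensions -> (2C,)
    s = x.mean(axis=(1, 2))
    # Excite: tiny bottleneck MLP producing a sigmoid gate per channel
    r = max(c // reduction, 1)
    w1 = rng.standard_normal((r, c)) / np.sqrt(c)
    w2 = rng.standard_normal((c, r)) / np.sqrt(r)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ s, 0.0))))
    x = x * gate[:, None, None]                            # reweight channels
    # 1x1 "conv" back to C channels: a matmul over the channel axis
    w3 = rng.standard_normal((bev_cam.shape[0], c)) / np.sqrt(c)
    return np.einsum('oc,chw->ohw', w3, x)

cam = np.ones((8, 4, 4))
lidar = np.ones((8, 4, 4))
fused = channel_attention_fuse(cam, lidar)
print(fused.shape)  # (8, 4, 4)
```

The gate lets the network emphasize whichever modality's channels are more informative per scene, instead of weighting the concatenated features uniformly as plain concatenation does.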