This paper focuses on high-precision 3D object detection. In two-stage 3D object detection algorithms based on the point-voxel method, features of multiple types and scales are often fused by simple concatenation, which fails to capture rich semantic information. To make better use of multi-scale voxel features, raw point features, and bird's-eye-view (BEV) features, we introduce a hybrid cross feature fusion (HCFF) module. HCFF uses a gating module to fuse raw point features, multi-scale voxel features, and BEV features, yielding more comprehensive semantic features and improving detection accuracy on all classes of the KITTI dataset, particularly on the smaller object classes. Building on this module, we enhance the original PV-RCNN and propose a higher-precision point-voxel based 3D object detection algorithm, HCFF-Det3D. The results demonstrate that HCFF-Det3D outperforms PV-RCNN in accuracy.
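To make the contrast with simple concatenation concrete, the following is a minimal sketch of gated feature fusion in plain Python. It is an illustration under stated assumptions, not the HCFF module itself: the scalar sigmoid gate per source, the random gate weights, the toy feature vectors, and the names `gate_weights` and `gated_fuse` are all hypothetical and do not come from the paper.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_weights(dim, rng):
    """Hypothetical gate parameters for one source: a weight vector and a bias."""
    return [rng.uniform(-0.5, 0.5) for _ in range(dim)], 0.0


def gated_fuse(sources, params):
    """Scale each source by its own scalar gate, then concatenate.

    Plain concatenation corresponds to every gate being fixed at 1.0.
    Here each gate is computed from the source's own features, so a
    source can be down-weighted before it enters the fused vector.
    """
    fused, gates = [], {}
    for name, feat in sources.items():
        w, b = params[name]
        g = sigmoid(sum(wi * xi for wi, xi in zip(w, feat)) + b)
        gates[name] = g
        fused.extend(g * x for x in feat)
    return fused, gates


# Toy per-keypoint features for the three sources named in the abstract
# (values and dimensions are made up for illustration).
rng = random.Random(0)
sources = {
    "raw_point": [0.2, -0.1, 0.4],
    "voxel_multi_scale": [0.5, 0.3, -0.2, 0.1],
    "bev": [0.7, -0.6],
}
params = {name: gate_weights(len(feat), rng) for name, feat in sources.items()}

fused, gates = gated_fuse(sources, params)
print(len(fused), {k: round(v, 3) for k, v in gates.items()})
```

In a trained network the gate parameters would be learned jointly with the detector rather than sampled at random, and the gate could equally be applied per channel instead of per source. The sketch only shows where a gate sits relative to the concatenation step.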