3D Object Detection based on Semi-Supervised Learning in Complex Traffic Environments
Abstract
This paper addresses 3D object detection from point clouds and optimizes the Pillar Feature Net with a semi-supervised learning approach. This allows the model to exploit unlabeled data during training, improving its understanding of the data and its adaptability to complex real-world scenarios. In addition, the study adopts a fully attention-based feature representation encoder from the Transformer framework and replaces VGG-16 with MobileNetV3. The lighter backbone reduces model complexity and shortens the time needed to produce results, making the model better suited to real-time or resource-constrained scenarios. Together, these optimizations improve the accuracy and efficiency of object detection and also benefit the model's generalization ability and deployment practicality. On the public KITTI 3D dataset, the proposed method reaches an AP of 81.77% at the hard difficulty level on the test set, compared with 75.46% for the PointPillars model, an improvement of 6.31 percentage points.
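The reported gain is an absolute difference between two AP percentages, which is distinct from a relative improvement over the baseline. The short sketch below makes the arithmetic explicit using only the two AP values stated in the abstract; the function name `ap_gain` and its return format are illustrative assumptions, not part of the paper.

```python
def ap_gain(baseline_ap: float, proposed_ap: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent).

    Hypothetical helper for illustration; both inputs are AP values in percent.
    """
    absolute = proposed_ap - baseline_ap
    relative = absolute / baseline_ap * 100.0
    return absolute, relative


# AP values at the hard difficulty level, as stated in the abstract.
pointpillars_ap = 75.46
proposed_ap = 81.77

abs_gain, rel_gain = ap_gain(pointpillars_ap, proposed_ap)
print(f"absolute gain: {abs_gain:.2f} percentage points")
print(f"relative gain: {rel_gain:.2f}% over the baseline")
```

Under these numbers the absolute gain is 6.31 percentage points, which corresponds to a relative gain of roughly 8.36% over the PointPillars baseline.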

