Paper
17 January 2025
SparseNetYOLOv8: integrating vision transformers and dynamic probing for enhanced sparse object detection
Lurui Wang, Yanfeng Lu
Proceedings Volume 13521, International Conference on Computer Vision and Image Processing (CVIP 2024); 135210S (2025) https://doi.org/10.1117/12.3058039
Event: 2024 International Conference on Computer Vision and Image Processing, 2024, Hangzhou, China
Abstract
In this study, we propose SparseNetYOLOv8, an improved YOLOv8n model for small-object detection in sparse space. The main changes to the original architecture are the incorporation of GhostConv, CBAM, MobileViTBlock, and DWC in the Backbone, and of BiFPN and DySnakeConv in the Neck for improved feature fusion and edge detection, respectively. These enhancements collectively yield a 10.6% improvement in mAP@0.5 and a 7.8% increase in mAP@0.5-0.95, while SAHI in the Head minimizes target omissions. Experimental results demonstrate the robustness of SparseNetYOLOv8 in comparison with other YOLO variants.
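The abstract does not describe how SAHI is configured. As a rough illustration only, and assuming SAHI here denotes slicing-aided hyper inference (running the detector on overlapping image tiles and mapping the detections back to the full image), the tiling step might be sketched as follows. The function names, the tile size of 640, and the overlap ratio of 0.2 are hypothetical choices for this sketch, not values taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def slice_boxes(img_w: int, img_h: int, tile: int = 640, overlap: float = 0.2) -> List[Box]:
    """Return overlapping tile windows that together cover the whole image.

    `tile` and `overlap` are illustrative defaults (assumptions).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, int(tile * (1.0 - overlap)))  # stride between window origins

    def starts(size: int) -> List[int]:
        # An axis no longer than one tile needs a single window.
        if size <= tile:
            return [0]
        pos = list(range(0, size - tile, step))
        pos.append(size - tile)  # last window sits flush with the border
        return pos

    return [
        (x, y, min(x + tile, img_w), min(y + tile, img_h))
        for y in starts(img_h)
        for x in starts(img_w)
    ]


def to_image_coords(box: Box, window: Box) -> Box:
    """Shift a detection from tile coordinates back to full-image coordinates."""
    ox, oy = window[0], window[1]
    return (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)
```

In a full pipeline, each window would be cropped and passed to the detector, the resulting boxes shifted with `to_image_coords`, and overlapping detections from neighbouring tiles merged, for example with non-maximum suppression. Small targets occupy more pixels relative to a tile than to the full frame, which is the usual motivation for slicing when the goal is to reduce missed detections.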
(2025) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Lurui Wang and Yanfeng Lu "SparseNetYOLOv8: integrating vision transformers and dynamic probing for enhanced sparse object detection", Proc. SPIE 13521, International Conference on Computer Vision and Image Processing (CVIP 2024), 135210S (17 January 2025); https://doi.org/10.1117/12.3058039
KEYWORDS
Object detection
Convolution
Performance modeling
Feature extraction
Head
Neck
Target detection
