Jisuanji kexue yu tansuo (Mar 2025)
Small Object Detection Based on Enhanced Feature Pyramid and Focal-AIoU Loss
Abstract
Unmanned aerial vehicle (UAV) images are characterized by small target scales and complex backgrounds, so applying generic object detection methods directly to them rarely yields satisfactory recognition accuracy. Building on YOLOv8, this paper proposes CFE-YOLO (cross-level feature-fusion enhanced YOLO), a small object detection model that combines a feature enhancement network with a focal localization loss. First, a cross-level feature-fusion enhanced pyramid network (CFEPN) is designed. It improves the traditional feature pyramid structure by fusing attention feature maps; high-resolution feature maps from the shallow layers are added and the deep detection heads are removed to suit small object detection. Second, a focal loss function based on area intersection over union (Focal-AIoU) is designed by combining the ideas of Complete-IoU (CIoU) and focal loss, further improving the detection of small objects. Finally, a lightweight spatial pyramid pooling module is built with depth-wise separable convolutions, preserving the model's detection accuracy while reducing its parameter count. Extensive experiments on the UAV datasets VisDrone and TinyPerson show that CFE-YOLO improves mAP@0.50 by 4.72 and 5.58 percentage points, respectively, over the baseline while reducing the parameter count by 37.74%. It also achieves higher accuracy than other advanced algorithms.
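As a rough illustration of why depth-wise separable convolutions reduce the parameter count, the sketch below compares the number of weights in a standard convolution with that of its depth-wise separable counterpart. The layer sizes (256 input and output channels, 3x3 kernel) and both function names are hypothetical and are not taken from CFE-YOLO. The 37.74% reduction reported in the abstract refers to the whole model and cannot be derived from this single-layer arithmetic.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Weights in a standard k x k convolution (hypothetical helper)."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in: int, c_out: int, k: int, bias: bool = False) -> int:
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    # Depth-wise step: one k x k filter per input channel.
    depthwise = c_in * k * k + (c_in if bias else 0)
    # Point-wise step: 1 x 1 convolution that mixes channels.
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


if __name__ == "__main__":
    # Assumed layer sizes, chosen only for illustration.
    c_in, c_out, k = 256, 256, 3
    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(standard)                         # 589824
    print(separable)                        # 67840
    print(f"{separable / standard:.3f}")    # 0.115
```

Without bias terms, the ratio of the two counts equals 1/c_out + 1/k^2, so the saving grows with the kernel size and the number of output channels.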
Keywords