Phung, K.-P. and Le, N.-L. and Nguyen, Q.-U. (2023) SuperYOLO8: Enhancing Performance of Object Detection in Real-Time Multi-Modal Remote Sensing Imagery through SuperYOLO and YOLOv8. In: UNSPECIFIED.
Full text not available from this repository.
Abstract
Object detection in remote sensing images (RSI) poses a significant challenge, particularly in accurately detecting small objects across different scales and rotations. State-of-the-art object detection solutions often rely on complex deep neural networks, leading to high computational costs. In this paper, we propose a novel approach to enhance the performance of the YOLO series on RSI by combining the strengths of SuperYOLO, which is based on YOLOv5, and YOLOv8. Our approach leverages the multi-modal data fusion capabilities of SuperYOLO to extract complementary information from diverse data sources while incorporating key advancements from YOLOv8. First, to improve efficiency, we introduce two architectural modifications: we replace the Conv6x6 layer with a 3×3 layer and substitute the C3 layer of SuperYOLO with the C2f layer of YOLOv8. These modifications aim to combine high-level features with contextual information and enhance the model's detection capabilities. Additionally, we propose the use of soft-NMS instead of non-maximum suppression (NMS) during the post-processing stage of the model. Soft-NMS offers improved object localization and reduces duplicate detections, thereby enhancing overall detection accuracy. Experimental evaluations conducted on the widely used VEDAI RS dataset affirm the effectiveness of our approach in achieving a good balance between accuracy and computational efficiency compared to state-of-the-art YOLO variants. © 2023 IEEE.
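
The soft-NMS post-processing step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which soft-NMS variant or parameters were used, so the Gaussian score decay, the `sigma` and `score_thresh` defaults, and the function names `iou` and `soft_nms` are all assumptions chosen for illustration. The idea is that, instead of discarding every box that overlaps a selected detection, each overlapping box has its score reduced in proportion to the overlap.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (illustrative; sigma and score_thresh are assumed).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(s, i) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best = max(remaining)
        remaining.remove(best)
        best_score, best_idx = best
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap
        # with the selected box, rather than removing them outright.
        decayed = []
        for s, i in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s *= math.exp(-(overlap ** 2) / sigma)
            if s >= score_thresh:  # drop only boxes whose score collapses
                decayed.append((s, i))
        remaining = decayed
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

In this toy input the second box heavily overlaps the first, so it is kept but demoted below the non-overlapping third box. Hard NMS with a typical IoU threshold would have removed it entirely.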
Item Type: | Conference or Workshop Item (UNSPECIFIED) |
---|---|
Divisions: | Offices > Office of International Cooperation |
Identification Number: | 10.1109/RIVF60135.2023.10471841 |
Uncontrolled Keywords: | Computational efficiency; Data fusion; Deep neural networks; Image enhancement; Modal analysis; Object recognition; Remote sensing; Deep learning; Multi-modal; Multi-modal fusion; Non-maximum suppression; Objects detection; Performance; Real-time; Remote sensing images; State of the art; YOLO; Object detection |
Additional Information: | Conference of 2023 RIVF International Conference on Computing and Communication Technologies, RIVF 2023; Conference Date: 23 December 2023 Through 25 December 2023; Conference Code: 198353 |
URI: | http://eprints.lqdtu.edu.vn/id/eprint/11209 |