DATA AUGMENTATION ANALYSIS OF VEHICLE DETECTION IN AERIAL IMAGES
DOI:
https://doi.org/10.15625/1813-9663/18259
Keywords:
drone, object detection, vehicle detection, data augmentation
Abstract
Drones are increasingly used in application domains such as surveillance, agriculture, delivery, and search and rescue missions. Object detection in aerial images (captured by drones) is attracting growing interest in the computer vision community. However, research in this area remains limited because of numerous challenges, such as the top-view angle, small-scale objects, diverse object orientations, and data imbalance. In this paper, we investigate different data augmentation techniques. Furthermore, we propose combining data augmentation methods to further enhance the performance of state-of-the-art object detection methods. Extensive experiments on two datasets, namely AERIAU and XDUAV, demonstrate that combining randomly cropped and vertically flipped data boosts the performance of object detectors on aerial images.
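The two augmentations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say whether the cropped and flipped data are pooled into one training set or applied in sequence, so the sketch assumes pooling. It also assumes that "vertical flip" means a top-to-bottom flip, that boxes are stored as (x_min, y_min, x_max, y_max) in pixels with exclusive maxima, and that a box is kept after cropping only if at least half of its area remains visible. The names `vertical_flip`, `random_crop`, `build_training_set`, and `min_keep` are hypothetical, and a nested list stands in for a real image array.

```python
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), maxima exclusive (assumed format)
Image = List[List[int]]           # rows of pixel values; a stand-in for a real image array
Sample = Tuple[Image, List[Box]]


def vertical_flip(image: Image, boxes: List[Box]) -> Sample:
    """Flip the image top-to-bottom and mirror the box y-coordinates."""
    height = len(image)
    flipped = image[::-1]
    new_boxes = [(x0, height - y1, x1, height - y0) for x0, y0, x1, y1 in boxes]
    return flipped, new_boxes


def random_crop(image: Image, boxes: List[Box], crop_w: int, crop_h: int,
                rng: random.Random, min_keep: float = 0.5) -> Sample:
    """Cut a random crop_w x crop_h window and clip the boxes to it.

    A box is kept only if at least `min_keep` of its area stays inside
    the window (an assumed rule, not taken from the paper).
    """
    height, width = len(image), len(image[0])
    left = rng.randint(0, width - crop_w)
    top = rng.randint(0, height - crop_h)
    cropped = [row[left:left + crop_w] for row in image[top:top + crop_h]]

    kept = []
    for x0, y0, x1, y1 in boxes:
        cx0, cy0 = max(x0, left), max(y0, top)
        cx1, cy1 = min(x1, left + crop_w), min(y1, top + crop_h)
        if cx1 <= cx0 or cy1 <= cy0:
            continue  # box lies entirely outside the window
        visible = (cx1 - cx0) * (cy1 - cy0)
        area = (x1 - x0) * (y1 - y0)
        if visible / area >= min_keep:
            # Shift into the coordinate frame of the cropped image.
            kept.append((cx0 - left, cy0 - top, cx1 - left, cy1 - top))
    return cropped, kept


def build_training_set(samples: List[Sample], crop_w: int, crop_h: int,
                       rng: random.Random) -> List[Sample]:
    """Pool each original sample with one cropped and one flipped copy."""
    pooled = []
    for image, boxes in samples:
        pooled.append((image, boxes))
        pooled.append(random_crop(image, boxes, crop_w, crop_h, rng))
        pooled.append(vertical_flip(image, boxes))
    return pooled
```

Under these assumptions, calling `build_training_set(samples, crop_w, crop_h, random.Random(0))` triples the number of training samples. Applying the two transforms in sequence instead would only require passing the output of `random_crop` to `vertical_flip`.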
License
1. We hereby assign copyright of our article (the Work) in all forms of media, whether now known or hereafter developed, to the Journal of Computer Science and Cybernetics. We understand that the Journal of Computer Science and Cybernetics will act on my/our behalf to publish, reproduce, distribute and transmit the Work.
2. This assignment of copyright to the Journal of Computer Science and Cybernetics is done so on the understanding that permission from the Journal of Computer Science and Cybernetics is not required for me/us to reproduce, republish or distribute copies of the Work in whole or in part. We will ensure that all such copies carry a notice of copyright ownership and reference to the original journal publication.
3. We warrant that the Work is our own work, has not been published before in its current or a substantially similar form, is not under consideration for another publication, does not contain any unlawful statements, and does not infringe any existing copyright.
4. We also warrant that we have obtained the necessary permission from the copyright holder/s to reproduce in the article any materials, including tables, diagrams or photographs, not owned by me/us.