AN EFFECTIVE DEEP LEARNING MODEL FOR RECOGNITION OF ANIMALS AND PLANTS
DOI: https://doi.org/10.15625/1813-9663/38/1/16309
Keywords: Deep learning models, Classification losses, Feature pyramid network
Abstract
This paper presents a deep learning model for the recognition of animals and plants. The work is motivated by the protection of rare species in Vietnam that face a serious risk of extinction, such as Panthera pardus, Dalbergia cochinchinensis, and Macaca mulatta. The proposed approach exploits the advanced learning ability of convolutional neural networks and Inception residual structures to design a lightweight classification model. We also apply transfer learning to fine-tune two state-of-the-art models, MobileNetV2 and InceptionV3, on our own dataset. Experimental results demonstrate the superiority of our predictor (95.8% accuracy) over the other methods. In addition, the proposed model runs efficiently, with an inference speed of around 113 FPS on a CPU machine, making it suitable for deployment in mobile environments.
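As a rough illustration of the transfer-learning setup the abstract describes, the sketch below fine-tunes an ImageNet-pretrained MobileNetV2 by replacing its classification head with a new one for the species labels. Note that the framework (TensorFlow/Keras), the class count, the input resolution, and all hyperparameters are assumptions for illustration only; the paper's actual training configuration is not given on this page.

# Minimal transfer-learning sketch (assumed framework: TensorFlow/Keras).
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 10        # hypothetical number of animal/plant classes
IMG_SIZE = (224, 224)   # standard MobileNetV2 input resolution

# Load the ImageNet-pretrained backbone without its classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False  # freeze the backbone for the first training phase

# Attach a lightweight classification head for the species labels.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.2),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Phase 1: train only the new head on the custom dataset, e.g.
#   model.fit(train_ds, validation_data=val_ds, epochs=10)
# Optionally, unfreeze the top layers of the backbone afterwards and
# continue training with a lower learning rate to fine-tune them.

A common design choice with this two-phase scheme is to keep the pretrained features intact while the randomly initialized head converges, which tends to stabilize fine-tuning on small datasets.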
References
[1] A. Berg, J. Deng, and L. Fei-Fei, “Large scale visual recognition challenge 2010,” 2010. [Online]. Available: www.imagenet.org/challenges
[2] L. G. Hafemann, L. S. Oliveira, and P. Cavalin, “Forest species recognition using deep convolutional neural networks,” in 2014 22nd International Conference on Pattern Recognition, 2014, pp. 1103–1107. DOI: https://doi.org/10.1109/ICPR.2014.199
[3] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778. DOI: https://doi.org/10.1109/CVPR.2016.90
[4] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, “Improving neural networks by preventing co-adaptation of feature detectors,” arXiv:1207.0580 [cs.NE], 2012.
[5] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv:1704.04861 [cs.CV], 2017.
[6] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5MB model size,” arXiv:1602.07360 [cs.CV], 2016.
[7] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv:1502.03167 [cs.CV], 2015.
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds., vol. 25. Curran Associates, Inc., 2012.
[9] Z. Miao, K. M. Gaynor, J. Wang, Z. Liu, O. Muellerklein, M. S. Norouzzadeh, A. McInturff, R. C. K. Bowie, R. Nathan, S. X. Yu, and W. M. Getz, “Insights and approaches using deep learning to classify wildlife,” Scientific Reports, vol. 9, no. 1, pp. 1–9, 2019. DOI: https://doi.org/10.1038/s41598-019-44565-w
[10] V. Nair and G. E. Hinton, “Rectified linear units improve restricted boltzmann machines,” in Proceedings of the 27th International Conference on International Conference on Machine Learning, ser. ICML’10, 2010, pp. 807–814.
[11] T.-A. Pham, “Semantic convolutional features for face detection,” Machine Vision and Applications, vol. 33, no. 3, pp. 1–18, 2021. DOI: https://doi.org/10.1007/s00138-021-01245-y
[12] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, “Imagenet large scale visual recognition challenge,” International Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211–252, 2015. DOI: https://doi.org/10.1007/s11263-015-0816-y
[13] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “Mobilenetv2: Inverted residuals and linear bottlenecks,” arXiv:1801.04381 [cs.CV], 2019. DOI: https://doi.org/10.1109/CVPR.2018.00474
[14] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv:1409.1556 [cs.CV], 2014.
[15] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1–9. DOI: https://doi.org/10.1109/CVPR.2015.7298594
[16] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, “Inception-v4, inception-resnet and the impact of residual connections on learning,” in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, ser. AAAI’17. AAAI Press, 2017, pp. 4278–4284. DOI: https://doi.org/10.1609/aaai.v31i1.11231
[17] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” arXiv:1512.00567 [cs.CV], 2015. DOI: https://doi.org/10.1109/CVPR.2016.308
[18] M. Willi, R. T. Pitman, A. W. Cardoso, C. Locke, A. Swanson, A. Boyer, M. Veldthuis, and L. Fortson, “Identifying animal species in camera trap images using deep learning and citizen science,” Methods in Ecology and Evolution, vol. 10, no. 1, pp. 80–91, 2019. DOI: https://doi.org/10.1111/2041-210X.13099
[19] S. Zhang, X. Wang, Z. Lei, and S. Z. Li, “Faceboxes: A CPU real-time and accurate unconstrained face detector,” Neurocomputing, vol. 364, pp. 297–309, 2019. DOI: https://doi.org/10.1016/j.neucom.2019.07.064
[20] X. Zhang, X. Zhou, M. Lin, and J. Sun, “Shufflenet: An extremely efficient convolutional neural network for mobile devices,” arXiv:1707.01083 [cs.CV], 2017. DOI: https://doi.org/10.1109/CVPR.2018.00716
License
1. We hereby assign copyright of our article (the Work) in all forms of media, whether now known or hereafter developed, to the Journal of Computer Science and Cybernetics. We understand that the Journal of Computer Science and Cybernetics will act on our behalf to publish, reproduce, distribute and transmit the Work.
2. This assignment of copyright to the Journal of Computer Science and Cybernetics is made on the understanding that permission from the Journal of Computer Science and Cybernetics is not required for us to reproduce, republish or distribute copies of the Work in whole or in part. We will ensure that all such copies carry a notice of copyright ownership and a reference to the original journal publication.
3. We warrant that the Work is our own work, has not been published before in its current or a substantially similar form, is not under consideration by another publication, does not contain any unlawful statements, and does not infringe any existing copyright.
4. We also warrant that we have obtained the necessary permission from the copyright holder(s) to reproduce in the article any materials, including tables, diagrams or photographs, not owned by us.