Comparison of CSPDarkNet53, CSPResNeXt-50, and EfficientNet-B0 Backbones on YOLO V4 as Object Detector

Marsa Mahasin, Irma Amelia Dewi

Abstract


YOLO v4 has a structure consisting of three parts: backbone, neck, and head. The backbone is the part of the YOLO v4 structure that serves as a feature extractor from the image; it is itself a convolutional neural network and can be replaced with another convolutional neural network. Several backbones have been recommended by previous research, such as CSPDarkNet53, CSPResNeXt-50, and EfficientNet-B0. Therefore, research is needed to determine the effect of different backbones on the YOLO v4 model. One research object that can be used is microfossils. Research on the detection of microfossils is important for helping paleontologists identify microfossil species, which serve as indicators of rock age, and for distinguishing between similar microfossils. In this research, three backbones, CSPDarkNet53, CSPResNeXt-50, and EfficientNet-B0, were used to train on and detect image sets of five species of foraminiferal microfossils. The results were evaluated to determine the advantages of each backbone. Several metrics were used for evaluation, namely precision, recall, F1-score, average precision (AP), mean average precision (mAP), frames per second (FPS), and model size. As a result, the mean average precision (mAP) of the CSPDarkNet53 model reached 83.41%, the highest compared to CSPResNeXt-50 and EfficientNet-B0, which achieved 81.00% and 81.76%, respectively. The CSPResNeXt-50 model has a precision of 75.60%, recall of 81.10%, and F1-score of 78%. The CSPDarkNet53 model also achieved the highest frame rate, at 33.4 FPS. However, the YOLO v4 model with the EfficientNet-B0 backbone is the lightest model, at only 156.8 MB.
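The evaluation metrics named in the abstract reduce to a few simple formulas. As a minimal sketch (not the authors' code; the detection counts and per-class AP values below are hypothetical, chosen only to illustrate the calculations):

```python
# Illustrative computation of the detection metrics used in the paper:
# precision, recall, F1-score, and mAP (mean of per-class APs).

def precision(tp: int, fp: int) -> float:
    """Fraction of predicted detections that are correct."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth objects that were detected."""
    return tp / (tp + fn)

def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Hypothetical counts for one class: true positives, false positives,
# false negatives.
tp, fp, fn = 81, 26, 19
p = precision(tp, fp)
r = recall(tp, fn)          # 81 / (81 + 19) = 0.81
f1 = f1_score(p, r)

# mAP averages the per-class average precision (AP) over all classes;
# here, five placeholder AP values stand in for the five foraminifera
# species.
aps = [0.90, 0.85, 0.80, 0.82, 0.80]
map_score = sum(aps) / len(aps)
```

In practice, AP itself is computed per class from the area under the precision-recall curve at a chosen IoU threshold; the placeholder list above skips that step for brevity.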


Keywords


YOLO, CSPDarkNet53, CSPResNeXt-50, EfficientNet-B0, Microfossil






DOI: https://doi.org/10.52088/ijesty.v2i3.291



Copyright (c) 2022 Marsa Mahasin, Irma Amelia Dewi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Engineering, Science and Information Technology (IJESTY) eISSN 2775-2674