Fishery Modernization


Research on optimization of aquarium fish target detection network

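  1. (1 College of Information, Shanghai Ocean University, Shanghai 201306;
    2 Key Laboratory of Fisheries Remote Sensing, Ministry of Agriculture and Rural Affairs, East China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Shanghai 200090;
    3 Shanghai Junding Fishery Science and Technology Co., Ltd., Shanghai 200090;
    4 School of Economics, Wuhan Textile University, Wuhan, Hubei 430200)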
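  • Publication date: 2022-06-20 Release date: 2022-10-28
  • Corresponding author: Zhang Shengmao (born 1976), male, researcher, Ph.D.; research interests: fishery data mining, fish image analysis, etc. E-mail: ryshengmao@126.com
  • About the author: Liu Yang (born 1996), male, master's student; research interests: fish image analysis, computer vision, etc. E-mail: yangzai126@126.com
  • Funding:
    Key Program of the National Natural Science Foundation of China (61936014); Basic Scientific Research Funds of the Chinese Academy of Fishery Sciences (2020TD82); project of the Fisheries and Fishery Administration Bureau, Ministry of Agriculture and Rural Affairs (17200020)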

Research on optimization of aquarium fish target detection network

  1. (1 College of Information, Shanghai Ocean University, Shanghai 201306, China;
    2 Key Laboratory of Fisheries Remote Sensing, Ministry of Agriculture and Rural Affairs, East China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Shanghai 200090, China;
    3 Shanghai Junding Fishery Science and Technology Co., Ltd., Shanghai 200090, China;
    4 School of Economics, Wuhan Textile University, Wuhan 430200, Hubei, China)
  • Online: 2022-06-20 Published: 2022-10-28

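Abstract: Deep-learning-based target detection and recognition in aquaculture suffers from low-quality datasets, high network computational complexity, and slow inference, which makes it hard to meet the needs of highly real-time application scenarios. This study collected and annotated a dataset of 10,042 images covering 83 species of aquarium fish, and explored ways to optimize the network that reduce its computational complexity and raise its inference speed while preserving its detection and recognition ability. The backbone of the YoloV4 network was redesigned using depthwise separable convolution, and the optimization effects of different data augmentation methods (Mixup, Cutmix, Mosaic) and different activation functions (Mish, Swish, ELU) were compared. Based on the comparison, the best-performing combination of data augmentation method and activation function was selected to optimize the network. The results show that the network optimized in this way reaches a prediction accuracy of 94.37% on the test set, with a computational complexity (BFLOPS) of only 5.47, which is 93.99% lower than that of YoloV4. The study indicates that this optimization method can greatly reduce network computational complexity and raise inference speed while preserving detection and recognition accuracy, and it provides a reference for fish target detection and recognition in highly real-time application scenarios.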
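To make the cost argument concrete, the following is a minimal sketch of how the multiply-accumulate count of a depthwise separable convolution compares with that of a standard convolution, using the textbook cost formulas. The layer shape and all function names are hypothetical and chosen only for illustration; they are not taken from the paper's network. The last helper simply back-calculates the baseline cost implied by the two figures quoted in the abstract (5.47 BFLOPS and a 93.99% reduction), which comes out at about 91 BFLOPS.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a k x k standard convolution producing an h x w map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel) plus a 1 x 1 pointwise conv."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


def implied_baseline(optimized, reduction_percent):
    """Baseline cost implied by an optimized cost and a relative reduction in percent."""
    return optimized / (1.0 - reduction_percent / 100.0)


if __name__ == "__main__":
    # Hypothetical layer shape (assumption for illustration, not from the paper).
    h, w, c_in, c_out, k = 52, 52, 256, 512, 3
    std = standard_conv_macs(h, w, c_in, c_out, k)
    sep = depthwise_separable_macs(h, w, c_in, c_out, k)
    print(f"standard conv:  {std:,} MACs")
    print(f"separable conv: {sep:,} MACs")
    # The ratio simplifies to 1/c_out + 1/k**2, independent of h, w and c_in.
    print(f"cost ratio:     {sep / std:.4f}")
    # Figures quoted in the abstract: 5.47 BFLOPS after a 93.99% reduction.
    print(f"implied YoloV4 baseline: {implied_baseline(5.47, 93.99):.2f} BFLOPS")
```

For a 3 x 3 kernel the ratio is dominated by the 1/9 term, so a single separable layer costs a little over a ninth of its standard counterpart. The larger overall reduction reported in the abstract therefore reflects the redesign of the whole backbone rather than this substitution alone.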

Keywords: target detection, target identification, deep learning, Yolo network, data augmentation, network optimization

Abstract: Target detection and recognition based on deep learning, when applied in aquaculture, suffers from low dataset quality, high network computational complexity, and slow inference speed, which makes it hard to meet the needs of highly real-time application scenarios. This study collected and annotated a dataset of 10,042 images of 83 species of aquarium fish, and then explored network optimization methods that reduce computational complexity and improve inference speed while preserving target detection and recognition capability. The backbone of the YoloV4 network was redesigned using depthwise separable convolution, and the optimization effects of different data augmentation methods (Mixup, Cutmix, and Mosaic) and different activation functions (Mish, Swish, and ELU) were compared. Based on the comparison results, the best combination of data augmentation method and activation function was selected to optimize the network. The results show that the prediction accuracy of the optimized network on the test set reaches 94.37%, and its computational complexity (BFLOPS) is only 5.47, which is 93.99% lower than that of YoloV4. These results indicate that the optimization method in this study can greatly reduce the computational complexity of the network and improve inference speed while preserving detection and recognition accuracy, which provides a reference for fish target detection and recognition in highly real-time application scenarios.
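For readers unfamiliar with the three activation functions named in the abstract, the sketch below writes out their standard definitions in plain Python. It is illustrative only: the helper names and the default values `beta=1.0` and `alpha=1.0` are assumptions, and the abstract does not say which settings or which function the authors finally selected.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x); beta=1.0 is an assumed default."""
    return x * sigmoid(beta * x)


def elu(x, alpha=1.0):
    """ELU: x for x > 0, alpha * (exp(x) - 1) otherwise; alpha=1.0 is an assumed default."""
    return x if x > 0 else alpha * math.expm1(x)


if __name__ == "__main__":
    # Print a small table so the three curves can be compared side by side.
    for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"x={x:+.1f}  mish={mish(x):+.4f}  swish={swish(x):+.4f}  elu={elu(x):+.4f}")
```

All three are smooth around zero and let small negative values through instead of clipping them to zero. Mish and Swish are non-monotonic for negative inputs, whereas ELU saturates towards `-alpha`.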

Key words: target detection, target identification, deep learning, Yolo network, data augmentation, network optimization