A detection model, LSD-YOLO, based on an improved YOLOv11n is proposed to address the difficulty of accurately tracking fish in underwater aquaculture, where occlusion and image degradation are common. First, DynamicHead is introduced to give the model the ability to fuse task awareness, scale awareness, and spatial awareness. Second, a lightweight feature extraction module, LiteODSE, is designed that combines dynamic convolution and channel attention to strengthen feature extraction in the backbone network. Third, the SDI multilevel feature fusion module is introduced to separate and fuse multi-scale spatial information. Fourth, the GIoU loss function replaces the CIoU loss function; by adding a constraint based on the smallest box enclosing both the predicted and ground-truth boxes, it improves localization of small targets and of boxes that do not overlap. Finally, the detector is combined with StrongSORT, an advanced multi-object tracking algorithm, to improve tracking accuracy. Experiments show that, compared with YOLOv11n, the proposed model improves accuracy by 3.2% and mAP50 by 3%. Compared with YOLOv11n+StrongSORT, MOTA improves by 5.2% and the number of ID switches is reduced by 30%, indicating that the improved method is better suited to detecting and tracking underwater farmed fish.
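To illustrate why a GIoU-style loss still gives a useful signal when the predicted and ground-truth boxes do not overlap, the following is a minimal sketch of the standard GIoU definition for two axis-aligned boxes. It is not the authors' implementation; the function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero width or height if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss form (hypothetical helper): 0 for identical boxes, approaching 2 for distant boxes."""
    return 1.0 - giou(box_a, box_b)


if __name__ == "__main__":
    target = (0.0, 0.0, 1.0, 1.0)
    near = (2.0, 0.0, 3.0, 1.0)   # no overlap, gap of 1
    far = (4.0, 0.0, 5.0, 1.0)    # no overlap, gap of 3
    print(giou_loss(target, near), giou_loss(target, far))
```

In this sketch, both non-overlapping boxes have an IoU of zero, yet the loss is larger for the more distant box, so the prediction is still pushed toward the target.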