Abstract
Clean energy from oceans and rivers is becoming a reality with the development of new technologies, such as tidal and in-stream turbines, that generate electricity from naturally flowing water. These technologies are being monitored for effects on fish and other wildlife using underwater video. Automated methods for analyzing underwater video are needed to lower the cost of analysis and improve accuracy. A deep learning model, YOLO, was trained to recognize fish in underwater video using three very different datasets recorded at real-world water power sites. Training and testing with examples from all three datasets yielded a mean average precision (mAP) score of 0.5392. To test how well a model could generalize to new datasets, the model was trained on examples from only two of the datasets and then tested on examples from all three. The resulting model could not recognize fish in the dataset withheld from training, and its mAP scores on the two datasets included in training were higher than those achieved by the model trained on all three datasets. These results indicate that different methods are needed to produce a trained model that can generalize to new datasets, such as those encountered in real-world applications.
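For reference, the reported mAP values presumably follow the standard definition: average precision (AP) is the area under a class's precision–recall curve, and mAP is the mean of AP over the detected classes. The class set $C$ and precision function $p_c(r)$ below are notation introduced here for illustration, not taken from the paper:
\[
\mathrm{AP}_c = \int_0^1 p_c(r)\, dr,
\qquad
\mathrm{mAP} = \frac{1}{|C|} \sum_{c \in C} \mathrm{AP}_c .
\]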