Justin Kay

Counting Fish with Temporal Representations of Sonar Video

Feb 07, 2025

Kai Van Brunt, Justin Kay, Timm Haucke, Pietro Perona, Grant Van Horn, Sara Beery

Abstract: Accurate estimates of salmon escapement (the number of fish migrating upstream to spawn) are key data for conservation and fishery management. Existing methods for salmon counting using high-resolution imaging sonar hardware are non-invasive and compatible with computer vision processing. Prior work in this area has used methods based on object detection and tracking for automated salmon counting. However, these techniques remain inaccessible to many sonar deployment sites due to limited compute and connectivity in the field. We propose an alternative lightweight computer vision method for fish counting based on analyzing echograms: temporal representations that compress several hundred frames of imaging sonar video into a single image. We predict upstream and downstream counts within 200-frame time windows directly from echograms using a ResNet-18 model, and propose a set of domain-specific image augmentations and a weakly-supervised training protocol to further improve results. We achieve a count error of 23% on representative data from the Kenai River in Alaska, demonstrating the feasibility of our approach.

* ECCV 2024. 6 pages, 2 figures
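
The echogram idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how frames are compressed, so the choices here are assumptions made for illustration only: each frame is collapsed to one column by taking the maximum intensity across the beam axis, and the clip is cut into non-overlapping 200-frame windows. The function names `make_echogram` and `split_windows` are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def make_echogram(frames: np.ndarray) -> np.ndarray:
    """Collapse a (T, H, W) sonar clip into an (H, T) image.

    Assumption (not from the paper): each frame becomes one column by
    taking the maximum over the last axis, treated here as beam angle.
    """
    if frames.ndim != 3:
        raise ValueError("expected an array of shape (T, H, W)")
    # max over W gives (T, H); transpose so that time runs left to right
    return frames.max(axis=2).T


def split_windows(frames: np.ndarray, window: int = 200) -> list:
    """Cut a clip into consecutive non-overlapping windows.

    A trailing partial window is dropped (an illustrative choice).
    """
    n_windows = frames.shape[0] // window
    return [frames[i * window:(i + 1) * window] for i in range(n_windows)]


# Synthetic stand-in for sonar video: 450 frames of 64 x 32 pixels.
rng = np.random.default_rng(0)
clip = rng.random((450, 64, 32), dtype=np.float32)

echograms = [make_echogram(w) for w in split_windows(clip)]
```

Under these assumptions each echogram is a single 64 x 200 image, which is the kind of input an image classifier or regressor such as ResNet-18 could consume in place of the full video.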

Align and Distill: Unifying and Improving Domain Adaptive Object Detection

Mar 18, 2024

Justin Kay, Timm Haucke, Suzanne Stathatos, Siqi Deng, Erik Young, Pietro Perona, Sara Beery, Grant Van Horn

Abstract: Object detectors often perform poorly on data that differs from their training set. Domain adaptive object detection (DAOD) methods have recently demonstrated strong results in addressing this challenge. Unfortunately, we identify systemic benchmarking pitfalls that call past results into question and hamper further progress: (a) Overestimation of performance due to underpowered baselines, (b) Inconsistent implementation practices preventing transparent comparisons of methods, and (c) Lack of generality due to outdated backbones and lack of diversity in benchmarks. We address these problems by introducing: (1) A unified benchmarking and implementation framework, Align and Distill (ALDI), enabling comparison of DAOD methods and supporting future development, (2) A fair and modern training and evaluation protocol for DAOD that addresses benchmarking pitfalls, (3) A new DAOD benchmark dataset, CFC-DAOD, enabling evaluation on diverse real-world data, and (4) A new method, ALDI++, that achieves state-of-the-art results by a large margin. ALDI++ outperforms the previous state-of-the-art by +3.5 AP50 on Cityscapes to Foggy Cityscapes, +5.7 AP50 on Sim10k to Cityscapes (where ours is the only method to outperform a fair baseline), and +2.0 AP50 on CFC Kenai to Channel. Our framework, dataset, and state-of-the-art method offer a critical reset for DAOD and provide a strong foundation for future research. Code and data are available: https://github.com/justinkay/aldi and https://github.com/visipedia/caltech-fish-counting.

* 30 pages, 10 figures
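
The AP50 figures quoted in the abstract above rest on an intersection-over-union (IoU) threshold of 0.5: in the usual convention, a predicted box can only count as a correct detection if its IoU with a ground-truth box reaches that threshold. The sketch below shows only this matching criterion, not the full average-precision computation and not the ALDI evaluation code. The helper names `box_iou` and `is_match_at_50` are hypothetical, and boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """True if the prediction overlaps the ground truth enough for AP50."""
    return box_iou(pred, gt) >= threshold
```

For example, a prediction `(0, 0, 10, 8)` against a ground-truth box `(0, 0, 10, 10)` has an IoU of 0.8 and would qualify as a match, while `(5, 0, 15, 10)` against the same box has an IoU of one third and would not.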

Teaching Computer Vision for Ecology

Jan 05, 2023

Elijah Cole, Suzanne Stathatos, Björn Lütjens, Tarun Sharma, Justin Kay, Jason Parham, Benjamin Kellenberger, Sara Beery

Abstract: Computer vision can accelerate ecology research by automating the analysis of raw imagery from sensors like camera traps, drones, and satellites. However, computer vision is an emerging discipline that is rarely taught to ecologists. This work discusses our experience teaching a diverse group of ecologists to prototype and evaluate computer vision systems in the context of an intensive hands-on summer workshop. We explain the workshop structure, discuss common challenges, and propose best practices. This document is intended for computer scientists who teach computer vision across disciplines, but it may also be useful to ecologists or other domain experts who are learning to use computer vision themselves.

The Caltech Fish Counting Dataset: A Benchmark for Multiple-Object Tracking and Counting

Jul 19, 2022

Justin Kay, Peter Kulits, Suzanne Stathatos, Siqi Deng, Erik Young, Sara Beery, Grant Van Horn, Pietro Perona

Abstract: We present the Caltech Fish Counting Dataset (CFC), a large-scale dataset for detecting, tracking, and counting fish in sonar videos. We identify sonar videos as a rich source of data for advancing low signal-to-noise computer vision applications and tackling domain generalization in multiple-object tracking (MOT) and counting. In comparison to existing MOT and counting datasets, which are largely restricted to videos of people and vehicles in cities, CFC is sourced from a natural-world domain where targets are not easily resolvable and appearance features cannot be easily leveraged for target re-identification. With over half a million annotations in over 1,500 videos sourced from seven different sonar cameras, CFC allows researchers to train MOT and counting algorithms and evaluate generalization performance at unseen test locations. We perform extensive baseline experiments and identify key challenges and opportunities for advancing the state of the art in generalization in MOT and counting.

* ECCV 2022. 33 pages, 12 figures

Fine-Grained Counting with Crowd-Sourced Supervision

May 30, 2022

Justin Kay, Catherine M. Foley, Tom Hart

Abstract: Crowd-sourcing is an increasingly popular tool for image analysis in animal ecology. Computer vision methods that can utilize crowd-sourced annotations can help scale up analysis further. In this work we study the potential to do so on the challenging task of fine-grained counting. As opposed to the standard crowd counting task, fine-grained counting also involves classifying attributes of individuals in dense crowds. To enable this study, we introduce a new dataset from animal ecology that contains 1.7M crowd-sourced annotations of 8 fine-grained classes. It is the largest available dataset for fine-grained counting and the first to enable the study of the task with crowd-sourced annotations. We introduce methods for generating aggregate "ground truths" from the collected annotations, as well as a counting method that can utilize the aggregate information. Our method improves results by 8% over a comparable baseline, indicating the potential for algorithms to learn fine-grained counting using crowd-sourced supervision.

* In Computer Vision for Animal Behavior Tracking and Modeling Workshop at CVPR 2022. 4 pages, 3 figures

The Fishnet Open Images Database: A Dataset for Fish Detection and Fine-Grained Categorization in Fisheries

Jun 16, 2021

Justin Kay, Matt Merrifield

Abstract: Camera-based electronic monitoring (EM) systems are increasingly being deployed onboard commercial fishing vessels to collect essential data for fisheries management and regulation. These systems generate large quantities of video data that must be reviewed on land by human experts. Computer vision can assist this process by automatically detecting and classifying fish species; however, the lack of existing public data in this domain has hindered progress. To address this, we present the Fishnet Open Images Database, a large dataset of EM imagery for fish detection and fine-grained categorization onboard commercial fishing vessels. The dataset consists of 86,029 images containing 34 object classes, making it the largest and most diverse public dataset of fisheries EM imagery to date. It includes many of the characteristic challenges of EM data: visual similarity between species, skewed class distributions, harsh weather conditions, and chaotic crew activity. We evaluate the performance of existing detection and classification algorithms and demonstrate that the dataset can serve as a challenging benchmark for development of computer vision algorithms in fisheries. The dataset is available at https://www.fishnet.ai/.

* In 8th Workshop on Fine-Grained Visual Categorization at CVPR 2021
