Abstract: High-frequency sonar systems deployed on a broad variety of ocean observing platforms are generating a deluge of water column sonar data at unprecedented speed from all corners of the ocean. Efficient and integrative analysis of these data, either across different sonar instruments or with other oceanographic datasets, holds the key to monitoring and understanding the response of marine organisms to rapidly changing environments. In this paper we present echopype, an open-source Python software library designed to address this need. By standardizing water column sonar data from diverse instrument sources following a community convention and using the widely embraced netCDF data model to encode sonar data as labeled, multi-dimensional arrays, echopype facilitates intuitive, user-friendly exploration and use of sonar data in an instrument-agnostic manner. By leveraging existing open-source Python libraries optimized for distributed computing, echopype directly enables computational interoperability and scalability in both local and cloud computing environments. Echopype's modularized package structure further provides a conceptually unified implementation framework for expanding its support for additional instrument raw data formats and incorporating new data analysis and visualization functionalities. We envision the continued development of echopype as a catalyst for making information derived from water column sonar data an integrated component of regional and global ocean observation strategies.
Abstract: Echosounders are high-frequency sonar systems widely used to observe mid-trophic level animals in the ocean. The recent deluge of echosounder data from diverse ocean observing platforms has created unprecedented opportunities to study marine ecosystems at broad scales. However, there is a critical lack of methods capable of automatic and adaptive extraction of ecologically relevant spatio-temporal structures from echosounder observations, which limits effective and wider use of these rich datasets in marine ecological research. Here we present a data-driven methodology based on matrix decomposition that builds a compact representation of long-term echosounder time series using intrinsic features in the data, and demonstrate its utility by analyzing an example multi-frequency dataset from the northeast Pacific Ocean. We show that Principal Component Pursuit (PCP) successfully removes noise interference from the data, and that a temporally smooth Nonnegative Matrix Factorization (tsNMF) automatically discovers a small number of distinct daily echogram patterns, whose time-varying linear combination (activation) reconstructs the dominant structures in the original time series. This low-rank representation is more tractable and interpretable than the original time series. It is also suitable for visualization and systematic analysis with other ocean variables such as currents. Unlike existing echo analysis methods that rely on fixed, handcrafted rules, our methodology is data-driven and therefore adaptable, making it well-suited for analyzing data collected from unfamiliar ecosystems or from ecosystems undergoing rapid changes under a changing climate. Future developments and applications based on this work will catalyze advancements in marine ecology by providing robust time series analytics for large-scale, acoustics-based biological observation in the ocean.
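The low-rank idea in the second abstract (a few patterns whose time-varying linear combination reconstructs the data) can be illustrated with a minimal sketch. This is not the tsNMF described above: it is plain NMF with the standard multiplicative updates for the Frobenius loss, and it omits both the temporal smoothness term and the PCP denoising step. The function name `nmf_multiplicative`, the matrix layout (one column per day, one row per vectorized echogram pixel), and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def nmf_multiplicative(V, n_components, n_iter=1000, eps=1e-9, seed=0):
    """Plain NMF, V ~ W @ H, via multiplicative updates (Frobenius loss).

    Hypothetical helper for illustration; NOT the temporally smooth
    variant (tsNMF) named in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_features, n_days = V.shape
    # Strictly positive random initialization keeps every update well defined.
    W = rng.random((n_features, n_components)) + eps
    H = rng.random((n_components, n_days)) + eps
    for _ in range(n_iter):
        # Each update rescales the factor elementwise, so nonnegativity
        # is preserved as long as V itself is nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for an echogram time series (assumed layout):
# 60 "pixels" per daily echogram, 40 days, built from 3 nonnegative patterns.
rng = np.random.default_rng(42)
true_patterns = rng.random((60, 3))
true_activations = rng.random((3, 40))
V = true_patterns @ true_activations

# Columns of W play the role of daily patterns; rows of H are their
# day-by-day activations.
W, H = nmf_multiplicative(V, n_components=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this sketch each column of `W` is one pattern and each row of `H` is its activation over days, so `W @ H` is the low-rank reconstruction. A temporally smooth variant would add a penalty on day-to-day changes in `H`, which is not attempted here.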