Abstract: Deep learning methods for point tracking are applicable to 2D echocardiography but do not yet take advantage of domain specifics that would enable extremely fast and efficient configurations. We developed MyoTracker, a low-complexity architecture (0.3M parameters) for point tracking in echocardiography. It builds on the CoTracker2 architecture, simplifying its components and extending the temporal context to provide point predictions for the entire sequence in a single step. We applied MyoTracker to the right ventricular (RV) myocardium in RV-focused recordings and compared the results with those of CoTracker2 and EchoTracker, another point tracking architecture specialized for echocardiography. MyoTracker achieved the lowest average point trajectory error at 2.00 $\pm$ 0.53 mm. Calculating RV Free Wall Strain (RV FWS) from MyoTracker's point predictions resulted in a -0.3$\%$ bias with 95$\%$ limits of agreement from -6.1$\%$ to 5.4$\%$ compared to reference values from commercial software. This range falls within the interobserver variability reported in previous studies. The limits of agreement were wider for both CoTracker2 and EchoTracker, exceeding the reported interobserver variability. At inference, MyoTracker used 67$\%$ less GPU memory than CoTracker2 and 84$\%$ less than EchoTracker on long sequences (100 frames), and with our setup it was 74 times faster than CoTracker2 and 11 times faster than EchoTracker. Maintaining the entire sequence in the temporal context was the greatest contributor to MyoTracker's accuracy; slight additional gains can be made by re-enabling iterative refinement, at the cost of longer processing time.
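The central idea, keeping every frame in the temporal context so that trajectories for all queried points come out of a single forward pass with no iterative refinement, can be illustrated with a minimal PyTorch-style sketch. This is not the authors' MyoTracker code; the module names, layer sizes, and feature-sampling scheme below are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class SinglePassTracker(nn.Module):
    def __init__(self, feat_dim=64, num_heads=4):
        super().__init__()
        # Small per-frame feature extractor, kept tiny in the spirit of
        # the ~0.3M-parameter budget quoted in the abstract.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Self-attention over the entire sequence at once: every frame is
        # in the temporal context, so no sliding window is needed.
        self.temporal = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(feat_dim, num_heads,
                                       dim_feedforward=128, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(feat_dim, 2)  # per-frame (x, y) offset

    def forward(self, frames, query_points):
        # frames: (B, T, 1, H, W); query_points: (B, N, 2), coords in [-1, 1]
        B, T, C, H, W = frames.shape
        N = query_points.shape[1]
        feats = self.encoder(frames.reshape(B * T, C, H, W))
        # Sample each query point's feature in every frame.
        grid = query_points[:, None].expand(B, T, N, 2).reshape(B * T, N, 1, 2)
        tokens = torch.nn.functional.grid_sample(
            feats, grid, align_corners=False
        ).squeeze(-1).transpose(1, 2)                         # (B*T, N, feat)
        tokens = tokens.reshape(B, T, N, -1).permute(0, 2, 1, 3)
        tokens = self.temporal(tokens.reshape(B * N, T, -1))  # attend over all T frames
        offsets = self.head(tokens).reshape(B, N, T, 2)
        # One pass yields the whole-sequence trajectory per queried point.
        return query_points[:, :, None, :] + offsets          # (B, N, T, 2)
```

A model of this size stays well under 0.1M parameters, which is consistent with (though not identical to) the complexity class the abstract describes.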
Abstract: Tissue tracking in echocardiography is challenging due to the complex cardiac motion and the inherent nature of ultrasound acquisitions. Although optical flow methods are considered state-of-the-art (SOTA), they struggle with long-range tracking, noise, occlusions, and drift throughout the cardiac cycle. Recently, novel learning-based point tracking techniques have been introduced to tackle some of these issues. In this paper, we build upon these techniques and introduce EchoTracker, a two-fold coarse-to-fine model that facilitates the tracking of queried points on a tissue surface across ultrasound image sequences. The architecture comprises a preliminary coarse initialization of the trajectories, followed by refinement iterations based on fine-grained appearance changes. It is efficient, lightweight, and runs on mid-range GPUs. Experiments demonstrate that the model outperforms SOTA methods, with an average position accuracy of 67% and a median trajectory error of 2.86 pixels. Furthermore, we show a relative improvement of 25% when using our model to calculate global longitudinal strain (GLS) on a clinical test-retest dataset compared with other methods. This implies that learning-based point tracking can potentially improve performance and yield a higher diagnostic and prognostic value for clinical measurements than current techniques. Our source code is available at: https://github.com/riponazad/echotracker/.
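As a concrete illustration of the downstream strain use case, the sketch below derives a longitudinal strain curve from tracked contour points using the standard Lagrangian strain definition, GLS(t) = (L(t) - L(0)) / L(0) × 100, where L is the arc length through the tracked points. The frame-0 (end-diastolic) reference and the point ordering are simplifying assumptions, not the paper's exact pipeline.

```python
import numpy as np

def contour_length(points):
    # points: (N, 2) point positions ordered along the myocardial wall
    return np.sum(np.linalg.norm(np.diff(points, axis=0), axis=-1))

def strain_curve(tracks):
    # tracks: (T, N, 2) tracked positions per frame; frame 0 is assumed
    # to be the reference (end-diastolic) configuration of length L0.
    L0 = contour_length(tracks[0])
    return np.array([100.0 * (contour_length(p) - L0) / L0 for p in tracks])

# Usage with synthetic trajectories of a wall that shortens by up to 10%:
# strain is negative when the wall contracts, so GLS is the curve minimum.
tracks = np.stack([np.linspace([0, 0], [100 - 10 * np.sin(np.pi * t / 29), 0], 20)
                   for t in range(30)])                      # (30, 20, 2)
print("peak strain: %.1f%%" % strain_curve(tracks).min())    # about -10%
```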
Abstract: Today, ship hull inspection, which includes examining the external coating and detecting defects and other types of external degradation such as corrosion and marine growth, is conducted underwater by means of Remotely Operated Vehicles (ROVs). The inspection process consists of manual video analysis, which is time-consuming and labor-intensive. To address this, we propose an automatic video analysis system using deep learning and computer vision, improving upon existing methods for underwater ship hull video inspection that only consider spatial information in individual frames. By exploring the benefits of adding temporal information and analyzing frame-based classifiers, we propose a multi-label video classification model that exploits the self-attention mechanism of transformers to capture spatiotemporal attention across consecutive video frames. Our proposed method demonstrates promising results and can serve as a benchmark for future research and development in underwater video inspection applications.
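The general pattern, per-frame CNN features mixed by transformer self-attention over consecutive frames and trained with independent per-class sigmoids for multi-label output, can be sketched as follows. The backbone, dimensions, and number of defect labels are assumptions, and the paper's spatiotemporal variant presumably also attends over spatial tokens rather than only pooled frame features.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class VideoDefectClassifier(nn.Module):
    def __init__(self, num_labels=4, dim=512, num_frames=16):
        super().__init__()
        backbone = resnet18(weights=None)
        # Drop the classification head; keep global-pooled frame features.
        self.frame_encoder = nn.Sequential(*list(backbone.children())[:-1])
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.pos = nn.Parameter(torch.zeros(1, num_frames, dim))  # learned frame positions
        self.head = nn.Linear(dim, num_labels)  # one logit per defect type

    def forward(self, clip):
        # clip: (B, T, 3, H, W) consecutive inspection frames
        B, T = clip.shape[:2]
        f = self.frame_encoder(clip.flatten(0, 1)).flatten(1)  # (B*T, 512)
        f = f.reshape(B, T, -1) + self.pos[:, :T]
        f = self.temporal(f)          # self-attention across frames
        return self.head(f.mean(dim=1))  # multi-label logits

# Multi-label training treats each class independently via sigmoid + BCE,
# since a frame sequence can show e.g. both corrosion and marine growth.
model = VideoDefectClassifier()
logits = model(torch.randn(2, 16, 3, 224, 224))
loss = nn.BCEWithLogitsLoss()(logits, torch.randint(0, 2, (2, 4)).float())
```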