Abstract:Cell tracking is an omnipresent image analysis task in live-cell microscopy. It is similar to multiple object tracking (MOT), however, each frame contains hundreds of similar-looking objects that can divide, making it a challenging problem. Current state-of-the-art approaches follow the tracking-by-detection paradigm, i.e. first all cells are detected per frame and successively linked in a second step to form biologically consistent cell tracks. Linking is commonly solved via discrete optimization methods, which require manual tuning of hyperparameters for each dataset and are therefore cumbersome to use in practice. Here we propose Trackastra, a general purpose cell tracking approach that uses a simple transformer architecture to directly learn pairwise associations of cells within a temporal window from annotated data. Importantly, unlike existing transformer-based MOT pipelines, our learning architecture also accounts for dividing objects such as cells and allows for accurate tracking even with simple greedy linking, thus making strides towards removing the requirement for a complex linking step. The proposed architecture operates on the full spatio-temporal context of detections within a time window by avoiding the computational burden of processing dense images. We show that our tracking approach performs on par with or better than highly tuned state-of-the-art cell tracking algorithms for various biological datasets, such as bacteria, cell cultures and fluorescent particles. We provide code at https://github.com/weigertlab/trackastra.
Abstract:State-of-the-art object detection and segmentation methods for microscopy images rely on supervised machine learning, which requires laborious manual annotation of training data. Here we present a self-supervised method based on time arrow prediction pre-training that learns dense image representations from raw, unlabeled live-cell microscopy videos. Our method builds upon the task of predicting the correct order of time-flipped image regions via a single-image feature extractor and a subsequent time arrow prediction head. We show that the resulting dense representations capture inherently time-asymmetric biological processes such as cell divisions on a pixel-level. We furthermore demonstrate the utility of these representations on several live-cell microscopy datasets for detection and segmentation of dividing cells, as well as for cell state classification. Our method outperforms supervised methods, particularly when only limited ground truth annotations are available as is commonly the case in practice. We provide code at https://github.com/weigertlab/tarrow.