Abstract:Despite great progress, text-driven long video editing is still notoriously challenging mainly due to excessive memory overhead. Although recent efforts have simplified this task into a two-step process of keyframe translation and interpolation generation, the token-wise keyframe translation still plagues the upper limit of video length. In this paper, we propose a novel and training-free approach towards efficient and effective long video editing, termed AdaFlow. We first reveal that not all tokens of video frames hold equal importance for keyframe translation, based on which we propose an Adaptive Attention Slimming scheme for AdaFlow to squeeze the $KV$ sequence, thus increasing the number of keyframes for translations by an order of magnitude. In addition, an Adaptive Keyframe Selection scheme is also equipped to select the representative frames for joint editing, further improving generation quality. With these innovative designs, AdaFlow achieves high-quality long video editing of minutes in one inference, i.e., more than 1$k$ frames on one A800 GPU, which is about ten times longer than the compared methods, e.g., TokenFlow. To validate AdaFlow, we also build a new benchmark for long video editing with high-quality annotations, termed LongV-EVAL. Our code is released at: https://github.com/jidantang55/AdaFlow.
Abstract:Data augmentation is a promising technique for unsupervised anomaly detection in industrial applications, where the availability of positive samples is often limited due to factors such as commercial competition and sample collection difficulties. In this paper, how to effectively select and apply data augmentation methods for unsupervised anomaly detection is studied. The impact of various data augmentation methods on different anomaly detection algorithms is systematically investigated through experiments. The experimental results show that the performance of different industrial image anomaly detection (termed as IAD) algorithms is not significantly affected by the specific data augmentation method employed and that combining multiple data augmentation methods does not necessarily yield further improvements in the accuracy of anomaly detection, although it can achieve excellent results on specific methods. These findings provide useful guidance on selecting appropriate data augmentation methods for different requirements in IAD.
Abstract:Retrospectively gated cine (retro-cine) MRI is the clinical standard for cardiac functional analysis. Deep learning (DL) based methods have been proposed for the reconstruction of highly undersampled MRI data and show superior image quality and magnitude faster reconstruction time than CS-based methods. Nevertheless, it remains unclear whether DL reconstruction is suitable for cardiac function analysis. To address this question, in this study we evaluate and compare the cardiac functional values (EDV, ESV and EF for LV and RV, respectively) obtained from highly accelerated MRI acquisition using DL based reconstruction algorithm (DL-cine) with values from CS-cine and conventional retro-cine. To the best of our knowledge, this is the first work to evaluate the cine MRI with deep learning reconstruction for cardiac function analysis and compare it with other conventional methods. The cardiac functional values obtained from cine MRI with deep learning reconstruction are consistent with values from clinical standard retro-cine MRI.