Patrick Takenaka

ViPro-2: Unsupervised State Estimation via Integrated Dynamics for Guiding Video Prediction

Aug 08, 2025

Patrick Takenaka, Johannes Maucher, Marco F. Huber

Abstract: Predicting future video frames is a challenging task with many downstream applications. Previous work has shown that procedural knowledge enables deep models to handle complex dynamical settings; however, the model proposed there, ViPro, assumed a given ground truth initial symbolic state. We show that this approach led the model to learn a shortcut that does not actually connect the observed environment with the predicted symbolic state, leaving it unable to estimate states from an observation when previous states are noisy. In this work, we add several improvements to ViPro that enable the model to correctly infer states from observations without being given a full ground truth state at the beginning. We show that this is possible in an unsupervised manner, and we extend the original Orbits dataset with a 3D variant to close the gap to real-world scenarios.

* Published in 2025 International Joint Conference on Neural Networks (IJCNN)


Anonymization of Documents for Law Enforcement with Machine Learning

Jan 13, 2025

Manuel Eberhardinger, Patrick Takenaka, Daniel Grießhaber, Johannes Maucher


Abstract: The steadily increasing use of data-driven methods in areas that handle sensitive personal information, such as law enforcement, requires ever greater effort from these institutions to comply with data protection guidelines. In this work, we present a system for automatically anonymizing images of scanned documents, reducing manual effort while ensuring data protection compliance. Our method preserves the viability of further forensic processing after anonymization: it minimizes the automatically redacted areas by combining automatic detection of sensitive regions with knowledge from a manually anonymized reference document. Using a self-supervised image model for instance retrieval of the reference document, our approach requires only one anonymized example to efficiently redact all documents of the same type, significantly reducing processing time. On a hand-crafted dataset of ground truth redactions, we show that our approach outperforms both a purely automatic redaction system and a naive scheme that copies the reference anonymization onto other documents.

* Accepted at IEEE Symposium on CI in Security, Defence and Biometrics 2025 (IEEE CISDB)


Guiding Video Prediction with Explicit Procedural Knowledge

Jun 26, 2024

Patrick Takenaka, Johannes Maucher, Marco F. Huber


Abstract: We propose a general way to integrate procedural knowledge of a domain into deep learning models. We apply it to video prediction, building on top of object-centric deep models, and show that this leads to better performance than using data-driven models alone. We develop an architecture that facilitates latent space disentanglement in order to use the integrated procedural knowledge, and we establish a setup that allows the model to learn the procedural interface in the latent space through the downstream task of video prediction. We compare the performance against a state-of-the-art data-driven approach and show that problems on which purely data-driven approaches struggle can be handled by using knowledge about the domain, providing an alternative to simply collecting more data.

* 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Paris, France, 2023, pp. 1076-1084
