Abstract: Can we detect an object that is not visible in an image? This study introduces the novel task of 2D and 3D unobserved object detection: predicting the location of objects that are occluded or lie outside the image frame. We adapt several state-of-the-art pre-trained generative models to this task, including 2D and 3D diffusion models and vision-language models, and show that they can be used to infer the presence of objects that are not directly observed. To benchmark the task, we propose a suite of metrics that captures different aspects of performance. Our empirical evaluations on indoor scenes from the RealEstate10k dataset with COCO object categories yield results that motivate the use of generative models for unobserved object detection. This work is a promising step towards compelling applications, such as visual search and probabilistic planning, that can leverage object detection beyond what is directly observed.
Abstract: In this paper, we analyse the performance of the closed-loop Whiplash gradient descent algorithm for L-smooth convex cost functions. Using numerical experiments, we study the algorithm's performance on convex cost functions with different condition numbers. We analyse the convergence of the momentum sequence using symplectic integration and introduce the concept of relaxation sequences, which we use to analyse the non-classical character of the Whiplash method. Under the additional assumption of invexity, we establish a momentum-driven adaptive convergence rate. Furthermore, we introduce an energy method for predicting the convergence rate of closed-loop inertial gradient dynamics with convex cost functions. The method uses an integral anchored energy function and a novel lower-bound asymptotic notation, and exploits the bounded nature of the solutions. Using this method, we establish a polynomial convergence rate for the Whiplash inertial gradient system for a family of scalar quadratic cost functions, and an exponential rate for a quadratic scalar cost function.