Abstract:We provide finite-particle convergence rates for the Stein Variational Gradient Descent (SVGD) algorithm in the Kernel Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics. Our key insight is the observation that the time derivative of the relative entropy between the joint density of $N$ particle locations and the $N$-fold product target measure, starting from a regular initial distribution, splits into a dominant `negative part' proportional to $N$ times the expected $\mathsf{KSD}^2$ and a smaller `positive part'. This observation leads to $\mathsf{KSD}$ rates of order $1/\sqrt{N}$, providing a near optimal double exponential improvement over the recent result by~\cite{shi2024finite}. Under mild assumptions on the kernel and potential, these bounds also grow linearly in the dimension $d$. By adding a bilinear component to the kernel, the above approach is used to further obtain Wasserstein-2 convergence. For the case of `bilinear + Mat\'ern' kernels, we derive Wasserstein-2 rates that exhibit a curse-of-dimensionality similar to the i.i.d. setting. We also obtain marginal convergence and long-time propagation of chaos results for the time-averaged particle laws.
Abstract:In this paper, we propose a novel framework for multi-image co-segmentation using class agnostic meta-learning strategy by generalizing to new classes given only a small number of training samples for each new class. We have developed a novel encoder-decoder network termed as DVICE (Directed Variational Inference Cross Encoder), which learns a continuous embedding space to ensure better similarity learning. We employ a combination of the proposed DVICE network and a novel few-shot learning approach to tackle the small sample size problem encountered in co-segmentation with small datasets like iCoseg and MSRC. Furthermore, the proposed framework does not use any semantic class labels and is entirely class agnostic. Through exhaustive experimentation over multiple datasets using only a small volume of training data, we have demonstrated that our approach outperforms all existing state-of-the-art techniques.
Abstract:Text detection in scenes based on deep neural networks have shown promising results. Instead of using word bounding box regression, recent state-of-the-art methods have started focusing on character bounding box and pixel-level prediction. This necessitates the need to link adjacent characters, which we propose in this paper using a novel Graph Neural Network (GNN) architecture that allows us to learn both node and edge features as opposed to only the node features under the typical GNN. The main advantage of using GNN for link prediction lies in its ability to connect characters which are spatially separated and have an arbitrary orientation. We show our concept on the well known SynthText dataset, achieving top results as compared to state-of-the-art methods.
Abstract:In recent years, fingerprint recognition systems have made remarkable advancements in the field of biometric security as it plays an important role in personal, national and global security. In spite of all these notable advancements, the fingerprint recognition technology is still susceptible to spoof attacks which can significantly jeopardize the user security. The cross sensor and cross material spoof detection still pose a challenge with a myriad of spoof materials emerging every day, compromising sensor interoperability and robustness. This paper proposes a novel method for fingerprint spoof detection using both global and local fingerprint feature descriptors. These descriptors are extracted using DenseNet which significantly improves cross-sensor, cross-material and cross-dataset performance. A novel patch attention network is used for finding the most discriminative patches and also for network fusion. We evaluate our method on four publicly available datasets:LivDet 2011, 2013, 2015 and 2017. A set of comprehensive experiments are carried out to evaluate cross-sensor, cross-material and cross-dataset performance over these datasets. The proposed approach achieves an average accuracy of 99.52%, 99.16% and 99.72% on LivDet 2017,2015 and 2011 respectively outperforming the current state-of-the-art results by 3% and 4% for LivDet 2015 and 2011 respectively.