Dena Bazazian

Hyperbolic Metric Learning for Visual Outlier Detection

Mar 22, 2024

Alvaro Gonzalez-Jimenez, Simone Lionetti, Dena Bazazian, Philippe Gottfrois, Fabian Gröger, Marc Pouly, Alexander Navarini

Abstract: Out-Of-Distribution (OOD) detection is critical for deploying deep learning models in safety-critical applications. However, the inherent hierarchical concept structure of visual data, which is instrumental to OOD detection, is often poorly captured by conventional methods based on Euclidean geometry. This work proposes a metric framework that leverages the strengths of Hyperbolic geometry for OOD detection. Inspired by previous works that refine the decision boundary for OOD data with synthetic outliers, we extend this method to Hyperbolic space. Interestingly, we find that synthetic outliers do not benefit OOD detection in Hyperbolic space as they do in Euclidean space. Furthermore, we explore the relationship between OOD detection performance and Hyperbolic embedding dimension, addressing practical concerns in resource-constrained environments. Extensive experiments show that our framework improves the FPR95 for OOD detection from 22% to 15% on CIFAR-10 and from 49% to 28% on CIFAR-100, compared to Euclidean methods.
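The FPR95 figure quoted in this abstract is commonly defined as the fraction of OOD samples wrongly accepted as in-distribution at the score threshold that accepts 95% of in-distribution samples. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `fpr_at_95_tpr` and the convention that a higher score means "more in-distribution" are assumptions made for illustration.

```python
import numpy as np


def fpr_at_95_tpr(id_scores, ood_scores):
    """False positive rate on OOD data at 95% true positive rate on ID data.

    Assumption: higher score = more likely in-distribution.
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold below which only 5% of in-distribution scores fall,
    # so roughly 95% of in-distribution samples are accepted.
    threshold = np.percentile(id_scores, 5)
    # Fraction of OOD samples that are (wrongly) accepted.
    return float(np.mean(ood_scores >= threshold))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    id_scores = rng.normal(loc=2.0, scale=1.0, size=1000)   # synthetic ID scores
    ood_scores = rng.normal(loc=0.0, scale=1.0, size=1000)  # synthetic OOD scores
    print(f"FPR95 = {fpr_at_95_tpr(id_scores, ood_scores):.3f}")
```

Lower values are better: an FPR95 of 15% means 15 out of 100 OOD samples pass the threshold that keeps 95% of in-distribution samples.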

GPr-Net: Geometric Prototypical Network for Point Cloud Few-Shot Learning

Apr 12, 2023

Tejas Anvekar, Dena Bazazian

Abstract: In the realm of 3D computer vision applications, point cloud few-shot learning plays a critical role. However, it poses an arduous challenge due to the sparsity, irregularity, and unordered nature of the data. Current methods rely on complex local geometric extraction techniques such as convolution, graph, and attention mechanisms, along with extensive data-driven pre-training tasks. These approaches contradict the fundamental goal of few-shot learning, which is to facilitate efficient learning. To address this issue, we propose GPr-Net (Geometric Prototypical Network), a lightweight and computationally efficient geometric prototypical network that captures the intrinsic topology of point clouds and achieves superior performance. Our proposed method, IGI++ (Intrinsic Geometry Interpreter++), employs vector-based hand-crafted intrinsic geometry interpreters and Laplace vectors to extract and evaluate point cloud morphology, resulting in improved representations for FSL (Few-Shot Learning). Additionally, Laplace vectors enable the extraction of valuable features from point clouds with fewer points. To tackle the distribution drift challenge in few-shot metric learning, we leverage hyperbolic space and demonstrate that our approach handles intra- and inter-class variance better than existing point cloud few-shot learning methods. Experimental results on the ModelNet40 dataset show that GPr-Net outperforms state-of-the-art methods in few-shot learning on point clouds, achieving a computational efficiency that is $170\times$ better than all existing works. The code is publicly available at https://github.com/TejasAnvekar/GPr-Net.
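To illustrate what metric learning in hyperbolic space can look like, here is a minimal sketch of nearest-prototype classification using the geodesic distance of the Poincaré ball with curvature -1. This is an assumption-laden illustration rather than GPr-Net's implementation: the abstract does not state which hyperbolic model or curvature is used, and the names `poincare_distance` and `nearest_prototype` are hypothetical.

```python
import math

import numpy as np


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sq_diff = float(np.sum((u - v) ** 2))
    denom = (1.0 - float(np.sum(u ** 2))) * (1.0 - float(np.sum(v ** 2)))
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def nearest_prototype(query, prototypes):
    """Index of the class prototype closest to the query embedding."""
    distances = [poincare_distance(query, p) for p in prototypes]
    return int(np.argmin(distances))


if __name__ == "__main__":
    # Hypothetical 2-D class prototypes, all with norm < 1.
    prototypes = [np.array([0.6, 0.0]), np.array([-0.6, 0.0])]
    query = np.array([0.3, 0.1])
    print(nearest_prototype(query, prototypes))
```

Distances grow rapidly near the boundary of the ball, which is one reason hyperbolic embeddings are often used for hierarchical data.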

Dual-Domain Image Synthesis using Segmentation-Guided GAN

Apr 19, 2022

Dena Bazazian, Andrew Calway, Dima Damen

Abstract: We introduce a segmentation-guided approach to synthesise images that integrate features from two distinct domains. Images synthesised by our dual-domain model belong to one domain within the semantic mask and to another in the rest of the image, with the two smoothly integrated. We build on the successes of few-shot StyleGAN and single-shot semantic segmentation to minimise the amount of training required in utilising two domains. The method combines a few-shot cross-domain StyleGAN with a latent optimiser to achieve images containing features of two distinct domains. We use a segmentation-guided perceptual loss, which compares both pixel-level and activations between domain-specific and dual-domain synthetic images. Results demonstrate qualitatively and quantitatively that our model is capable of synthesising dual-domain images on a variety of objects (faces, horses, cats, cars), domains (natural, caricature, sketches) and part-based masks (eyes, nose, mouth, hair, car bonnet). The code is publicly available at: https://github.com/denabazazian/Dual-Domain-Synthesis.

* CVPR2022 Workshops. 14 pages, 19 figures

Riemannian Functional Map Synchronization for Probabilistic Partial Correspondence in Shape Networks

Nov 29, 2021

Faria Huq, Adrish Dey, Sahra Yusuf, Dena Bazazian, Tolga Birdal, Nina Miolane

Abstract: Functional maps are efficient representations of shape correspondences that provide matching of real-valued functions between pairs of shapes. Functional maps can be modelled as elements of the Lie group $SO(n)$ for nearly isometric shapes. Synchronization can subsequently be employed to enforce cycle consistency between functional maps computed on a set of shapes, thereby enhancing the accuracy of the individual maps. There is an interest in developing synchronization methods that respect the geometric structure of $SO(n)$, while introducing a probabilistic framework to quantify the uncertainty associated with the synchronization results. This paper introduces a Bayesian probabilistic inference framework on $SO(n)$ for Riemannian synchronization of functional maps, performs a maximum-a-posteriori estimation of functional maps through synchronization and further deploys a Riemannian Markov-Chain Monte Carlo sampler for uncertainty quantification. Our experiments demonstrate that constraining the synchronization on the Riemannian manifold $SO(n)$ improves the estimation of the functional maps, while our Riemannian MCMC sampler provides for the first time an uncertainty quantification of the results.
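Cycle consistency, the property that synchronization enforces, can be illustrated with a small numerical check: composing pairwise maps around a closed loop of shapes should give the identity. The sketch below is not the paper's method; it assumes each shape has a hypothetical absolute rotation `R_i` in $SO(n)$ and that the map from shape i to shape j is `R_j @ R_i.T`, which is one possible convention.

```python
import numpy as np


def random_rotation(n, rng):
    """Sample an n x n orthogonal matrix with determinant +1 (an element of SO(n))."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one column to move from O(n) into SO(n)
    return q


def relative_map(r_i, r_j):
    """Map from shape i to shape j under the assumed convention C_ij = R_j R_i^T."""
    return r_j @ r_i.T


def cycle_error(c_ij, c_jk, c_ki):
    """Frobenius distance of the loop i -> j -> k -> i from the identity."""
    loop = c_ki @ c_jk @ c_ij
    return float(np.linalg.norm(loop - np.eye(loop.shape[0])))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r0, r1, r2 = (random_rotation(4, rng) for _ in range(3))
    c01, c12, c20 = relative_map(r0, r1), relative_map(r1, r2), relative_map(r2, r0)
    print("consistent loop error:", cycle_error(c01, c12, c20))
    # Replacing one map with an unrelated rotation breaks consistency.
    print("corrupted loop error:", cycle_error(random_rotation(4, rng), c12, c20))
```

Maps estimated independently for each pair generally give a nonzero loop error; synchronization seeks maps that drive this error toward zero across all cycles.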

* 8 pages

Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images

Sep 04, 2018

Dena Bazazian, Dimosthenis Karatzas, Andrew D. Bagdanov

Abstract: Word spotting in natural scene images has many applications in scene understanding and visual assistance. We propose Soft-PHOC, an intermediate representation of images based on character probability maps. Our representation extends the concept of the Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We show how to use our descriptors for word spotting tasks in egocentric camera streams through an efficient text line proposal algorithm. This is based on the Hough Transform over character attribute maps followed by scoring using Dynamic Time Warping (DTW). We evaluate our results on the ICDAR 2015 Challenge 4 dataset of incidental scene text captured by an egocentric camera.
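The DTW scoring step mentioned in this abstract can be illustrated with the textbook dynamic-programming recurrence. This is a generic sketch over two 1-D numeric sequences with absolute difference as the local cost; the paper's actual features, local cost, and any path constraints are not given in the abstract, so those choices are assumptions.

```python
def dtw_distance(a, b):
    """Classic Dynamic Time Warping cost between two 1-D numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # A repeated element in the second sequence is absorbed by warping.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because DTW allows elements to be stretched or repeated, two sequences that differ only in local speed, such as the same word rendered at different widths, can still align at low cost.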

* 4 pages, 3 figures, The Third International Workshop on Egocentric Perception, Interaction and Computing (EPIC) at ECCV2018

Improving Text Proposals for Scene Images with Fully Convolutional Networks

Feb 16, 2017

Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, Andrew D. Bagdanov

Abstract: Text Proposals have emerged as a class-dependent version of object proposals: efficient approaches to reduce the search space of possible text object locations in an image. Combined with strong word classifiers, text proposals currently yield top state-of-the-art results in end-to-end scene text recognition. In this paper we propose an improvement over the original Text Proposals algorithm of Gomez and Karatzas (2016), combining it with Fully Convolutional Networks to improve the ranking of proposals. Results on the ICDAR RRC and the COCO-text datasets show superior performance over the current state of the art.

* 6 pages, 8 figures, International Conference on Pattern Recognition (ICPR) - DLPR (Deep Learning for Pattern Recognition) workshop
