Michael Werman

CanonNet: Canonical Ordering and Curvature Learning for Point Cloud Analysis

Apr 03, 2025

Benjy Friedmann, Michael Werman

Abstract:Point cloud processing poses two fundamental challenges: establishing consistent point ordering and effectively learning fine-grained geometric features. Current architectures rely on complex operations that limit expressivity while struggling to capture detailed surface geometry. We present CanonNet, a lightweight neural network composed of two complementary components: (1) a preprocessing pipeline that creates a canonical point ordering and orientation, and (2) a geometric learning framework where networks learn from synthetic surfaces with precise curvature values. This modular approach eliminates the need for complex transformation-invariant architectures while effectively capturing local geometric properties. Our experiments demonstrate state-of-the-art performance in curvature estimation and competitive results in geometric descriptor tasks with significantly fewer parameters (100X) than comparable methods. CanonNet's efficiency makes it particularly suitable for real-world applications where computational resources are limited, demonstrating that mathematical preprocessing can effectively complement neural architectures for point cloud analysis. The code for the project is publicly available at https://benjyfri.github.io/CanonNet/.

The Fibonacci Network: A Simple Alternative for Positional Encoding

Nov 07, 2024

Yair Bleiberg, Michael Werman

Abstract:Coordinate-based Multi-Layer Perceptrons (MLPs) are known to have difficulty reconstructing high frequencies of the training data. A common solution to this problem is Positional Encoding (PE), which has become quite popular. However, PE has drawbacks. It has high-frequency artifacts and adds another hyper-hyperparameter, just like batch normalization and dropout do. We believe that under certain circumstances PE is not necessary, and a smarter construction of the network architecture together with a smart training method is sufficient to achieve similar results. In this paper, we show that very simple MLPs can quite easily output a frequency when given input of the half-frequency and quarter-frequency. Using this, we design a network architecture in blocks, where the input to each block is the output of the two previous blocks along with the original input. We call this a Fibonacci Network. By training each block on the corresponding frequencies of the signal, we show that Fibonacci Networks can reconstruct arbitrarily high frequencies.
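
The block wiring described in this abstract can be illustrated with a minimal NumPy sketch. It only shows the data flow (each block receives the two previous block outputs plus the original input) with untrained random weights. The block width, the tanh activation, the number of blocks, the helper names, and the use of zeros as stand-ins for the missing "previous outputs" of the first two blocks are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def make_block(rng, in_dim, hidden=16):
    """One tiny MLP block: in_dim -> hidden -> 1 (sizes are illustrative)."""
    return {
        "W1": rng.normal(scale=0.5, size=(in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.5, size=(hidden, 1)),
        "b2": np.zeros(1),
    }

def run_block(block, inp):
    """Apply one block: a tanh hidden layer followed by a linear output."""
    h = np.tanh(inp @ block["W1"] + block["b1"])
    return h @ block["W2"] + block["b2"]

def fibonacci_forward(blocks, x):
    """Feed each block the two previous block outputs plus the original input."""
    n = x.shape[0]
    prev2 = np.zeros((n, 1))  # assumption: zeros stand in before any block has run
    prev1 = np.zeros((n, 1))
    outputs = []
    for block in blocks:
        inp = np.concatenate([prev2, prev1, x], axis=1)
        out = run_block(block, inp)
        outputs.append(out)
        prev2, prev1 = prev1, out
    return outputs

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 32).reshape(-1, 1)     # 1-D coordinates as input
blocks = [make_block(rng, in_dim=3) for _ in range(5)]
outputs = fibonacci_forward(blocks, x)           # one output per block
```

The per-block training on successive frequency bands that the abstract mentions is omitted here; the sketch covers only the forward wiring.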

Camera Calibration and Stereo via a Single Image of a Spherical Mirror

Sep 24, 2024

Nissim Barzilay, Ofek Narinsky, Michael Werman

Abstract:This paper presents a novel technique for camera calibration using a single view that incorporates a spherical mirror. Leveraging the distinct characteristics of the sphere's contour visible in the image and its reflections, we showcase the effectiveness of our method in achieving precise calibration. Furthermore, the reflection from the mirrored surface provides additional information about the surrounding scene beyond the image frame. Our method paves the way for the development of simple catadioptric stereo systems. We explore the challenges and opportunities associated with employing a single mirrored sphere, highlighting the potential applications of this setup in practical scenarios. The paper delves into the intricacies of the geometry and calibration procedures involved in catadioptric stereo utilizing a spherical mirror. Experimental results, encompassing both synthetic and real-world data, are presented to illustrate the feasibility and accuracy of our approach.

* 12 pages, 11 figures

Beyond the Benchmark: Detecting Diverse Anomalies in Videos

Oct 03, 2023

Yoav Arad, Michael Werman

Abstract:Video Anomaly Detection (VAD) plays a crucial role in modern surveillance systems, aiming to identify various anomalies in real-world situations. However, current benchmark datasets predominantly emphasize simple, single-frame anomalies such as novel object detection. This narrow focus restricts the advancement of VAD models. In this research, we advocate for an expansion of VAD investigations to encompass intricate anomalies that extend beyond conventional benchmark boundaries. To facilitate this, we introduce two datasets, HMDB-AD and HMDB-Violence, to challenge models with diverse action-based anomalies. These datasets are derived from the HMDB51 action recognition dataset. We further present Multi-Frame Anomaly Detection (MFAD), a novel method built upon the AI-VAD framework. AI-VAD utilizes single-frame features such as pose estimation and deep image encoding, and two-frame features such as object velocity. It then applies a density estimation algorithm to compute anomaly scores. To address complex multi-frame anomalies, we add deep video encoding features that capture long-range temporal dependencies, and logistic regression to enhance the final score calculation. Experimental results confirm our assumptions, highlighting existing models' limitations with new anomaly types. MFAD excels in both simple and complex anomaly detection scenarios.

Robust affine feature matching via quadratic assignment on Grassmannians

Mar 07, 2023

Alexander Kolpakov, Michael Werman

Abstract:GraNNI (Grassmannians for Nearest Neighbours Identification), a new algorithm for solving the affine registration problem, is proposed. The algorithm represents point clouds as elements of the Grassmannian of $k$-dimensional planes in $\mathbb{R}^n$ and minimizes the Frobenius norm between the two elements of the Grassmannian. The Quadratic Assignment Problem (QAP) is used to find the matching. Experimental results show that the algorithm is more robust to noise and point discrepancy in point clouds than previous approaches.

* 12 pages, 18 figures; GitHub repository at (https://github.com/sashakolpakov/granni)
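
A small NumPy sketch can illustrate why a Grassmannian representation suits affine registration. It is not taken from the GraNNI repository. It assumes each cloud of $n$ points in $\mathbb{R}^k$ is centered and represented by the orthogonal projector onto its $k$-dimensional column space in $\mathbb{R}^n$; the function name `projector`, the random test data, and the fixed relabelling are hypothetical. Under that assumption the projector is unchanged by an invertible affine map, so only the unknown point relabelling remains, and recovering it by minimizing a Frobenius norm over permutations is a quadratic assignment problem.

```python
import numpy as np

def projector(points):
    """Orthogonal projector onto the column space of the centered n x k cloud."""
    centered = points - points.mean(axis=0)   # centering removes the translation
    q, _ = np.linalg.qr(centered)             # orthonormal basis of a k-dim subspace of R^n
    return q @ q.T

rng = np.random.default_rng(0)
n, k = 8, 3
X = rng.normal(size=(n, k))
A = rng.normal(size=(k, k)) + 3.0 * np.eye(k)   # hypothetical linear part of the affine map
t = rng.normal(size=k)                          # hypothetical translation
perm = np.roll(np.arange(n), 1)                 # a known relabelling of the points
Y = (X @ A + t)[perm]

P_X, P_Y = projector(X), projector(Y)

# With the correct relabelling the two projectors coincide (up to round-off) ...
mismatch_aligned = np.linalg.norm(P_Y - P_X[np.ix_(perm, perm)])
# ... while the wrong (identity) relabelling leaves a clear Frobenius mismatch.
mismatch_identity = np.linalg.norm(P_Y - P_X)
```

Here the permutation is known in advance, so the sketch only checks the invariance; the search over permutations (the QAP step) is left out.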

An approach to robust ICP initialization

Dec 10, 2022

Alexander Kolpakov, Michael Werman

Abstract:In this note, we propose an approach for initializing the Iterative Closest Point (ICP) algorithm that allows us to apply ICP to unlabelled point clouds that are related by rigid transformations. We also give bounds on the robustness of our approach to noise. Numerical experiments confirm our theoretical findings.

* 7 pages, 10 figures; GitHub repository at (https://github.com/sashakolpakov/icp-init)

DecisioNet: A Binary-Tree Structured Neural Network

Jul 10, 2022

Noam Gottlieb, Michael Werman

Abstract:Deep neural networks (DNNs) and decision trees (DTs) are both state-of-the-art classifiers. DNNs perform well due to their representational learning capabilities, while DTs are computationally efficient as they perform inference along one route (root-to-leaf) that is dependent on the input data. In this paper, we present DecisioNet (DN), a binary-tree structured neural network. We propose a systematic way to convert an existing DNN into a DN to create a lightweight version of the original model. DecisioNet takes the best of both worlds - it uses neural modules to perform representational learning and utilizes its tree structure to perform only a portion of the computations. We evaluate various DN architectures, along with their corresponding baseline models on the FashionMNIST, CIFAR10, and CIFAR100 datasets. We show that the DN variants achieve similar accuracy while significantly reducing the computational cost of the original network.

* This paper is under review at a conference, so the code may not be published yet. It will be made publicly available on GitHub after the paper is published.
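
The root-to-leaf inference idea in this abstract can be sketched in a few lines of NumPy. This is a hedged illustration, not the DecisioNet implementation: the `Node` class, the linear+ReLU modules, the hard threshold routing rule, and the tree depth are all assumptions, and training of the routing decisions is not shown.

```python
import numpy as np

class Node:
    """A tree node with a small linear+ReLU module and a routing vector (both illustrative)."""
    def __init__(self, rng, dim, depth):
        self.W = rng.normal(scale=0.5, size=(dim, dim))
        self.route = rng.normal(size=dim)
        # Leaves have no children; internal nodes have exactly two.
        self.children = None if depth == 0 else (
            Node(rng, dim, depth - 1),
            Node(rng, dim, depth - 1),
        )

def forward(root, x):
    """Run only the modules on one root-to-leaf path; return features and visit count."""
    node, visited = root, 0
    while True:
        x = np.maximum(x @ node.W, 0.0)           # this node's module
        visited += 1
        if node.children is None:
            return x, visited
        go_right = float(x @ node.route) > 0.0    # hard, input-dependent routing (assumption)
        node = node.children[int(go_right)]

rng = np.random.default_rng(0)
root = Node(rng, dim=4, depth=3)                  # 15 nodes in total
features, visited = forward(root, rng.normal(size=4))
```

With a depth-3 tree, a single input passes through 4 of the 15 modules, which is the kind of saving the tree structure is meant to provide.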

Fully Convolutional Fractional Scaling

Mar 20, 2022

Michael Soloveitchik, Michael Werman

Abstract:We introduce a fully convolutional fractional scaling component, FCFS. Fully convolutional networks can be applied to inputs of any size, but previously did not support non-integer scaling. Our architecture is simple, with an efficient single-layer implementation. Examples and code implementations of three common scaling methods are published.

On a realization of motion and similarity group equivalence classes of labeled points in $\mathbb R^k$ with applications to computer vision

Mar 24, 2021

Steven B. Damelin, David L. Ragozin, Michael Werman

Abstract:We study a realization of motion and similarity group equivalence classes of $n\geq 1$ labeled points in $\mathbb R^k,\, k\geq 1$ as a metric space with a computable metric. Our study is motivated by applications in computer vision.

Cameras Viewing Cameras Geometry

Nov 28, 2019

Danail Brezov, Michael Werman

Abstract:A basic problem in computer vision is to understand the structure of a real-world scene given several images of it. Here we study several theoretical aspects of the intra multi-view geometry of calibrated cameras when all that they can reliably recognize is each other. With the proliferation of wearable cameras, autonomous vehicles and drones, the geometry of these multiple cameras is a timely and relevant problem to study.
