Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Alexandru Paul Condurache

Efficient Data Driven Mixture-of-Expert Extraction from Trained Networks

May 21, 2025

Uranik Berisha, Jens Mehnert, Alexandru Paul Condurache

Abstract:Vision Transformers have emerged as the state-of-the-art models in various Computer Vision tasks, but their high computational and resource demands pose significant challenges. While Mixture-of-Experts (MoE) can make these models more efficient, they often require costly retraining or even training from scratch. Recent developments aim to reduce these computational costs by leveraging pretrained networks. These have been shown to produce sparse activation patterns in the Multi-Layer Perceptrons (MLPs) of the encoder blocks, allowing for conditional activation of only relevant subnetworks for each sample. Building on this idea, we propose a new method to construct MoE variants from pretrained models. Our approach extracts expert subnetworks from the model's MLP layers post-training in two phases. First, we cluster output activations to identify distinct activation patterns. In the second phase, we use these clusters to extract the corresponding subnetworks responsible for producing them. On ImageNet-1k recognition tasks, we demonstrate that these extracted experts can perform surprisingly well out of the box and require only minimal fine-tuning to regain 98% of the original performance, all while reducing MACs and model size, by up to 36% and 32% respectively.

* Accepted at IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025

Via

Access Paper or Ask Questions

PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs

May 06, 2025

Lukas Meiner, Jens Mehnert, Alexandru Paul Condurache

Abstract:Convolutional neural networks (CNNs) are crucial for computer vision tasks on resource-constrained devices. Quantization effectively compresses these models, reducing storage size and energy cost. However, in modern depthwise-separable architectures, the computational cost is distributed unevenly across its components, with pointwise operations being the most expensive. By applying a general quantization scheme to this imbalanced cost distribution, existing quantization approaches fail to fully exploit potential efficiency gains. To this end, we introduce PROM, a straightforward approach for quantizing modern depthwise-separable convolutional networks by selectively using two distinct bit-widths. Specifically, pointwise convolutions are quantized to ternary weights, while the remaining modules use 8-bit weights, which is achieved through a simple quantization-aware training procedure. Additionally, by quantizing activations to 8-bit, our method transforms pointwise convolutions with ternary weights into int8 additions, which enjoy broad support across hardware platforms and effectively eliminates the need for expensive multiplications. Applying PROM to MobileNetV2 reduces the model's energy cost by more than an order of magnitude (23.9x) and its storage size by 2.7x compared to the float16 baseline while retaining similar classification performance on ImageNet. Our method advances the Pareto frontier for energy consumption vs. top-1 accuracy for quantized convolutional models on ImageNet. PROM addresses the challenges of quantizing depthwise-separable convolutional networks to both ternary and 8-bit weights, offering a simple way to reduce energy cost and storage size.

Via

Access Paper or Ask Questions

FLARES: Fast and Accurate LiDAR Multi-Range Semantic Segmentation

Feb 13, 2025

Bin Yang, Alexandru Paul Condurache

Abstract:3D scene understanding is a critical yet challenging task in autonomous driving, primarily due to the irregularity and sparsity of LiDAR data, as well as the computational demands of processing large-scale point clouds. Recent methods leverage the range-view representation to improve processing efficiency. To mitigate the performance drop caused by information loss inherent to the "many-to-one" problem, where multiple nearby 3D points are mapped to the same 2D grids and only the closest is retained, prior works tend to choose a higher azimuth resolution for range-view projection. However, this can bring the drawback of reducing the proportion of pixels that carry information and heavier computation within the network. We argue that it is not the optimal solution and show that, in contrast, decreasing the resolution is more advantageous in both efficiency and accuracy. In this work, we present a comprehensive re-design of the workflow for range-view-based LiDAR semantic segmentation. Our approach addresses data representation, augmentation, and post-processing methods for improvements. Through extensive experiments on two public datasets, we demonstrate that our pipeline significantly enhances the performance of various network architectures over their baselines, paving the way for more effective LiDAR-based perception in autonomous systems.

Via

Access Paper or Ask Questions

Unsupervised Point Cloud Registration with Self-Distillation

Sep 11, 2024

Christian Löwens, Thorben Funke, André Wagner, Alexandru Paul Condurache

Abstract:Rigid point cloud registration is a fundamental problem and highly relevant in robotics and autonomous driving. Nowadays deep learning methods can be trained to match a pair of point clouds, given the transformation between them. However, this training is often not scalable due to the high cost of collecting ground truth poses. Therefore, we present a self-distillation approach to learn point cloud registration in an unsupervised fashion. Here, each sample is passed to a teacher network and an augmented view is passed to a student network. The teacher includes a trainable feature extractor and a learning-free robust solver such as RANSAC. The solver forces consistency among correspondences and optimizes for the unsupervised inlier ratio, eliminating the need for ground truth labels. Our approach simplifies the training procedure by removing the need for initial hand-crafted features or consecutive point cloud frames as seen in related methods. We show that our method not only surpasses them on the RGB-D benchmark 3DMatch but also generalizes well to automotive radar, where classical features adopted by others fail. The code is available at https://github.com/boschresearch/direg .

* Oral at BMVC 2024

Via

Access Paper or Ask Questions

Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing

Sep 29, 2023

Lukas Meiner, Jens Mehnert, Alexandru Paul Condurache

Abstract:To reduce the computational cost of convolutional neural networks (CNNs) for usage on resource-constrained devices, structured pruning approaches have shown promising results, drastically reducing floating-point operations (FLOPs) without substantial drops in accuracy. However, most recent methods require fine-tuning or specific training procedures to achieve a reasonable trade-off between retained accuracy and reduction in FLOPs. This introduces additional cost in the form of computational overhead and requires training data to be available. To this end, we propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module. It instantly reduces the network's test-time inference cost without requiring any training or fine-tuning. We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH) to detect redundancies in the channel dimension. Similar channels are aggregated to reduce the input and filter depth simultaneously, allowing for cheaper convolutions. We demonstrate our approach on the popular vision benchmarks CIFAR-10 and ImageNet. In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.

Via

Access Paper or Ask Questions

Deep Neural Networks with Efficient Guaranteed Invariances

Mar 02, 2023

Matthias Rath, Alexandru Paul Condurache

Abstract:We address the problem of improving the performance and in particular the sample complexity of deep neural networks by enforcing and guaranteeing invariances to symmetry transformations rather than learning them from data. Group-equivariant convolutions are a popular approach to obtain equivariant representations. The desired corresponding invariance is then imposed using pooling operations. For rotations, it has been shown that using invariant integration instead of pooling further improves the sample complexity. In this contribution, we first expand invariant integration beyond rotations to flips and scale transformations. We then address the problem of incorporating multiple desired invariances into a single network. For this purpose, we propose a multi-stream architecture, where each stream is invariant to a different transformation such that the network can simultaneously benefit from multiple invariances. We demonstrate our approach with successful experiments on Scaled-MNIST, SVHN, CIFAR-10 and STL-10.

Via

Access Paper or Ask Questions

Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey

May 17, 2022

Paul Wimmer, Jens Mehnert, Alexandru Paul Condurache

Figure 1 for Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey

Figure 2 for Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey

Figure 3 for Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey

Figure 4 for Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey

Abstract:State-of-the-art deep learning models have a parameter count that reaches into the billions. Training, storing and transferring such models is energy and time consuming, thus costly. A big part of these costs is caused by training the network. Model compression lowers storage and transfer costs, and can further make training more efficient by decreasing the number of computations in the forward and/or backward pass. Thus, compressing networks also at training time while maintaining a high performance is an important research topic. This work is a survey on methods which reduce the number of trained weights in deep learning models throughout the training. Most of the introduced methods set network parameters to zero which is called pruning. The presented pruning approaches are categorized into pruning at initialization, lottery tickets and dynamic sparse training. Moreover, we discuss methods that freeze parts of a network at its random initialization. By freezing weights, the number of trainable parameters is shrunken which reduces gradient computations and the dimensionality of the model's optimization space. In this survey we first propose dimensionality reduced training as an underlying mathematical model that covers pruning and freezing during training. Afterwards, we present and discuss different dimensionality reduced training methods.

* Survey for pruning and freezing methods applied before training starts

Via

Access Paper or Ask Questions

Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs

Mar 15, 2022

Paul Wimmer, Jens Mehnert, Alexandru Paul Condurache

Figure 1 for Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs

Figure 2 for Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs

Figure 3 for Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs

Figure 4 for Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs

Abstract:Unstructured pruning is well suited to reduce the memory footprint of convolutional neural networks (CNNs), both at training and inference time. CNNs contain parameters arranged in $K \times K$ filters. Standard unstructured pruning (SP) reduces the memory footprint of CNNs by setting filter elements to zero, thereby specifying a fixed subspace that constrains the filter. Especially if pruning is applied before or during training, this induces a strong bias. To overcome this, we introduce interspace pruning (IP), a general tool to improve existing pruning methods. It uses filters represented in a dynamic interspace by linear combinations of an underlying adaptive filter basis (FB). For IP, FB coefficients are set to zero while un-pruned coefficients and FBs are trained jointly. In this work, we provide mathematical evidence for IP's superior performance and demonstrate that IP outperforms SP on all tested state-of-the-art unstructured pruning methods. Especially in challenging situations, like pruning for ImageNet or pruning to high sparsity, IP greatly exceeds SP with equal runtime and parameter costs. Finally, we show that advances of IP are due to improved trainability and superior generalization ability.

* Accepted as conference paper for CVPR 2022

Via

Access Paper or Ask Questions

Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration

Feb 08, 2022

Matthias Rath, Alexandru Paul Condurache

Figure 1 for Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration

Figure 2 for Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration

Figure 3 for Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration

Figure 4 for Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration

Abstract:Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks. This makes them applicable to practically important use-cases where training data is scarce. Rather than being learned, this knowledge can be embedded by enforcing invariance to those transformations. Invariance can be imposed using group-equivariant convolutions followed by a pooling operation. For rotation-invariance, previous work investigated replacing the spatial pooling operation with invariant integration which explicitly constructs invariant representations. Invariant integration uses monomials which are selected using an iterative approach requiring expensive pre-training. We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems. Additionally, we replace monomials with different functions such as weighted sums, multi-layer perceptrons and self-attention, thereby streamlining the training of invariant-integration-based architectures. We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets where rotation-invariant-integration-based Wide-ResNet architectures using monomials and weighted sums outperform the respective baselines in the limited sample regime. We achieve state-of-the-art results using full data on Rotated-MNIST and SVHN where rotation is a main source of intraclass variation. On STL-10 we outperform a standard and a rotation-equivariant convolutional neural network using pooling.

* Accepted at VISAPP 2022

Via

Access Paper or Ask Questions

A Survey on Assessing the Generalization Envelope of Deep Neural Networks at Inference Time for Image Classification

Aug 24, 2020

Julia Lust, Alexandru Paul Condurache

Figure 1 for A Survey on Assessing the Generalization Envelope of Deep Neural Networks at Inference Time for Image Classification

Figure 2 for A Survey on Assessing the Generalization Envelope of Deep Neural Networks at Inference Time for Image Classification

Figure 3 for A Survey on Assessing the Generalization Envelope of Deep Neural Networks at Inference Time for Image Classification

Figure 4 for A Survey on Assessing the Generalization Envelope of Deep Neural Networks at Inference Time for Image Classification

Abstract:Deep Neural Networks (DNNs) achieve state-of-the-art performance on numerous applications. However, it is difficult to tell beforehand if a DNN receiving an input will deliver the correct output since their decision criteria are usually nontransparent. A DNN delivers the correct output if the input is within the area enclosed by its generalization envelope. In this case, the information contained in the input sample is processed reasonably by the network. It is of large practical importance to assess at inference time if a DNN generalizes correctly. Currently, the approaches to achieve this goal are investigated in different problem set-ups rather independently from one another, leading to three main research and literature fields: predictive uncertainty, out-of-distribution detection and adversarial example detection. This survey connects the three fields within the larger framework of investigating the generalization performance of machine learning methods and in particular DNNs. We underline the common ground, point at the most promising approaches and give a structured overview of the methods that provide at inference time means to establish if the current input is within the generalization envelope of a DNN.

Via

Access Paper or Ask Questions