Oren Gal

AutoLoop: Fast Visual SLAM Fine-tuning through Agentic Curriculum Learning

Jan 15, 2025

Assaf Lahiany, Oren Gal

Abstract:Current visual SLAM systems face significant challenges in balancing computational efficiency with robust loop closure handling. Traditional approaches require careful manual tuning and incur substantial computational overhead, while learning-based methods either lack explicit loop closure capabilities or implement them through computationally expensive methods. We present AutoLoop, a novel approach that combines automated curriculum learning with efficient fine-tuning for visual SLAM systems. Our method employs a DDPG (Deep Deterministic Policy Gradient) agent to dynamically adjust loop closure weights during training, eliminating the need for manual hyperparameter search while significantly reducing the required training steps. The approach pre-computes potential loop closure pairs offline and leverages them through an agent-guided curriculum, allowing the model to adapt efficiently to new scenarios. Experiments conducted on TartanAir for training and validated across multiple benchmarks including KITTI, EuRoC, ICL-NUIM and TUM RGB-D demonstrate that AutoLoop achieves comparable or superior performance while reducing training time by an order of magnitude compared to traditional approaches. AutoLoop provides a practical solution for rapid adaptation of visual SLAM systems, automating the weight tuning process that traditionally requires multiple manual iterations. Our results show that this automated curriculum strategy not only accelerates training but also maintains or improves the model's performance across diverse environmental conditions.

Unmasking Deepfakes: Leveraging Augmentations and Features Variability for Deepfake Speech Detection

Jan 09, 2025

Inbal Rimon, Oren Gal, Haim Permuter

Abstract:The detection of deepfake speech has become increasingly challenging with the rapid evolution of deepfake technologies. In this paper, we propose a hybrid architecture for deepfake speech detection, combining a self-supervised learning framework for feature extraction with a classifier head to form an end-to-end model. Our approach incorporates both audio-level and feature-level augmentation techniques. Specifically, we introduce and analyze various masking strategies for augmenting raw audio spectrograms and for enhancing feature representations during training. We incorporate compression augmentations during the pretraining phase of the feature extractor to address the limitations of small, single-language datasets. We evaluate the model on the ASVSpoof5 (ASVSpoof 2024) challenge, achieving state-of-the-art results in Track 1 under closed conditions with an Equal Error Rate of 4.37%. By employing different pretrained feature extractors, the model achieves an enhanced EER of 3.39%. Our model demonstrates robust performance against unseen deepfake attacks and exhibits strong generalization across different codecs.
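
The Equal Error Rate quoted above is the operating point where the rate of accepted spoofs equals the rate of rejected bona fide utterances. The following is a minimal sketch of how such a figure can be computed from detector scores; it is not the authors' evaluation code, and the function name, the score convention (higher means more likely bona fide), and the label encoding are assumptions made for illustration.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate EER by sweeping a decision threshold over the scores.

    Assumed convention (hypothetical, not from the paper):
    labels are 1 for bona fide and 0 for spoof, and a sample is
    accepted as bona fide when its score is >= the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    bona = scores[labels == 1]
    spoof = scores[labels == 0]

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    thresholds = np.unique(scores)
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)

    # False acceptance rate: fraction of spoofed samples accepted.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # False rejection rate: fraction of bona fide samples rejected.
    frr = np.array([(bona < t).mean() for t in thresholds])

    # Take the threshold where the two rates are closest and average them.
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0

eer_separable = equal_error_rate([0.9, 0.8, 0.1, 0.2], [1, 1, 0, 0])
eer_overlap = equal_error_rate([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

With perfectly separated scores the sketch returns 0; the second call has one spoof scoring above one bona fide sample and returns 0.5. Challenge toolkits typically interpolate between thresholds rather than picking the nearest crossing, so values on real data may differ slightly.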

Robust Monocular Visual Odometry using Curriculum Learning

Nov 20, 2024

Assaf Lahiany, Oren Gal

Abstract:Curriculum Learning (CL), drawing inspiration from natural learning patterns observed in humans and animals, employs a systematic approach of gradually introducing increasingly complex training data during model development. Our work applies innovative CL methodologies to address the challenging geometric problem of monocular Visual Odometry (VO) estimation, which is essential for robot navigation in constrained environments. The primary objective of our research is to push the boundaries of current state-of-the-art (SOTA) benchmarks in monocular VO by investigating various curriculum learning strategies. We enhance the end-to-end Deep-Patch-Visual Odometry (DPVO) framework through the integration of novel CL approaches, with the goal of developing more resilient models capable of maintaining high performance across challenging environments and complex motion scenarios. Our research encompasses several distinctive CL strategies. We develop methods to evaluate sample difficulty based on trajectory motion characteristics, implement sophisticated adaptive scheduling through self-paced weighted loss mechanisms, and utilize reinforcement learning agents for dynamic adjustment of training emphasis. Through comprehensive evaluation on the real-world TartanAir dataset, our Curriculum Learning-based Deep-Patch-Visual Odometry (CL-DPVO) demonstrates superior performance compared to existing SOTA methods, including both feature-based and learning-based VO approaches. The results validate the effectiveness of integrating curriculum learning principles into visual odometry systems.

* 8 pages

Adaptive USVs Swarm Optimization for Target Tracking in Dynamic Environments

Aug 13, 2024

Oren Gal

Abstract:This research investigates the performance and efficiency of Unmanned Surface Vehicles (USVs) in multi-target tracking scenarios using the Adaptive Particle Swarm Optimization with k-Nearest Neighbors (APSO-kNN) algorithm. The study explores various search patterns (Random Walk, Spiral, Lawnmower, and Cluster Search) to assess their effectiveness in dynamic environments. Through extensive simulations, we evaluate the impact of different search strategies, varying the number of targets and USVs' sensing capabilities, and integrating a Pursuit-Evasion model to test adaptability. Our findings demonstrate that systematic search patterns like Spiral and Lawnmower provide superior coverage and tracking accuracy, making them ideal for thorough area exploration. In contrast, the Random Walk pattern, while highly adaptable, shows lower accuracy due to its non-deterministic nature, and Cluster Search maintains group cohesion but is heavily dependent on target distribution. The mixed strategy, combining multiple patterns, offers robust performance across varied scenarios, while APSO-kNN effectively balances exploration and exploitation, making it a promising approach for real-world applications such as surveillance, search and rescue, and environmental monitoring. This study provides valuable insights into optimizing search strategies and sensing configurations for USV swarms, ultimately enhancing their operational efficiency and success in complex environments.

* 9 pages

Learning to Explore Indoor Environments using Autonomous Micro Aerial Vehicles

Sep 13, 2023

Yuezhan Tao, Eran Iceland, Beiming Li, Elchanan Zwecher, Uri Heinemann, Avraham Cohen, Amir Avni, Oren Gal, Ariel Barel, Vijay Kumar

Abstract:In this paper, we address the challenge of exploring unknown indoor aerial environments using autonomous aerial robots with Size, Weight, and Power (SWaP) constraints. The SWaP constraints induce limits on mission time, requiring efficiency in exploration. We present a novel exploration framework that uses Deep Learning (DL) to predict the most likely indoor map given the previous observations, and Deep Reinforcement Learning (DRL) for exploration, designed to run on modern SWaP-constrained neural processors. The DL-based map predictor provides a prediction of the occupancy of the unseen environment while the DRL-based planner determines the best navigation goals that can be safely reached to provide the most information. The two modules are tightly coupled and run onboard, allowing the vehicle to safely map an unknown environment. Extensive experimental and simulation results show that our approach surpasses state-of-the-art methods by 50-60% in efficiency, which we measure by the fraction of the explored space as a function of the length of the trajectory traveled.

* Submitted to ICRA2024 for review

Agility and Target Distribution in the Dynamic Stochastic Traveling Salesman Problem

Feb 01, 2023

Aviv Adler, Oren Gal, Sertac Karaman

Abstract:An important variant of the classic Traveling Salesman Problem (TSP) is the Dynamic TSP, in which a system with dynamic constraints is tasked with visiting a set of n target locations (in any order) in the shortest amount of time. Such tasks arise naturally in many robotic motion planning problems, particularly in exploration, surveillance and reconnaissance, and classical TSP algorithms on graphs are typically inapplicable in this setting. An important question about such problems is: if the target points are random, what is the length of the tour (either in expectation or as a concentration bound) as n grows? This problem is the Dynamic Stochastic TSP (DSTSP), and has been studied both for specific important vehicle models and for general dynamic systems; however, in general only the order of growth is known. In this work, we explore the connection between the distribution from which the targets are drawn and the dynamics of the system, yielding a more precise lower bound on tour length as well as a matching upper bound for the case of symmetric (or driftless) systems. We then extend the symmetric dynamics results to the case when the points are selected by a (non-random) adversary whose goal is to maximize the length, thus showing worst-case bounds on the tour length.

* 106 pages

Deep Learning on Home Drone: Searching for the Optimal Architecture

Sep 21, 2022

Alaa Maalouf, Yotam Gurfinkel, Barak Diker, Oren Gal, Daniela Rus, Dan Feldman

Abstract:We suggest the first system that runs real-time semantic segmentation via deep learning on a weak micro-computer such as the Raspberry Pi Zero v2 (whose price was $15) attached to a toy-drone. In particular, since the Raspberry Pi weighs less than 16 grams, and its size is half of a credit card, we could easily attach it to the common commercial DJI Tello toy-drone (<$100, <90 grams, 98 × 92.5 × 41 mm). The result is an autonomous drone (no laptop nor human in the loop) that can detect and classify objects in real-time from a video stream of an on-board monocular RGB camera (no GPS or LIDAR sensors). The companion videos demonstrate how this Tello drone scans the lab for people (e.g. for the use of firefighters or security forces) and for an empty parking slot outside the lab. Existing deep learning solutions are either much too slow for real-time computation on such IoT devices, or provide results of impractical quality. Our main challenge was to design a system that takes the best of all worlds among numerous combinations of networks, deep learning platforms/frameworks, compression techniques, and compression ratios. To this end, we provide an efficient searching algorithm that aims to find the optimal combination which results in the best tradeoff between the network running time and its accuracy/performance.

Integrating Deep Reinforcement and Supervised Learning to Expedite Indoor Mapping

Sep 17, 2021

Elchanan Zwecher, Eran Iceland, Sean R. Levy, Shmuel Y. Hayoun, Oren Gal, Ariel Barel

Abstract:The challenge of mapping indoor environments is addressed. Typical heuristic algorithms for solving the motion planning problem are frontier-based methods, which are especially effective when the environment is completely unknown. However, in cases where prior statistical data on the environment's architectonic features is available, such algorithms can be far from optimal. Furthermore, their calculation time may increase substantially as more areas are exposed. In this paper we propose two means by which to overcome these shortcomings. One is the use of deep reinforcement learning to train the motion planner. The second is the inclusion of a pre-trained generative deep neural network, acting as a map predictor. Each one helps to improve the decision making through use of the learned structural statistics of the environment, and both, being realized as neural networks, ensure a constant calculation time. We show that combining the two methods can shorten the mapping time, compared to frontier-based motion planning, by up to 75%.

* Submitted to ICRA-22 conference (September 14th, 2021)
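
For readers unfamiliar with the frontier-based baseline named in the abstract above: a frontier is a known-free cell that borders unexplored space, and a frontier-based planner repeatedly drives toward such cells. The sketch below only illustrates frontier detection on a small occupancy grid; the cell encoding (`FREE`, `OCC`, `UNK`), the 4-neighbour rule, and the function name are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical cell encoding for a 2D occupancy grid.
FREE, OCC, UNK = 0, 1, -1

def frontier_cells(grid):
    """Return (row, col) of free cells with at least one unknown 4-neighbour."""
    grid = np.asarray(grid)
    # Pad with False so that cells on the border have no out-of-map neighbours.
    unknown = np.pad(grid == UNK, 1, constant_values=False)
    near_unknown = (
        unknown[:-2, 1:-1]    # neighbour above
        | unknown[2:, 1:-1]   # neighbour below
        | unknown[1:-1, :-2]  # neighbour to the left
        | unknown[1:-1, 2:]   # neighbour to the right
    )
    mask = (grid == FREE) & near_unknown
    return [tuple(int(v) for v in rc) for rc in np.argwhere(mask)]

grid = [
    [FREE, FREE, UNK],
    [FREE, OCC,  UNK],
    [FREE, FREE, FREE],
]
frontiers = frontier_cells(grid)
```

Here the frontier cells are `(0, 1)` and `(2, 2)`, the two free cells that touch the unknown column. Every unknown cell has to be checked on each replanning step, so the cost of this search grows with the size of the map, which is consistent with the abstract's remark that calculation time may increase as more areas are exposed.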

Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition

Jul 23, 2021

Lucas Liebenwein, Alaa Maalouf, Oren Gal, Dan Feldman, Daniela Rus

Abstract:We present a novel global compression framework for deep neural networks that automatically analyzes each layer to identify the optimal per-layer compression ratio, while simultaneously achieving the desired overall compression. Our algorithm hinges on the idea of compressing each convolutional (or fully-connected) layer by slicing its channels into multiple groups and decomposing each group via low-rank decomposition. At the core of our algorithm is the derivation of layer-wise error bounds from the Eckart-Young-Mirsky theorem. We then leverage these bounds to frame the compression problem as an optimization problem where we wish to minimize the maximum compression error across layers and propose an efficient algorithm towards a solution. Our experiments indicate that our method outperforms existing low-rank compression approaches across a wide range of networks and data sets. We believe that our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks. Our code is available at https://github.com/lucaslie/torchprune.
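
The Eckart-Young-Mirsky theorem referenced in the abstract above states that truncating the singular value decomposition gives the best rank-k approximation of a matrix, with an error determined entirely by the discarded singular values. The sketch below checks this numerically for a single random weight matrix. It is an illustration of the theorem only, not the paper's algorithm (which additionally slices channels into groups and selects per-layer ratios); the matrix shape, the rank `k`, and the helper name are arbitrary assumptions.

```python
import numpy as np

def low_rank_approx(W, k):
    """Best rank-k approximation of W via truncated SVD.

    Returns the approximation and the full vector of singular values.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep the k largest singular values and their singular vectors.
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return W_k, s

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))  # stand-in for a fully-connected layer
k = 8                              # hypothetical target rank

W_k, s = low_rank_approx(W, k)

# Eckart-Young-Mirsky: the errors depend only on the discarded singular values.
fro_err = np.linalg.norm(W - W_k, "fro")  # equals sqrt(sum of s[k:]**2)
spec_err = np.linalg.norm(W - W_k, 2)     # equals s[k], the (k+1)-th singular value

# Parameters stored by the factorised layer versus the dense layer.
dense_params = W.shape[0] * W.shape[1]
factored_params = k * (W.shape[0] + W.shape[1])
```

For this 64 × 32 example, storing the two rank-8 factors takes 768 parameters instead of 2048. Because the error for any rank is known from the singular values alone, a per-layer error can be evaluated for each candidate rank without retraining, which is what makes an optimization over layers of the kind described in the abstract tractable.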
