Nadiya Shvai

Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning

Jan 09, 2025

Dmytro Kuzmenko, Nadiya Shvai

Abstract: We propose an efficient knowledge transfer approach for model-based reinforcement learning, addressing the challenge of deploying large world models in resource-constrained environments. Our method distills a high-capacity multi-task agent (317M parameters) into a compact 1M-parameter model, achieving state-of-the-art performance on the MT30 benchmark with a normalized score of 28.45, a substantial improvement over the original 1M-parameter model's score of 18.93. This demonstrates the ability of our distillation technique to consolidate complex multi-task knowledge effectively. Additionally, we apply FP16 post-training quantization, reducing the model size by 50% while maintaining performance. Our work bridges the gap between the power of large models and practical deployment constraints, offering a scalable solution for efficient and accessible multi-task reinforcement learning in robotics and other resource-limited domains.

* Preprint of an extended abstract accepted to AAMAS 2025
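
The 50% size reduction quoted for FP16 post-training quantization follows from storing each weight in 2 bytes instead of 4. The snippet below is only a minimal sketch of that arithmetic using the Python standard library; the weight values and helper names are hypothetical and are not taken from the paper's code.

```python
import struct

def to_fp32_bytes(weights):
    """Pack floats as IEEE 754 single precision (4 bytes per value)."""
    return struct.pack(f"<{len(weights)}f", *weights)

def to_fp16_bytes(weights):
    """Pack floats as IEEE 754 half precision (2 bytes per value)."""
    return struct.pack(f"<{len(weights)}e", *weights)

def from_fp16_bytes(blob):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))

# Hypothetical weights, for illustration only.
weights = [0.5, -1.25, 3.0, 0.1]

fp32 = to_fp32_bytes(weights)
fp16 = to_fp16_bytes(weights)
restored = from_fp16_bytes(fp16)

# Largest rounding error introduced by the cast to half precision.
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print(len(fp32), len(fp16), max_err)
```

Values that are exactly representable in half precision (such as 0.5) survive the round trip unchanged, while others (such as 0.1) pick up a small rounding error; whether that error matters for a given model is an empirical question.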

License Plate Images Generation with Diffusion Models

Jan 06, 2025

Mariia Shpir, Nadiya Shvai, Amir Nakib

Abstract: Despite the evident practical importance of license plate recognition (LPR), corresponding research is limited by the volume of publicly available datasets due to privacy regulations such as the General Data Protection Regulation (GDPR). To address this challenge, synthetic data generation has emerged as a promising approach. In this paper, we propose to synthesize realistic license plates (LPs) using diffusion models, inspired by recent advances in image and video generation. In our experiments, a diffusion model was successfully trained on a Ukrainian LP dataset, and 1000 synthetic images were generated for detailed analysis. Through manual classification and annotation of the generated images, we performed a thorough study of the model output, covering success rate, character distributions, and types of failure. Our contributions include experimental validation of the efficacy of diffusion models for LP synthesis, along with insights into the characteristics of the generated data. Furthermore, we have prepared a synthetic dataset consisting of 10,000 LP images, publicly available at https://zenodo.org/doi/10.5281/zenodo.13342102. The conducted experiments empirically confirm the usefulness of synthetic data for the LPR task. Despite the initial performance gap between the models trained with real and synthetic data, expanding the training dataset with pseudolabeled synthetic data improves LPR accuracy by 3% compared to the baseline.

* Frontiers in Artificial Intelligence and Applications, Vol. 392, 2024, pp. 4594-4601

Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata

Jun 19, 2024

Mykhailo Poliakov, Nadiya Shvai

Abstract: Retrieval-augmented generation (RAG) enables retrieval of relevant information from an external knowledge source and allows large language models (LLMs) to answer queries over previously unseen document collections. However, it has been demonstrated that traditional RAG applications perform poorly in answering multi-hop questions, which require retrieving and reasoning over multiple elements of supporting evidence. We introduce a new method called Multi-Meta-RAG, which uses database filtering with LLM-extracted metadata to improve RAG's selection of documents relevant to the question from various sources. While database filtering is specific to a set of questions from a particular domain and format, we found that Multi-Meta-RAG greatly improves the results on the MultiHop-RAG benchmark. The code is available at https://github.com/mxpoliakov/Multi-Meta-RAG.

* Submitted to ICTERI 2024 Posters Track
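
The idea of filtering a database by metadata before ranking by similarity can be sketched as follows. This is a toy illustration under stated assumptions, not the Multi-Meta-RAG implementation: the chunk records, the `source` field, the two-dimensional vectors, and the `retrieve` helper are all hypothetical, and the filter that an LLM would extract from the query is hard-coded.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, chunks, metadata_filter, top_k=2):
    """Keep only chunks whose metadata passes the filter, then rank by similarity."""
    allowed = [
        c for c in chunks
        if all(c["meta"].get(key) in values for key, values in metadata_filter.items())
    ]
    ranked = sorted(allowed, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:top_k]

# Hypothetical document chunks with toy embeddings and a source label.
chunks = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"source": "Outlet A"}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"source": "Outlet B"}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"source": "Outlet A"}},
]

# Assumption: in a real pipeline an LLM would extract this filter from the question.
metadata_filter = {"source": {"Outlet A"}}

hits = retrieve([1.0, 0.0], chunks, metadata_filter)
ids = [h["id"] for h in hits]
print(ids)
```

Chunk `b` is the second most similar to the query, but it is excluded because its source does not match the filter, so the retrieved set stays within the sources the question actually asks about.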

Balancing Performance and Efficiency in Zero-shot Robotic Navigation

Jun 05, 2024

Dmytro Kuzmenko, Nadiya Shvai

Abstract: We present an optimization study of the Vision-Language Frontier Maps (VLFM) applied to the Object Goal Navigation task in robotics. Our work evaluates the efficiency and performance of various vision-language models, object detectors, segmentation models, and multi-modal comprehension and Visual Question Answering modules. Using the val-mini and val splits of the Habitat-Matterport 3D dataset, we conduct experiments on a desktop with limited VRAM. We propose a solution that achieves a higher success rate (+1.55%) than the VLFM BLIP-2 baseline without substantial loss in success-weighted path length, while requiring 2.3 times less video memory. Our findings provide insights into balancing model performance and computational efficiency, suggesting effective deployment strategies for resource-limited environments.

* Submitted to ICTERI 2024 Posters Track
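
Success-weighted path length (SPL) is commonly defined as the average over episodes of S_i * l_i / max(p_i, l_i), where S_i is 1 on success and 0 otherwise, l_i is the shortest path length, and p_i is the length of the path the agent actually took. The sketch below computes it for made-up episodes; the episode values are hypothetical and unrelated to the paper's results, and positive path lengths are assumed.

```python
def spl(episodes):
    """Success-weighted path length over (success, shortest_len, taken_len) tuples.

    Assumes a non-empty list and strictly positive path lengths.
    """
    total = 0.0
    for success, shortest, taken in episodes:
        if success:
            # A successful episode scores 1.0 only if the agent took the shortest path.
            total += shortest / max(taken, shortest)
    return total / len(episodes)

# Hypothetical episodes: optimal success, success with a detour, and a failure.
episodes = [
    (True, 5.0, 5.0),
    (True, 4.0, 8.0),
    (False, 6.0, 10.0),
]

score = spl(episodes)
print(score)
```

Because failures contribute zero and detours are penalized proportionally, a method can raise its success rate while lowering SPL if the extra successes come from long paths, which is why the two metrics are reported together.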

Dynamic camera alignment optimization problem based on Fractal Decomposition based Algorithm

Sep 21, 2022

Arcadi Llanza, Nadiya Shvai, Amir Nakib

Abstract: In this work, we tackle the Dynamic Optimization Problem (DOP) of IA in a real-world application using a recently introduced Dynamic Optimization Algorithm (DOA) called the Fractal Decomposition Algorithm (FDA). We used FDA to perform IA on a CCTV camera feed from a tunnel. Because the camera viewpoint can change for multiple reasons, such as wind or maintenance, alignment is required to guarantee the correct functioning of the video-based traffic security system.
