Joni-Kristian Kämäräinen

Minimal Time Series Transformer

Mar 12, 2025

Joni-Kristian Kämäräinen

Abstract:The transformer is the state-of-the-art model for many natural language processing, computer vision, and audio analysis problems. It effectively combines information from past input and output samples in an auto-regressive manner so that each sample becomes aware of all inputs and outputs. In sequence-to-sequence (Seq2Seq) modeling, the transformer-processed samples become effective in predicting the next output. Time series forecasting is a Seq2Seq problem. The original architecture is defined for discrete input and output sequence tokens, so it must be adapted for continuous data before it can be applied to time series. This work introduces minimal adaptations to make the original transformer architecture suitable for continuous-valued time series data.

* 8 pages, 8 figures
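
One way to picture the adaptation described in the abstract above is to swap the discrete token-embedding lookup for a learned linear map from a scalar sample to a model-sized vector, and to swap the output softmax for a linear map back to a scalar. The sketch below is only an illustration under that assumption; the names `make_linear`, `linear`, and `D_MODEL` are hypothetical, the weights are random rather than trained, and the attention layers that would sit between the two maps are omitted.

```python
import random

def make_linear(in_dim, out_dim, rng):
    """Create (weights, bias) for a small dense layer with random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias

def linear(layer, vector):
    """Apply the dense layer: y = W x + b."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

D_MODEL = 8  # hypothetical model width, not taken from the paper
rng = random.Random(0)

# Assumed stand-in for the token embedding: scalar sample -> D_MODEL vector.
embed = make_linear(1, D_MODEL, rng)
# Assumed stand-in for the un-embedding: D_MODEL vector -> scalar prediction.
unembed = make_linear(D_MODEL, 1, rng)

series = [0.0, 0.5, 1.0, 0.5]  # toy continuous-valued time series
embedded = [linear(embed, [x]) for x in series]
# A real model would run attention layers on `embedded` before this step.
predictions = [linear(unembed, e)[0] for e in embedded]
```

With this arrangement the model can be trained with a regression loss on real-valued targets instead of a cross-entropy loss over a vocabulary.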

Introduction to Sequence Modeling with Transformers

Feb 26, 2025

Joni-Kristian Kämäräinen

Abstract:Understanding the transformer architecture and its workings is essential for machine learning (ML) engineers. However, truly understanding the transformer architecture can be demanding, even with a solid background in machine learning or deep learning. The main workhorse is attention, which leads to the transformer encoder-decoder structure. However, putting attention aside leaves several programming components that are easy to implement but whose role in the whole is unclear. These components are 'tokenization', 'embedding' ('un-embedding'), 'masking', 'positional encoding', and 'padding'. The focus of this work is on understanding them. To keep things simple, the understanding is built incrementally by adding components one by one and, after each step, investigating what can and cannot be done with the current model. Simple sequences of zeros (0) and ones (1) are used to study the workings of each step.

* 10 pages, 1 figure
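
Three of the components named in the abstract above (masking, positional encoding, and padding) can be sketched in a few lines of plain Python. This is a generic illustration rather than the paper's code: the function names are hypothetical, the positional encoding is the standard sinusoidal form, and the padding symbol `PAD = 2` is an assumption chosen so that it cannot be confused with the 0/1 sequence values.

```python
import math

PAD = 2  # hypothetical padding symbol, distinct from the data values 0 and 1

def causal_mask(length):
    """mask[row][col] is True when position `row` may attend to `col`."""
    return [[col <= row for col in range(length)] for row in range(length)]

def positional_encoding(length, d_model):
    """Standard sinusoidal encoding: sin on even dims, cos on odd dims."""
    table = []
    for pos in range(length):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def pad(sequence, length, pad_value=PAD):
    """Right-pad a sequence so every sequence in a batch has equal length."""
    return sequence + [pad_value] * (length - len(sequence))

batch = [pad(seq, 4) for seq in ([1, 0], [0, 1, 1, 0])]
mask = causal_mask(4)
encoding = positional_encoding(4, 4)
```

The causal mask keeps each position from attending to later positions, the encoding gives otherwise identical tokens a position-dependent signature, and padding lets sequences of different lengths share one batch.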

DAVIDE: Depth-Aware Video Deblurring

Sep 02, 2024

German F. Torres, Jussi Kalliola, Soumya Tripathy, Erman Acar, Joni-Kristian Kämäräinen

Abstract:Video deblurring aims at recovering sharp details from a sequence of blurry frames. Despite the proliferation of depth sensors in mobile phones and the potential of depth information to guide deblurring, depth-aware deblurring has received only limited attention. In this work, we introduce the 'Depth-Aware VIdeo DEblurring' (DAVIDE) dataset to study the impact of depth information in video deblurring. The dataset comprises synchronized blurred, sharp, and depth videos. We investigate how the depth information should be injected into the existing deep RGB video deblurring models, and propose a strong baseline for depth-aware video deblurring. Our findings reveal the significance of depth information in video deblurring and provide insights into the use cases where depth cues are beneficial. In addition, our results demonstrate that while the depth improves deblurring performance, this effect diminishes when models are provided with a longer temporal context. Project page: https://germanftv.github.io/DAVIDE.github.io/ .

Probabilistic Subgoal Representations for Hierarchical Reinforcement learning

Jun 24, 2024

Vivienne Huiling Wang, Tinghuai Wang, Wenyan Yang, Joni-Kristian Kämäräinen, Joni Pajarinen

Abstract:In goal-conditioned hierarchical reinforcement learning (HRL), a high-level policy specifies a subgoal for the low-level policy to reach. Effective HRL hinges on a suitable subgoal representation function, abstracting state space into latent subgoal space and inducing varied low-level behaviors. Existing methods adopt a subgoal representation that provides a deterministic mapping from state space to latent subgoal space. Instead, this paper utilizes Gaussian Processes (GPs) for the first probabilistic subgoal representation. Our method employs a GP prior on the latent subgoal space to learn a posterior distribution over the subgoal representation functions while exploiting the long-range correlation in the state space through learnable kernels. This enables an adaptive memory that integrates long-range subgoal information from prior planning steps, allowing it to cope with stochastic uncertainties. Furthermore, we propose a novel learning objective to facilitate the simultaneous learning of probabilistic subgoal representations and policies within a unified framework. In experiments, our approach outperforms state-of-the-art baselines not only in standard benchmarks but also in environments with stochastic elements and under diverse reward conditions. Additionally, our model shows promising capabilities in transferring low-level policies across different tasks.

PlaceNav: Topological Navigation through Place Recognition

Oct 05, 2023

Lauri Suomela, Jussi Kalliola, Harry Edelman, Joni-Kristian Kämäräinen

Abstract:Recent results suggest that splitting topological navigation into robot-independent and robot-specific components improves navigation performance by enabling the robot-independent part to be trained with data collected by different robot types. However, the navigation methods are still limited by the scarcity of suitable training data and suffer from poor computational scaling. In this work, we present PlaceNav, subdividing the robot-independent part into navigation-specific and generic computer vision components. We utilize visual place recognition for the subgoal selection of the topological navigation pipeline. This makes subgoal selection more efficient and enables leveraging large-scale datasets from non-robotics sources, increasing training data availability. Bayesian filtering, enabled by place recognition, further improves navigation performance by increasing the temporal consistency of subgoals. Our experimental results verify the design and the new model obtains a 76% higher success rate in indoor and 23% higher in outdoor navigation tasks with higher computational efficiency.
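
The abstract above credits Bayesian filtering with making subgoal selection more consistent over time. A generic discrete Bayes filter over the nodes of a topological map gives a rough idea of how that can work; the sketch below is not the PlaceNav implementation, and the function name, the transition model, and the likelihood values (standing in for place-recognition similarity scores) are all hypothetical.

```python
def bayes_filter_step(belief, transition, likelihood):
    """One predict-update step of a discrete Bayes filter.

    belief[i]        : prior probability of being at map node i
    transition[j][i] : assumed probability of moving from node j to node i
    likelihood[i]    : assumed observation score for node i
    """
    n = len(belief)
    # Predict: propagate the belief through the motion model.
    predicted = [sum(belief[j] * transition[j][i] for j in range(n))
                 for i in range(n)]
    # Update: weight by the observation and renormalize.
    unnormalized = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        return predicted  # observation carried no usable information
    return [u / total for u in unnormalized]

# Toy map with three nodes along a route: stay put or advance one node.
transition = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
belief = [1.0, 0.0, 0.0]       # the robot starts at node 0
likelihood = [0.1, 0.8, 0.1]   # made-up similarity scores for the current image
belief = bayes_filter_step(belief, transition, likelihood)
best_node = max(range(len(belief)), key=lambda i: belief[i])
```

Because the motion model gives zero probability to jumping from node 0 straight to node 2, a spurious visual match with a distant node cannot take over the belief in a single step, which is the kind of temporal consistency the abstract refers to.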

Depth-Aware Image Compositing Model for Parallax Camera Motion Blur

Mar 30, 2023

German F. Torres, Joni-Kristian Kämäräinen

Abstract:Camera motion introduces spatially varying blur due to the depth changes in the 3D world. This work investigates scene configurations where such blur is produced under parallax camera motion. We present a simple, yet accurate, Image Compositing Blur (ICB) model for depth-dependent spatially varying blur. The (forward) model produces realistic motion blur from a single image, depth map, and camera trajectory. Furthermore, we utilize the ICB model, combined with a coordinate-based MLP, to learn a sharp neural representation from the blurred input. Experimental results are reported for synthetic and real examples. The results verify that the ICB forward model is computationally efficient and produces realistic blur, despite the lack of occlusion information. Additionally, our method for restoring a sharp representation proves to be a competitive approach for the deblurring task.

Seq2Seq Imitation Learning for Tactile Feedback-based Manipulation

Mar 05, 2023

Wenyan Yang, Alexandre Angleraud, Roel S. Pieters, Joni Pajarinen, Joni-Kristian Kämäräinen

Abstract:Robot control for tactile feedback-based manipulation can be difficult due to the modeling of physical contacts, partial observability of the environment, and noise in perception and control. This work focuses on solving partial observability of contact-rich manipulation tasks as a Sequence-to-Sequence (Seq2Seq) Imitation Learning (IL) problem. The proposed Seq2Seq model produces a robot-environment interaction sequence to estimate the partially observable environment state variables. Then, the observed interaction sequence is transformed to a control sequence for the task itself. The proposed Seq2Seq IL for tactile feedback-based manipulation is experimentally validated on a door-open task in a simulated environment and a snap-on insertion task with a real robot. The model is able to learn both tasks from only 50 expert demonstrations, while state-of-the-art reinforcement learning and imitation learning methods fail.

A Simulation Benchmark for Vision-based Autonomous Navigation

Apr 01, 2022

Lauri Suomela, Atakan Dag, Harry Edelman, Joni-Kristian Kämäräinen

Abstract:This work introduces a simulator benchmark for vision-based autonomous navigation. The simulator offers control over real world variables such as the environment, time of day, weather and traffic. The benchmark includes a modular integration of different components of a full autonomous visual navigation stack. In the experimental part of the paper, state-of-the-art visual localization methods are evaluated as a part of the stack in realistic navigation tasks. To the authors' best knowledge, the proposed benchmark is the first to study modern visual localization methods as part of a full autonomous visual navigation stack.

RGBD Object Tracking: An In-depth Review

Mar 26, 2022

Jinyu Yang, Zhe Li, Song Yan, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen, Ling Shao

Abstract:RGBD object tracking is gaining momentum in computer vision research thanks to the development of depth sensors. Although numerous RGBD trackers have been proposed with promising performance, an in-depth review for comprehensive understanding of this area is lacking. In this paper, we firstly review RGBD object trackers from different perspectives, including RGBD fusion, depth usage, and tracking framework. Then, we summarize the existing datasets and the evaluation metrics. We benchmark a representative set of RGBD trackers, and give detailed analyses based on their performances. Particularly, we are the first to provide depth quality evaluation and analysis of tracking results in depth-friendly scenarios in RGBD tracking. For long-term settings in most RGBD tracking videos, we give an analysis of trackers' performance on handling target disappearance. To enable better understanding of RGBD trackers, we propose robustness evaluation against input perturbations. Finally, we summarize the challenges and provide open directions for this community. All resources are publicly available at https://github.com/memoryunreal/RGBD-tracking-review.

* 13 pages

Hierarchical Reinforcement Learning with Adversarially Guided Subgoals

Jan 31, 2022

Vivienne Huiling Wang, Joni Pajarinen, Tinghuai Wang, Joni-Kristian Kämäräinen

Abstract:Hierarchical reinforcement learning (HRL) proposes to solve difficult tasks by performing decision-making and control at successively higher levels of temporal abstraction. However, off-policy HRL often suffers from the problem of non-stationary high-level policy since the low-level policy is constantly changing. In this paper, we propose a novel HRL approach for mitigating the non-stationarity by adversarially enforcing the high-level policy to generate subgoals compatible with the current instantiation of the low-level policy. In practice, the adversarial learning is implemented by training a simple discriminator network concurrently with the high-level policy which determines the compatibility level of subgoals. Experiments with state-of-the-art algorithms show that our approach improves both HRL learning efficiency and overall performance in various challenging continuous control tasks.
