Abstract: The amount and quality of datasets and tools available in the research field of hand pose and shape estimation attest to the significant progress that has been made. We find that there is still room for improvement on both fronts, and even beyond. Even the highest-quality datasets reported to date have shortcomings in their annotations. There are tools in the literature that can assist in that direction, yet they have not been considered so far. To demonstrate how these gaps can be bridged, we employ a publicly available, multi-camera dataset of hands (InterHand2.6M) and perform effective image-based refinement to improve on the imperfect ground-truth annotations, yielding a better dataset. The image-based refinement is achieved through raytracing, a method that has not been employed so far for relevant problems and is hereby shown to be superior to the approximate alternatives that have been employed in the past. To tackle the lack of reliable ground truth, we resort to realistic synthetic data to show that the improvement we induce is indeed significant, both qualitatively and quantitatively.
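As a rough illustration of the kind of primitive raytracing-based refinement builds on, the sketch below implements the standard Möller-Trumbore ray/triangle intersection test in Python. This is a generic algorithm, not the paper's pipeline: casting a ray from the camera center through an annotated pixel and intersecting it with a hand mesh yields an exact surface point, where approximate alternatives would only estimate one. All names here are illustrative.

```python
import numpy as np

def ray_triangle_intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray/triangle intersection.

    origin, direction: the ray (e.g., camera center through an annotated pixel).
    v0, v1, v2: triangle vertices of the mesh.
    Returns the distance t along the ray, or None if there is no hit.
    """
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = origin - v0
    u = np.dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:      # hit point outside the triangle
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv_det
    return t if t > eps else None
```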
Abstract: We present two novel optimizations that accelerate clock-based spiking neural network (SNN) simulators. The first targets spike-timing-dependent plasticity (STDP). It combines lazy with event-driven plasticity and efficiently facilitates the computation of pre- and post-synaptic spikes using bitfields and integer intrinsics. It offers higher bandwidth than event-driven plasticity alone and achieves a 1.5x-2x speedup over our closest competitor. The second optimization targets spike delivery. We partition our graph representation in a way that bounds the number of neurons that need to be updated at any given time, which allows us to perform said update in shared memory instead of global memory. This is 2x-2.5x faster than our closest competitor. Both optimizations represent the final evolutionary stages of years of iteration on STDP and spike delivery inside "Spice" (/spaIk/), our state-of-the-art SNN simulator. The proposed optimizations are not exclusive to our graph representation or pipeline but are applicable to a multitude of simulator designs. We evaluate our performance on three well-established models and compare ourselves against three other state-of-the-art simulators.
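A minimal sketch of the bitfield idea, assuming one bit of spike history per simulation step and per neuron; it is written in plain Python for readability, whereas a GPU implementation would use an integer intrinsic such as CUDA's __popcll for the population count. The function names and the 64-step window are assumptions for illustration, not the simulator's actual interface.

```python
HISTORY_BITS = 64  # one bit per simulation step, newest spike in bit 0

def record_step(history: int, spiked: bool) -> int:
    """Shift the spike history by one step and record this step's spike."""
    history = (history << 1) & ((1 << HISTORY_BITS) - 1)
    return history | int(spiked)

def spikes_in_window(history: int, window: int) -> int:
    """Count spikes in the most recent `window` steps with a popcount."""
    mask = (1 << window) - 1
    return bin(history & mask).count("1")
```

The payoff is that counting recent pre- or post-synaptic spikes becomes a mask plus a single popcount instruction instead of a walk over individual spike events.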
Abstract: We present HandGAN (H-GAN), a cycle-consistent adversarial learning approach implementing multi-scale perceptual discriminators. It is designed to translate synthetic images of hands to the real domain. Synthetic hands provide complete ground-truth annotations, yet they are not representative of the target distribution of real-world data. We strive to provide the perfect blend of a realistic hand appearance with synthetic annotations. Relying on image-to-image translation, we improve the appearance of synthetic hands to approximate the statistical distribution underlying a collection of real images of hands. H-GAN tackles not only the cross-domain tone mapping but also structural differences in localized areas such as shading discontinuities. Results are evaluated qualitatively and quantitatively, improving on previous works. Furthermore, we rely on a hand classification task to show that our generated hands are statistically similar to the real domain of hands.
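For readers unfamiliar with cycle consistency, the sketch below shows the generic objective that this family of methods builds on, in PyTorch. The generator names G_s2r (synthetic to real) and G_r2s (real to synthetic) and the weight lam are placeholders; H-GAN's full objective additionally involves its multi-scale perceptual discriminators, which are not shown here.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_s2r, G_r2s, synth, real, lam=10.0):
    """L1 cycle loss: translating an image to the other domain and
    back should reproduce the original input."""
    loss_synth = F.l1_loss(G_r2s(G_s2r(synth)), synth)  # synth -> real -> synth
    loss_real = F.l1_loss(G_s2r(G_r2s(real)), real)     # real -> synth -> real
    return lam * (loss_synth + loss_real)
```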
Abstract: We present an SNN simulator which scales to millions of neurons, billions of synapses, and 8 GPUs. This is made possible by 1) a novel, cache-aware spike transmission algorithm, 2) a model-parallel multi-GPU distribution scheme, and 3) a static, yet very effective, load-balancing strategy. The simulator further features an easy-to-use API and the ability to create custom models. We compare the proposed simulator against two state-of-the-art simulators on a series of benchmarks using three well-established models. We find that our simulator is faster, consumes less memory, and scales linearly with the number of GPUs.
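One common way to realize a static load-balancing strategy is a greedy assignment of neuron groups to GPUs, weighted by synapse count as a proxy for work. The sketch below shows that generic pattern; it is an assumption for illustration and may well differ from the scheme used in the simulator itself.

```python
import heapq

def balance(neuron_groups, synapse_counts, n_gpus):
    """Greedily assign neuron groups to GPUs so that the total
    synapse count (a proxy for work) per GPU is roughly equal."""
    heap = [(0, gpu) for gpu in range(n_gpus)]  # (current load, gpu id)
    heapq.heapify(heap)
    assignment = {}
    # Placing the heaviest groups first yields a tighter balance.
    for group in sorted(neuron_groups, key=lambda g: -synapse_counts[g]):
        load, gpu = heapq.heappop(heap)
        assignment[group] = gpu
        heapq.heappush(heap, (load + synapse_counts[group], gpu))
    return assignment
```

Because the assignment is computed once, before the simulation starts, it adds no runtime overhead, which is what makes a static scheme attractive when it balances well.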
Abstract: We present a clock-driven Spiking Neural Network simulator which is up to 3x faster than the state of the art while, at the same time, being more general and requiring less programming effort on both the user's and the maintainer's side. This is made possible by designing our pipeline around "work queues", which act as interfaces between stages and greatly reduce implementation complexity. We evaluate our work using three well-established SNN models on a series of benchmarks.
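A toy, single-threaded illustration of the work-queue idea: each stage only reads from its input queue and writes to its output queue, so stages can be developed and swapped independently. The stage names are invented for this sketch; in the actual simulator the queues are GPU-resident and processed in parallel.

```python
from collections import deque

# Work queues decouple pipeline stages: each stage knows only its
# input and output queue, never the stages around it.
spike_queue, update_queue = deque(), deque()

def deliver_stage():
    """Drain arriving spikes and enqueue the neurons that must be updated."""
    while spike_queue:
        neuron = spike_queue.popleft()
        update_queue.append(neuron)

def update_stage(step_fn):
    """Drain pending updates and advance each neuron's state."""
    while update_queue:
        step_fn(update_queue.popleft())
```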
Abstract: We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents a video by the matrix of pairwise frame distances. These distances are computed on frame representations obtained with a convolutional neural network. On top of this representation, we design, implement and evaluate ReActNet, a lightweight convolutional neural network that classifies a given frame as belonging (or not) to a repetitive video segment. An important property of the employed representation is that it can handle repetitive segments of arbitrary number and duration. Furthermore, the proposed training process requires a relatively small number of annotated videos. Our method lifts several of the limiting assumptions of existing approaches regarding the contents of the video and the types of the observed repetitive activities. Experimental results on recent, publicly available datasets validate our design choices, verify the generalization potential of ReActNet and demonstrate its superior performance in comparison to the current state of the art.
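The pairwise frame-distance representation is straightforward to compute once per-frame CNN embeddings are available; a minimal numpy sketch follows, assuming Euclidean distances (the paper's exact distance metric and feature extractor are not restated here).

```python
import numpy as np

def pairwise_distance_matrix(features):
    """features: (n_frames, d) array of CNN embeddings, one row per frame.
    Returns the (n_frames, n_frames) matrix of Euclidean distances.
    Repetitive motion shows up as periodic off-diagonal stripes."""
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    return np.sqrt(np.maximum(d2, 0.0))  # clamp tiny negatives from rounding
```

Because the matrix encodes similarity between every pair of frames, periodic structure of any period and duration is visible in it, which is why the representation places no constraint on the number or length of repetitive segments.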
Abstract: We present a novel approach for 2D hand keypoint localization from regular color input. The proposed approach relies on an appropriately designed Convolutional Neural Network (CNN) that computes a set of heatmaps, one per hand keypoint of interest. Extensive experiments compare the proposed method against state-of-the-art approaches and demonstrate its accuracy and computational performance on standard, publicly available datasets. The obtained results demonstrate that the proposed method matches or outperforms the competing methods in accuracy, but clearly outperforms them in computational efficiency, making it a suitable building block for applications that require hand keypoint estimation on mobile devices.
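In heatmap-based keypoint localization, the decoding step is typically a per-heatmap argmax; a minimal numpy sketch of that generic step follows (the paper may use a more elaborate decoding, e.g., sub-pixel refinement, which is not shown here).

```python
import numpy as np

def keypoints_from_heatmaps(heatmaps):
    """heatmaps: (K, H, W) array, one heatmap per keypoint.
    Returns a (K, 3) array of (x, y, confidence) per keypoint,
    taking each heatmap's peak as the keypoint location."""
    K, H, W = heatmaps.shape
    flat = heatmaps.reshape(K, -1)
    idx = flat.argmax(axis=1)
    ys, xs = np.unravel_index(idx, (H, W))
    conf = flat[np.arange(K), idx]
    return np.stack([xs, ys, conf], axis=1)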
Abstract: We propose the first approach to the problem of inferring the depth map of a human hand from a single RGB image. We achieve this with a Convolutional Neural Network (CNN) that employs a stacked hourglass model as its main building block. Intermediate supervision is used in several outputs of the proposed architecture in a staged approach. To aid the process of training and inference, hand segmentation masks are also estimated in such an intermediate supervision step and used to guide the subsequent depth estimation process. In order to train and evaluate the proposed method, we compile and make publicly available HandRGBD, a new dataset of 20,601 views of hands, each consisting of an RGB image and an aligned depth map. Based on HandRGBD, we explore variants of the proposed approach in an ablative study and determine the best-performing one. The results of an extensive experimental evaluation demonstrate that hand depth estimation from a single RGB frame can be achieved with an accuracy of 22mm, which is comparable to the accuracy achieved by contemporary low-cost depth cameras. Such a 3D reconstruction of hands based on RGB information is valuable as a final result in its own right, but also as an input to several other hand analysis and perception algorithms that require depth input. Essentially, in such a context, the proposed approach bridges the gap between RGB and RGBD, by making all existing RGBD-based methods applicable to RGB input.
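One plausible way to couple the intermediate segmentation supervision with the depth objective is to evaluate the depth loss only on hand pixels, as sketched below in PyTorch. The loss choices and weights here are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def staged_loss(pred_mask, gt_mask, pred_depth, gt_depth,
                w_mask=1.0, w_depth=1.0):
    """Intermediate supervision: a segmentation loss on the hand mask,
    plus a depth loss restricted to ground-truth hand pixels so that
    the background does not dominate the depth objective."""
    mask_loss = F.binary_cross_entropy_with_logits(pred_mask, gt_mask)
    hand = gt_mask > 0.5                 # boolean mask of hand pixels
    depth_loss = F.l1_loss(pred_depth[hand], gt_depth[hand])
    return w_mask * mask_loss + w_depth * depth_loss
```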
Abstract: This report outlines the proceedings of the Fourth International Workshop on Observing and Understanding Hands in Action (HANDS 2018). The fourth instantiation of this workshop attracted significant interest from both academia and industry. The program of the workshop included regular papers that are published as the workshop's proceedings, extended abstracts, invited posters, and invited talks. Topics of the submitted works, invited talks, and posters included novel methods for hand pose estimation from RGB, depth, or skeletal data, datasets for special cases and real-world applications, and techniques for hand motion re-targeting and hand gesture recognition. The invited speakers are leaders in their respective areas of specialization, coming from both industry and academia. The main conclusions that can be drawn are the community's turn towards RGB data and the maturation of some methods and techniques, which in turn has led to increasing interest in real-world applications.
Abstract: In this paper, we strive to answer two questions: What is the current state of 3D hand pose estimation from depth images? And what are the next challenges that need to be tackled? Following the successful Hands In the Million Challenge (HIM2017), we investigate the top 10 state-of-the-art methods on three tasks: single-frame 3D pose estimation, 3D hand tracking, and hand pose estimation during object interaction. We analyze the performance of different CNN structures with regard to hand shape, joint visibility, viewpoint, and articulation distributions. Our findings include: (1) isolated 3D hand pose estimation achieves low mean errors (10 mm) in the viewpoint range of [70, 120] degrees, but it is far from being solved for extreme viewpoints; (2) 3D volumetric representations outperform 2D CNNs, better capturing the spatial structure of the depth data; (3) discriminative methods still generalize poorly to unseen hand shapes; (4) while joint occlusions pose a challenge for most methods, explicit modeling of structure constraints can significantly narrow the gap between errors on visible and occluded joints.