Meysam Madadi

Multi-modal classification of forest biodiversity potential from 2D orthophotos and 3D airborne laser scanning point clouds

Jan 03, 2025

Simon B. Jensen, Stefan Oehmcke, Andreas Møgelmose, Meysam Madadi, Christian Igel, Sergio Escalera, Thomas B. Moeslund

Abstract: Accurate assessment of forest biodiversity is crucial for ecosystem management and conservation. While traditional field surveys provide high-quality assessments, they are labor-intensive and spatially limited. This study investigates whether deep learning-based fusion of close-range sensing data from 2D orthophotos (12.5 cm resolution) and 3D airborne laser scanning (ALS) point clouds (8 points/m^2) can enhance biodiversity assessment. We introduce the BioVista dataset, comprising 44,378 paired samples of orthophotos and ALS point clouds from temperate forests in Denmark, designed to explore multi-modal fusion approaches for biodiversity potential classification. Using deep neural networks (ResNet for orthophotos and PointVector for ALS point clouds), we investigate each data modality's ability to assess forest biodiversity potential, achieving mean accuracies of 69.4% and 72.8%, respectively. We explore two fusion approaches: a confidence-based ensemble method and a feature-level concatenation strategy, with the latter achieving a mean accuracy of 75.5%. Our results demonstrate that spectral information from orthophotos and structural information from ALS point clouds effectively complement each other in forest biodiversity assessment.
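
The two fusion approaches named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact rules, so the code below assumes that the confidence-based ensemble keeps the prediction of whichever model has the higher top softmax probability, and that feature-level fusion joins the two feature vectors before a shared classifier. The function names `confidence_ensemble` and `concat_features` are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_ensemble(image_logits, cloud_logits):
    """Assumed rule: trust the modality whose top class probability is higher.

    Returns (predicted class index, name of the modality that was used).
    """
    p_img = softmax(image_logits)
    p_pc = softmax(cloud_logits)
    if max(p_img) >= max(p_pc):
        return p_img.index(max(p_img)), "orthophoto"
    return p_pc.index(max(p_pc)), "point_cloud"

def concat_features(image_features, cloud_features):
    """Assumed feature-level fusion: join both feature vectors end to end."""
    return list(image_features) + list(cloud_features)

# Toy two-class example with made-up logits.
print(confidence_ensemble([2.0, 0.0], [0.0, 0.5]))
print(concat_features([0.1, 0.2], [0.3]))
```

In a real pipeline the concatenated vector would feed a classification head trained on both modalities; that head is omitted here.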

A Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping

Nov 05, 2023

Hunor Laczkó, Meysam Madadi, Sergio Escalera, Jordi Gonzalez

Abstract: RGB cloth generation has been studied in depth in the related literature; however, 3D garment generation remains an open problem. In this paper, we build a conditional variational autoencoder for 3D garment generation and draping. We propose a pyramid network to add garment details progressively in a canonical space, i.e. unposing and unshaping the garments w.r.t. the body. We study conditioning the network on surface normal UV maps, as an intermediate representation, which is an easier problem to optimize than 3D coordinates. Our results on two public datasets, CLOTH3D and CAPE, show that our model is robust, controllable in terms of detail generation by the use of multi-resolution pyramids, and achieves state-of-the-art results that generalize well to unseen garments, poses, and shapes even when training with small amounts of data.

* WACV24, IEEE copyright

Machine learning-based characterization of hydrochar from biomass: Implications for sustainable energy and material production

May 24, 2023

Alireza Shafizadeh, Hossein Shahbeik, Shahin Rafiee, Aysooda Moradi, Mohammadreza Shahbaz, Meysam Madadi, Cheng Li, Wanxi Peng, Meisam Tabatabaei, Mortaza Aghbashlo

Abstract: Hydrothermal carbonization (HTC) is a process that converts biomass into versatile hydrochar without the need for prior drying. The physicochemical properties of hydrochar are influenced by biomass properties and processing parameters, making it challenging to optimize for specific applications through trial-and-error experiments. To save time and money, machine learning can be used to develop a model that characterizes hydrochar produced from different biomass sources under varying reaction processing parameters. Thus, this study aims to develop an inclusive model to characterize hydrochar using a database covering a range of biomass types and reaction processing parameters. The quality and quantity of hydrochar are predicted using two models (decision tree regression and support vector regression). The decision tree regression model outperforms the support vector regression model in terms of forecast accuracy (R^2 > 0.88, RMSE < 6.848, and MAE < 4.718). Using an evolutionary algorithm, optimum inputs are identified based on cost functions provided by the selected model to optimize hydrochar for energy production, soil amendment, and pollutant adsorption, resulting in hydrochar yields of 84.31%, 84.91%, and 80.40%, respectively. The feature importance analysis reveals that biomass ash/carbon content and operating temperature are the primary factors affecting hydrochar production in the HTC process.

* Fuel 347, 1 September 2023, 128467
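
The abstract above compares models by R^2, RMSE, and MAE. As a reminder of what those three numbers measure, here is a small sketch using their standard definitions. The helper name `regression_metrics` and the toy values are illustrative assumptions, not data from the paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Standard R^2, RMSE, and MAE for paired true/predicted values.

    Assumes y_true is not constant, otherwise R^2 is undefined.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination
    return {"r2": r2, "rmse": rmse, "mae": mae}

# Made-up targets and predictions, for illustration only.
print(regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Higher R^2 and lower RMSE and MAE indicate a better fit, which is the sense in which the abstract reports one model outperforming the other.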

Blowing in the Wind: CycleNet for Human Cinemagraphs from Still Images

Mar 15, 2023

Hugo Bertiche, Niloy J. Mitra, Kuldeep Kulkarni, Chun-Hao Paul Huang, Tuanfeng Y. Wang, Meysam Madadi, Sergio Escalera, Duygu Ceylan

Abstract: Cinemagraphs are short looping videos created by adding subtle motions to a static image. This kind of media is popular and engaging. However, automatic generation of cinemagraphs is an underexplored area and current solutions require tedious low-level manual authoring by artists. In this paper, we present an automatic method that allows generating human cinemagraphs from single RGB images. We investigate the problem in the context of dressed humans under the wind. At the core of our method is a novel cyclic neural network that produces looping cinemagraphs for the target loop duration. To circumvent the problem of collecting real data, we demonstrate that it is possible, by working in the image normal space, to learn garment motion dynamics on synthetic data and generalize to real data. We evaluate our method on both synthetic and real data and demonstrate that it is possible to create compelling and plausible cinemagraphs from single RGB images.

Neural Cloth Simulation

Dec 13, 2022

Hugo Bertiche, Meysam Madadi, Sergio Escalera

Abstract: We present a general framework for the garment animation problem through unsupervised deep learning inspired by physically based simulation. Existing trends in the literature already explore this possibility. Nonetheless, these approaches do not handle cloth dynamics. Here, we propose the first methodology able to learn realistic cloth dynamics in an unsupervised manner and, hence, a general formulation for neural cloth simulation. The key to achieving this is to adapt an existing optimization scheme for motion from simulation-based methodologies to deep learning. Then, analyzing the nature of the problem, we devise an architecture able to automatically disentangle static and dynamic cloth subspaces by design. We show how this improves model performance. Additionally, this opens the possibility of a novel motion augmentation technique that greatly improves generalization. Finally, we show that it also allows controlling the level of motion in the predictions. This is a useful, previously unavailable tool for artists. We provide a detailed analysis of the problem to establish the basis of neural cloth simulation and guide future research into the specifics of this domain.

* Neural Cloth Simulation. ACM Trans. Graph. 41, 6, Article 220 (December 2022), 14 pages

Towards explaining the generalization gap in neural networks using topological data analysis

Mar 23, 2022

Rubén Ballester, Xavier Arnal Clemente, Carles Casacuberta, Meysam Madadi, Ciprian A. Corneanu, Sergio Escalera

Abstract: Understanding how neural networks generalize on unseen data is crucial for designing more robust and reliable models. In this paper, we study the generalization gap of neural networks using methods from topological data analysis. For this purpose, we compute homological persistence diagrams of weighted graphs constructed from neuron activation correlations after a training phase, aiming to capture patterns that are linked to the generalization capacity of the network. We compare the usefulness of different numerical summaries from persistence diagrams and show that a combination of some of them can accurately predict and partially explain the generalization gap without the need for a test set. Evaluation on two computer vision recognition tasks (CIFAR10 and SVHN) shows competitive generalization gap prediction when compared against state-of-the-art methods.

* 20 pages, 7 figures
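
To make the idea of a persistence diagram built from activation correlations more concrete, here is a minimal sketch restricted to dimension 0 (connected components). It assumes the graph weight between two neurons is 1 - |Pearson correlation| and that every neuron has non-constant activations; the paper's actual construction and homology dimensions may differ. The names `pearson` and `h0_persistence` are hypothetical. In dimension 0, each component is born at 0 and dies at the weight of the edge that merges it into another, which is what a Kruskal-style union-find pass records.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length activation vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def h0_persistence(activations):
    """Return (birth, death) pairs for components of the correlation graph.

    activations: one list of recorded activations per neuron.
    Assumed edge weight: 1 - |correlation|, so strongly correlated
    neurons merge early in the filtration.
    """
    n = len(activations)
    edges = sorted(
        (1.0 - abs(pearson(activations[i], activations[j])), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:          # this edge merges two components
            parent[root_i] = root_j
            pairs.append((0.0, weight))
    return pairs

# Three toy neurons: the first two are perfectly correlated.
diagram = h0_persistence([[1, 2, 3], [2, 4, 6], [3, 1, 2]])
print(diagram)
```

A numerical summary such as the mean death time of these pairs is one example of the kind of scalar feature the abstract refers to; the component that never dies is omitted from the returned list.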

Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019

Jan 11, 2022

Zhengying Liu, Adrien Pavao, Zhen Xu, Sergio Escalera, Fabio Ferreira, Isabelle Guyon, Sirui Hong, Frank Hutter, Rongrong Ji, Julio C. S. Jacques Junior (+15 more)

Abstract: This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sort out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high-level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an everlasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free "AutoDL self-service".

* IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI) 2021
* The first three authors contributed equally; this is only a draft version

Multi-Task Classification of Sewer Pipe Defects and Properties using a Cross-Task Graph Neural Network Decoder

Nov 15, 2021

Joakim Bruslund Haurum, Meysam Madadi, Sergio Escalera, Thomas B. Moeslund

Abstract: The sewerage infrastructure is one of the most important and expensive infrastructures in modern society. In order to efficiently manage the sewerage infrastructure, automated sewer inspection has to be utilized. However, while sewer defect classification has been investigated for decades, little attention has been given to classifying sewer pipe properties such as water level, pipe material, and pipe shape, which are needed to evaluate the level of sewer pipe deterioration. In this work, we classify sewer pipe defects and properties concurrently and present a novel decoder-focused multi-task classification architecture, Cross-Task Graph Neural Network (CT-GNN), which refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori based on the conditional probability between the task classes or determined dynamically using self-attention. CT-GNN can be added to any backbone and trained end-to-end at a small increase in the parameter count. We achieve state-of-the-art performance on all four classification tasks in the Sewer-ML dataset, improving defect classification and water level classification by 5.3 and 8.0 percentage points, respectively. We also outperform the single task methods as well as other multi-task classification approaches while introducing 50 times fewer parameters than previous model-focused approaches. The code and models are available at the project page http://vap.aau.dk/ctgnn

* WACV 2022
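
The abstract above mentions a cross-task graph determined a priori from the conditional probability between task classes. A minimal sketch of how such edge weights could be estimated from training labels follows. It assumes one label per task per sample and uses made-up task and class names; the function `conditional_probabilities` is hypothetical, and the thresholding or normalization used in CT-GNN is not shown because the abstract does not describe it.

```python
from collections import Counter
from itertools import permutations

def conditional_probabilities(samples):
    """Estimate P(class b of one task | class a of another task).

    samples: list of dicts mapping task name -> class label.
    Returns a dict keyed by ((task_a, class_a), (task_b, class_b)).
    """
    single = Counter()   # how often each (task, class) node occurs
    pair = Counter()     # how often an ordered pair of nodes co-occurs
    for sample in samples:
        nodes = list(sample.items())
        single.update(nodes)
        # Dict keys are unique, so every pair spans two different tasks.
        for a, b in permutations(nodes, 2):
            pair[(a, b)] += 1
    return {(a, b): count / single[a] for (a, b), count in pair.items()}

# Illustrative labels only; these are not Sewer-ML annotations.
samples = [
    {"defect": "crack", "material": "concrete"},
    {"defect": "crack", "material": "plastic"},
    {"defect": "none", "material": "plastic"},
]
probs = conditional_probabilities(samples)
print(probs[(("defect", "crack"), ("material", "concrete"))])
```

Each entry can be read as a directed edge weight from node a to node b, so the resulting graph is generally asymmetric.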

Neural Implicit Surfaces for Efficient and Accurate Collisions in Physically Based Simulations

Oct 03, 2021

Hugo Bertiche, Meysam Madadi, Sergio Escalera

Abstract: Current trends in the computer graphics community propose leveraging the massive parallel computational power of GPUs to accelerate physically based simulations. Collision detection and solving is a fundamental part of this process. It is also the most significant bottleneck in physically based simulations and it easily becomes intractable as the number of vertices in the scene increases. Brute force approaches carry a quadratic growth in both computational time and memory footprint. While their parallelization is trivial on GPUs, their complexity discourages their use. Acceleration structures -- such as BVH -- are often applied to increase performance, achieving logarithmic computational times for individual point queries. Nonetheless, their memory footprint also grows rapidly and their parallelization on a GPU is problematic due to their branching nature. We propose using implicit surface representations learnt through deep learning for collision handling in physically based simulations. Our proposed architecture has a complexity of O(n) -- or O(1) for a single point query -- and has no parallelization issues. We will show how this permits accurate and efficient collision handling in physically based simulations, more specifically, for cloth. In our experiments, we query up to 1M points in 300 milliseconds.
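
The O(n) claim in this abstract follows from each point being tested independently against an implicit surface. The sketch below illustrates that pattern with an analytic sphere signed distance function standing in for the learned network, which is an assumption made purely for illustration. The names `sphere_sdf` and `resolve_collisions`, and the projection rule (push a penetrating point out along the finite-difference gradient), are hypothetical and are not the paper's method.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside.

    Stand-in for a learned implicit surface network.
    """
    return math.dist(point, center) - radius

def resolve_collisions(points, sdf, eps=1e-4):
    """Push every penetrating point onto the surface; O(1) work per point."""
    resolved = []
    for p in points:
        d = sdf(p)
        if d >= 0.0:                       # outside or on the surface
            resolved.append(tuple(p))
            continue
        grad = []
        for axis in range(3):              # central finite differences
            hi, lo = list(p), list(p)
            hi[axis] += eps
            lo[axis] -= eps
            grad.append((sdf(hi) - sdf(lo)) / (2.0 * eps))
        # A zero gradient (e.g. at the sphere center) leaves the point in place.
        norm = math.sqrt(sum(g * g for g in grad)) or 1.0
        # d is negative, so subtracting d * normal moves the point outward.
        resolved.append(tuple(c - d * g / norm for c, g in zip(p, grad)))
    return resolved

# One point inside the unit sphere and one outside it.
print(resolve_collisions([(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)], sphere_sdf))
```

Because no point depends on any other, the loop body maps directly onto one GPU thread per point, which is the parallelization property the abstract highlights.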

ChaLearn Looking at People: Inpainting and Denoising challenges

Jun 24, 2021

Sergio Escalera, Marti Soler, Stephane Ayache, Umut Guclu, Jun Wan, Meysam Madadi, Xavier Baro, Hugo Jair Escalante, Isabelle Guyon

Abstract: Dealing with incomplete information is a well-studied problem in the context of machine learning and computational intelligence. However, in the context of computer vision, the problem has only been studied in specific scenarios (e.g., certain types of occlusions in specific types of images), although it is common to have incomplete information in visual data. This chapter describes the design of an academic competition focusing on inpainting of images and video sequences that was part of the competition program of WCCI2018 and had a satellite event collocated with ECCV2018. The ChaLearn Looking at People Inpainting Challenge aimed at advancing the state of the art on visual inpainting by promoting the development of methods for recovering missing and occluded information from images and video. Three tracks were proposed in which visual inpainting might be helpful but still challenging: human body pose estimation, text overlay removal and fingerprint denoising. This chapter describes the design of the challenge, which includes the release of three novel datasets, and the description of evaluation metrics, baselines and evaluation protocol. The results of the challenge are analyzed and discussed in detail and conclusions derived from this event are outlined.

* Inpainting and Denoising Challenges. The Springer Series on Challenges in Machine Learning. Springer, Cham. (2019)
