Chaonan Ji

Exploring Timeline Control for Facial Motion Generation

May 27, 2025

Yifeng Ma, Jinwei Qi, Chaonan Ji, Peng Zhang, Bang Zhang, Zhidong Deng, Liefeng Bo

Abstract: This paper introduces a new control signal for facial motion generation: timeline control. Compared to audio and text signals, timelines provide more fine-grained control, such as generating specific facial motions with precise timing. Users can specify a multi-track timeline of facial actions arranged in temporal intervals, allowing precise control over the timing of each action. To model the timeline control capability, we first annotate the time intervals of facial actions in natural facial motion sequences at a frame-level granularity. This process is facilitated by Toeplitz Inverse Covariance-based Clustering to minimize human labor. Based on the annotations, we propose a diffusion-based generation model capable of generating facial motions that are natural and accurately aligned with input timelines. Our method supports text-guided motion generation by using ChatGPT to convert text into timelines. Experimental results show that our method can annotate facial action intervals with satisfactory accuracy, and produces natural facial motions accurately aligned with timelines.

* Accepted by CVPR 2025, Project Page: https://humanaigc.github.io/facial-motion-timeline-control/
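
To make the idea of a multi-track timeline concrete, the following is a minimal sketch of how intervals of facial actions could be rasterized into frame-level labels. It is not the paper's implementation: the `Interval` class, the `rasterize` function, the action names, and the 25 fps frame rate are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    """One facial action on a track (hypothetical structure, not from the paper)."""
    action: str     # illustrative label, e.g. "blink"
    start_s: float  # start time in seconds (inclusive)
    end_s: float    # end time in seconds (exclusive)


def rasterize(tracks: List[List[Interval]], num_frames: int,
              fps: float = 25.0) -> List[List[Optional[str]]]:
    """Convert a multi-track timeline into per-frame labels.

    Returns one row per track; each entry is the action active at that
    frame, or None when the track is idle. The fps value is an assumption.
    """
    grid = []
    for track in tracks:
        row: List[Optional[str]] = [None] * num_frames
        for iv in track:
            # Clamp the frame range to the sequence length.
            first = max(0, round(iv.start_s * fps))
            last = min(num_frames, round(iv.end_s * fps))
            for frame in range(first, last):
                row[frame] = iv.action
        grid.append(row)
    return grid


# Two tracks over one second of motion: eyes and mouth.
timeline = [
    [Interval("blink", 0.2, 0.4)],
    [Interval("smile", 0.0, 0.4), Interval("frown", 0.6, 1.0)],
]
labels = rasterize(timeline, num_frames=25, fps=25.0)
```

Keeping one row per track lets actions on different tracks overlap in time, which is the property a single label sequence could not express.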

ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model

Mar 27, 2025

Jinwei Qi, Chaonan Ji, Sheng Xu, Peng Zhang, Bang Zhang, Liefeng Bo

Abstract: Real-time interactive video-chat portraits have been increasingly recognized as the future trend, particularly due to the remarkable progress made in text and voice chat technologies. However, existing methods primarily focus on real-time generation of head movements, but struggle to produce synchronized body motions that match these head actions. Additionally, achieving fine-grained control over the speaking style and nuances of facial expressions remains a challenge. To address these limitations, we introduce a novel framework for stylized real-time portrait video generation, enabling expressive and flexible video chat that extends from talking head to upper-body interaction. Our approach consists of the following two stages. The first stage involves efficient hierarchical motion diffusion models that take both explicit and implicit motion representations into account based on audio inputs, and can generate a diverse range of facial expressions with stylistic control and synchronization between head and body movements. The second stage aims to generate portrait video featuring upper-body movements, including hand gestures. We inject explicit hand control signals into the generator to produce more detailed hand movements, and further perform face refinement to enhance the overall realism and expressiveness of the portrait video. Additionally, our approach supports efficient and continuous generation of upper-body portrait video at a maximum resolution of 512 × 768 and up to 30 fps on a 4090 GPU, supporting interactive video chat in real time. Experimental results demonstrate the capability of our approach to produce portrait videos with rich expressiveness and natural upper-body movements.

* Project Page: https://humanaigc.github.io/chat-anyone/

Explainable Earth Surface Forecasting under Extreme Events

Oct 02, 2024

Oscar J. Pellicer-Valero, Miguel-Ángel Fernández-Torres, Chaonan Ji, Miguel D. Mahecha, Gustau Camps-Valls

Abstract: With climate change-related extreme events on the rise, high dimensional Earth observation data presents a unique opportunity for forecasting and understanding impacts on ecosystems. This is, however, impeded by the complexity of processing, visualizing, modeling, and explaining this data. To showcase how this challenge can be met, here we train a convolutional long short-term memory-based architecture on the novel DeepExtremeCubes dataset. DeepExtremeCubes includes around 40,000 long-term Sentinel-2 minicubes (January 2016-October 2022) worldwide, along with labeled extreme events, meteorological data, vegetation land cover, and topography maps, sampled from locations affected by extreme climate events and surrounding areas. When predicting future reflectances and vegetation impacts through kernel normalized difference vegetation index, the model achieved an R$^2$ score of 0.9055 in the test set. Explainable artificial intelligence was used to analyze the model's predictions during the October 2020 Central South America compound heatwave and drought event. We chose the same area exactly one year before the event as a counterfactual, finding that the average temperature and surface pressure are generally the best predictors under normal conditions. In contrast, minimum anomalies of evaporation and surface latent heat flux take the lead during the event. A change of regime is also observed in the attributions before the event, which might help assess how long the event was brewing before happening. The code to replicate all experiments and figures in this paper is publicly available at https://github.com/DeepExtremes/txyXAI
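
For readers unfamiliar with the two quantities named in this abstract, here is a minimal sketch of the kernel normalized difference vegetation index and the R$^2$ score. It assumes the commonly used simplification in which the kernel length scale is the mean of the two reflectances, so that kNDVI reduces to tanh(NDVI^2); the abstract does not state which setting the authors used, and the function names `kndvi` and `r2_score` are illustrative rather than taken from the linked repository.

```python
import math
from typing import Sequence


def kndvi(nir: float, red: float) -> float:
    """kNDVI under the assumption sigma = 0.5 * (nir + red).

    With that choice the general form tanh(((nir - red) / (2 * sigma)) ** 2)
    simplifies to tanh(NDVI ** 2).
    """
    ndvi = (nir - red) / (nir + red)
    return math.tanh(ndvi ** 2)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative reflectance values, not data from the paper.
dense_vegetation = kndvi(nir=0.6, red=0.2)
bare_surface = kndvi(nir=0.5, red=0.5)
```

An R$^2$ of 1 means the predictions match the targets exactly, while 0 means the model does no better than always predicting the mean of the targets.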

Earth System Data Cubes: Avenues for advancing Earth system research

Aug 05, 2024

David Montero, Guido Kraemer, Anca Anghelea, César Aybar, Gunnar Brandt, Gustau Camps-Valls, Felix Cremer, Ida Flik, Fabian Gans, Sarah Habershon(+8 more)

Abstract: Recent advancements in Earth system science have been marked by the exponential increase in the availability of diverse, multivariate datasets characterised by moderate to high spatio-temporal resolutions. Earth System Data Cubes (ESDCs) have emerged as one suitable solution for transforming this flood of data into a simple yet robust data structure. ESDCs achieve this by organising data into an analysis-ready format aligned with a spatio-temporal grid, facilitating user-friendly analysis and diminishing the need for extensive technical data processing knowledge. Despite these significant benefits, the completion of the entire ESDC life cycle remains a challenging task. Obstacles are not only of a technical nature but also relate to domain-specific problems in Earth system research. There exist barriers to realising the full potential of data collections in light of novel cloud-based technologies, particularly in curating data tailored for specific application domains. These include transforming data to conform to a spatio-temporal grid with minimum distortions and managing complexities such as spatio-temporal autocorrelation issues. Addressing these challenges is pivotal for the effective application of Artificial Intelligence (AI) approaches. Furthermore, adhering to open science principles for data dissemination, reproducibility, visualisation, and reuse is crucial for fostering sustainable research. Overcoming these challenges offers a substantial opportunity to advance data-driven Earth system research, unlocking the full potential of an integrated, multidimensional view of Earth system processes. This is particularly true when such research is coupled with innovative research paradigms and technological progress.

DeepExtremeCubes: Integrating Earth system spatio-temporal data for impact assessment of climate extremes

Jun 26, 2024

Chaonan Ji, Tonio Fincke, Vitus Benson, Gustau Camps-Valls, Miguel-Angel Fernandez-Torres, Fabian Gans, Guido Kraemer, Francesco Martinuzzi, David Montero, Karin Mora(+5 more)

Abstract: With climate extremes' rising frequency and intensity, robust analytical tools are crucial to predict their impacts on terrestrial ecosystems. Machine learning techniques show promise but require well-structured, high-quality, and curated analysis-ready datasets. Earth observation datasets comprehensively monitor ecosystem dynamics and responses to climatic extremes, yet the data complexity can challenge the effectiveness of machine learning models. Despite recent progress in applying deep learning to ecosystem monitoring, there is a need for datasets specifically designed to analyse the impacts of compound heatwave and drought extremes. Here, we introduce the DeepExtremeCubes database, tailored to map around these extremes, focusing on persistent natural vegetation. It comprises over 40,000 spatially sampled small data cubes (i.e. minicubes) globally, with a spatial coverage of 2.5 by 2.5 km. Each minicube includes (i) Sentinel-2 L2A images, (ii) ERA5-Land variables and a generated extreme event cube covering 2016 to 2022, and (iii) ancillary land cover and topography maps. The paper aims to (1) streamline data accessibility, structuring, and pre-processing, and enhance scientific reproducibility, and (2) facilitate biosphere dynamics forecasting in response to compound extremes.

On-Demand Earth System Data Cubes

Apr 19, 2024

David Montero, César Aybar, Chaonan Ji, Guido Kraemer, Maximilian Söchting, Khalil Teber, Miguel D. Mahecha

Abstract: Advancements in Earth system science have seen a surge in diverse datasets. Earth System Data Cubes (ESDCs) have been introduced to efficiently handle this influx of high-dimensional data. ESDCs offer a structured, intuitive framework for data analysis, organising information within spatio-temporal grids. The structured nature of ESDCs unlocks significant opportunities for Artificial Intelligence (AI) applications. By providing well-organised data, ESDCs are ideally suited for a wide range of sophisticated AI-driven tasks. An automated framework for creating AI-focused ESDCs with minimal user input could significantly accelerate the generation of task-specific training data. Here we introduce cubo, an open-source Python tool designed for easy generation of AI-focused ESDCs. Utilising collections in SpatioTemporal Asset Catalogs (STAC) that are stored as Cloud Optimised GeoTIFFs (COGs), cubo efficiently creates ESDCs, requiring only central coordinates, spatial resolution, edge size, and time range.

* Accepted at IGARSS24
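
The abstract above says a cube is defined by central coordinates, spatial resolution, and edge size. The sketch below only illustrates the geometry those three parameters imply; it is not cubo's API. The function name `cube_bounds` is hypothetical, and the example assumes the centre is already expressed in a projected coordinate system with metre units.

```python
from typing import Tuple


def cube_bounds(center_x: float, center_y: float,
                edge_size: int, resolution: float) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of a square cube footprint.

    edge_size is the number of pixels per side and resolution is the pixel
    size in metres, so each side spans edge_size * resolution metres.
    Hypothetical helper for illustration; not part of cubo.
    """
    half_side = edge_size * resolution / 2.0
    return (center_x - half_side, center_y - half_side,
            center_x + half_side, center_y + half_side)


# A 64-pixel cube at 10 m resolution covers 640 m by 640 m.
bounds = cube_bounds(500000.0, 4000000.0, edge_size=64, resolution=10.0)
```

Specifying the footprint this way means the user never has to build a bounding box by hand, which is the kind of minimal input the abstract describes.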

Geometry-aware Single-image Full-body Human Relighting

Jul 12, 2022

Chaonan Ji, Tao Yu, Kaiwen Guo, Jingxin Liu, Yebin Liu

Abstract: Single-image human relighting aims to relight a target human under new lighting conditions by decomposing the input image into albedo, shape and lighting. Although plausible relighting results can be achieved, previous methods suffer from both the entanglement between albedo and lighting and the lack of hard shadows, which significantly decrease the realism. To tackle these two problems, we propose a geometry-aware single-image human relighting framework that leverages single-image geometry reconstruction for joint deployment of traditional graphics rendering and neural rendering techniques. For the de-lighting, we explore the shortcomings of the UNet architecture and propose a modified HRNet, achieving better disentanglement between albedo and lighting. For the relighting, we introduce a ray tracing-based per-pixel lighting representation that explicitly models high-frequency shadows and propose a learning-based shading refinement module to restore realistic shadows (including hard cast shadows) from the ray-traced shading maps. Our framework is able to generate photo-realistic high-frequency shadows such as cast shadows under challenging lighting conditions. Extensive experiments demonstrate that our proposed method outperforms previous methods on both synthetic and real images.

* Accepted by ECCV 2022

AvatarCap: Animatable Avatar Conditioned Monocular Human Volumetric Capture

Jul 12, 2022

Zhe Li, Zerong Zheng, Hongwen Zhang, Chaonan Ji, Yebin Liu

Abstract: To address the ill-posed problem caused by partial observations in monocular human volumetric capture, we present AvatarCap, a novel framework that introduces animatable avatars into the capture pipeline for high-fidelity reconstruction in both visible and invisible regions. Our method first creates an animatable avatar for the subject from a small number (~20) of 3D scans as a prior. Then given a monocular RGB video of this subject, our method integrates information from both the image observation and the avatar prior, and accordingly reconstructs high-fidelity 3D textured models with dynamic details regardless of the visibility. To learn an effective avatar for volumetric capture from only a few samples, we propose GeoTexAvatar, which leverages both geometry and texture supervisions to constrain the pose-dependent dynamics in a decomposed implicit manner. An avatar-conditioned volumetric capture method that involves a canonical normal fusion and a reconstruction network is further proposed to integrate both image observations and avatar dynamics for high-fidelity reconstruction in both observed and invisible regions. Overall, our method enables monocular human volumetric capture with detailed and pose-dependent dynamics, and the experiments show that our method outperforms the state of the art. Code is available at https://github.com/lizhe00/AvatarCap.

* Accepted by ECCV 2022, project page: http://www.liuyebin.com/avatarcap/avatarcap.html, code: https://github.com/lizhe00/AvatarCap

Vehicle Reconstruction and Texture Estimation Using Deep Implicit Semantic Template Mapping

Nov 30, 2020

Xiaochen Zhao, Zerong Zheng, Chaonan Ji, Zhenyi Liu, Yirui Luo, Tao Yu, Jinli Suo, Qionghai Dai, Yebin Liu

Abstract: We introduce VERTEX, an effective solution to recover 3D shape and intrinsic texture of vehicles from uncalibrated monocular input in real-world street environments. To fully utilize the template prior of vehicles, we propose a novel geometry and texture joint representation, based on implicit semantic template mapping. Compared to existing representations which infer 3D texture distribution, our method explicitly constrains the texture distribution on the 2D surface of the template and avoids the limitations of fixed resolution and topology. Moreover, by fusing the global and local features together, our approach is capable of generating consistent and detailed texture in both visible and invisible areas. We also contribute a new synthetic dataset containing 830 elaborate textured car models labeled with sparse key points and rendered using a Physically Based Rendering (PBRT) system with measured HDRI skymaps to obtain highly realistic images. Experiments demonstrate the superior performance of our approach on both the testing dataset and in-the-wild images. Furthermore, the presented technique enables additional applications such as 3D vehicle texture transfer and material identification.