Ziteng Cui

Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment

Apr 02, 2025

Ziteng Cui, Xuangeng Chu, Tatsuya Harada

Abstract:Capturing high-quality photographs under diverse real-world lighting conditions is challenging, as both natural lighting (e.g., low-light) and camera exposure settings (e.g., exposure time) significantly impact image quality. This challenge becomes more pronounced in multi-view scenarios, where variations in lighting and image signal processor (ISP) settings across viewpoints introduce photometric inconsistencies. Such lighting degradations and view-dependent variations pose substantial challenges to novel view synthesis (NVS) frameworks based on Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). To address this, we introduce Luminance-GS, a novel approach to achieving high-quality novel view synthesis results under diverse challenging lighting conditions using 3DGS. By adopting per-view color matrix mapping and view-adaptive curve adjustments, Luminance-GS achieves state-of-the-art (SOTA) results across various lighting conditions -- including low-light, overexposure, and varying exposure -- while not altering the original 3DGS explicit representation. Compared to previous NeRF- and 3DGS-based baselines, Luminance-GS provides real-time rendering speed with improved reconstruction quality.

* CVPR 2025, project page: https://cuiziteng.github.io/Luminance_GS_web/

ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model

Feb 28, 2025

Xuangeng Chu, Nabarun Goswami, Ziteng Cui, Hanqin Wang, Tatsuya Harada

Abstract:Speech-driven 3D facial animation aims to generate realistic lip movements and facial expressions for 3D head models from arbitrary audio clips. Although existing diffusion-based methods are capable of producing natural motions, their slow generation speed limits their application potential. In this paper, we introduce a novel autoregressive model that achieves real-time generation of highly synchronized lip movements and realistic head poses and eye blinks by learning a mapping from speech to a multi-scale motion codebook. Furthermore, our model can adapt to unseen speaking styles using sample motion sequences, enabling the creation of 3D talking avatars with unique personal styles beyond the identities seen during training. Extensive evaluations and user studies demonstrate that our method outperforms existing approaches in lip synchronization accuracy and perceived quality.

* More video demonstrations, code, models and data can be found on our project website: http://xg-chu.site/project_artalk/

Discovering an Image-Adaptive Coordinate System for Photography Processing

Jan 11, 2025

Ziteng Cui, Lin Gu, Tatsuya Harada

Abstract:Curve & Lookup Table (LUT) based methods directly map a pixel to the target output, making them highly efficient tools for real-time photography processing. However, due to extreme memory complexity to learn full RGB space mapping, existing methods either sample a discretized 3D lattice to build a 3D LUT or decompose into three separate curves (1D LUTs) on the RGB channels. Here, we propose a novel algorithm, IAC, to learn an image-adaptive Cartesian coordinate system in the RGB color space before performing curve operations. This end-to-end trainable approach enables us to efficiently adjust images with a jointly learned image-adaptive coordinate system and curves. Experimental results demonstrate that this simple strategy achieves state-of-the-art (SOTA) performance in various photography processing tasks, including photo retouching, exposure correction, and white-balance editing, while also maintaining a lightweight design and fast inference speed.

* BMVC 2024
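
The idea in the abstract above (change coordinates in RGB space, then apply per-axis curves) can be illustrated with a minimal sketch. It is not the IAC implementation: the basis and curves here are fixed by hand, whereas the abstract describes them as learned per image, and all function names (`apply_curve`, `adjust_pixel`) are hypothetical.

```python
from bisect import bisect_right

def apply_curve(x, knots_x, knots_y):
    """Piecewise-linear 1D curve (a 1D LUT with linear interpolation)."""
    if x <= knots_x[0]:
        return knots_y[0]
    if x >= knots_x[-1]:
        return knots_y[-1]
    i = bisect_right(knots_x, x) - 1
    t = (x - knots_x[i]) / (knots_x[i + 1] - knots_x[i])
    return knots_y[i] + t * (knots_y[i + 1] - knots_y[i])

def matvec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def adjust_pixel(rgb, basis, curves):
    """Map one pixel through a coordinate change plus three 1D curves.

    basis:  3x3 orthonormal matrix whose rows are the new axes (assumed given).
    curves: three (knots_x, knots_y) pairs, one per new axis.
    """
    coords = matvec(basis, rgb)                      # RGB -> new coordinates
    mapped = [apply_curve(c, kx, ky) for c, (kx, ky) in zip(coords, curves)]
    return matvec(transpose(basis), mapped)          # inverse of orthonormal = transpose

identity_curve = ([0.0, 1.0], [0.0, 1.0])
halve_curve = ([0.0, 1.0], [0.0, 0.5])
swap_rg = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]          # first new axis is the G channel
result = adjust_pixel([0.2, 0.4, 0.6], swap_rg,
                      [halve_curve, identity_curve, identity_curve])
```

With the identity basis this reduces to three independent curves on R, G and B, which is the "three separate 1D LUTs" baseline the abstract mentions; a different basis lets the same three curves act along other directions in colour space.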

Emergence of Painting Ability via Recognition-Driven Evolution

Jan 09, 2025

Yi Lin, Lin Gu, Ziteng Cui, Shenghan Su, Yumo Hao, Yingtao Tian, Tatsuya Harada, Jianfei Yang

Abstract:From Paleolithic cave paintings to Impressionism, human painting has evolved to depict increasingly complex and detailed scenes, conveying more nuanced messages. This paper attempts to elicit this artistic capability by simulating the evolutionary pressures that enhance visual communication efficiency. Specifically, we present a model with a stroke branch and a palette branch that together simulate human-like painting. The palette branch learns a limited colour palette, while the stroke branch parameterises each stroke using Bézier curves to render an image, which is subsequently evaluated by a high-level recognition module. We quantify the efficiency of visual communication by measuring the recognition accuracy achieved with machine vision. The model then optimises the control points and colour choices for each stroke to maximise recognition accuracy with minimal strokes and colours. Experimental results show that our model achieves superior performance in high-level recognition tasks, delivering artistic expression and aesthetic appeal, especially in abstract sketches. Additionally, our approach shows promise as an efficient bit-level image compression technique, outperforming traditional methods.
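
To make the stroke parameterisation concrete, here is a minimal sketch of how a stroke defined by Bézier control points can be turned into points along a curve, using de Casteljau's algorithm. It only covers curve evaluation; the rendering, palette, and recognition components described in the abstract are not modelled, and the names `bezier_point` and `sample_stroke` are hypothetical.

```python
def bezier_point(control_points, t):
    """Evaluate a Bézier curve of any degree at t in [0, 1] (de Casteljau)."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        # Repeatedly interpolate between neighbouring points.
        pts = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]

def sample_stroke(control_points, n):
    """Return n evenly spaced (in t) points along the stroke, n >= 2."""
    return [bezier_point(control_points, i / (n - 1)) for i in range(n)]

# A quadratic stroke: starts at (0, 0), bends towards (1, 2), ends at (2, 0).
stroke = sample_stroke([(0, 0), (1, 2), (2, 0)], 5)
```

Because the sampled points depend smoothly on the control points, an optimiser can adjust the control points directly, which is consistent with the abstract's description of optimising them for recognition accuracy.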

Paleoinspired Vision: From Exploring Colour Vision Evolution to Inspiring Camera Design

Dec 27, 2024

Junjie Zhang, Zhimin Zong, Lin Gu, Shenghan Su, Ziteng Cui, Yan Pu, Zirui Chen, Jing Lu, Daisuke Kojima, Tatsuya Harada (+1 more)

Abstract:The evolution of colour vision is captivating, as it reveals the adaptive strategies of extinct species while simultaneously inspiring innovations in modern imaging technology. In this study, we present a simplified model of visual transduction in the retina, introducing a novel opsin layer. We quantify evolutionary pressures by measuring machine vision recognition accuracy on colour images shaped by specific opsins. Building on this, we develop an evolutionary conservation optimisation algorithm to reconstruct the spectral sensitivity of opsins, enabling mutation-driven adaptations to more effectively spot fruits or predators. This model condenses millions of years of evolution within seconds on a GPU, providing an experimental framework to test long-standing hypotheses in evolutionary biology, such as the vision of early mammals, primate trichromacy from gene duplication, the retention of colour blindness, the blue-shift of fish rods, and multiple rod opsins with bioluminescence. Moreover, the model enables speculative explorations of hypothetical species, such as organisms with eyes adapted to the conditions on Mars. Our findings suggest a minimalist yet effective approach to task-specific camera filter design, optimising the spectral response function to meet application-driven demands. The code will be made publicly available upon acceptance.

* 15 pages, 6 figures

RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images

Aug 27, 2024

Ziteng Cui, Tatsuya Harada

Abstract:sRGB images are now the predominant choice for pre-training visual models in computer vision research, owing to their ease of acquisition and efficient storage. Meanwhile, the advantage of RAW images lies in their rich physical information under variable real-world challenging lighting conditions. For computer vision tasks directly based on camera RAW data, most existing studies adopt methods of integrating image signal processor (ISP) with backend networks, yet often overlook the interaction capabilities between the ISP stages and subsequent networks. Drawing inspiration from ongoing adapter research in NLP and CV areas, we introduce RAW-Adapter, a novel approach aimed at adapting sRGB pre-trained models to camera RAW data. RAW-Adapter comprises input-level adapters that employ learnable ISP stages to adjust RAW inputs, as well as model-level adapters to build connections between ISP stages and subsequent high-level networks. Additionally, RAW-Adapter is a general framework that could be used in various computer vision frameworks. Abundant experiments under different lighting conditions have shown our algorithm's state-of-the-art (SOTA) performance, demonstrating its effectiveness and efficiency across a range of real-world and synthetic datasets.

* ECCV 2024, code link: https://github.com/cuiziteng/ECCV_RAW_Adapter

Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption

Dec 15, 2023

Ziteng Cui, Lin Gu, Xiao Sun, Xianzheng Ma, Yu Qiao, Tatsuya Harada

Abstract:The standard Neural Radiance Fields (NeRF) paradigm employs a viewer-centered methodology, entangling the aspects of illumination and material reflectance into emission solely from 3D points. This simplified rendering approach presents challenges in accurately modeling images captured under adverse lighting conditions, such as low light or over-exposure. Motivated by the ancient Greek emission theory that posits visual perception as a result of rays emanating from the eyes, we slightly refine the conventional NeRF framework to train NeRF under challenging light conditions and generate novel views under normal lighting in an unsupervised manner. We introduce the concept of a "Concealing Field," which assigns transmittance values to the surrounding air to account for illumination effects. In dark scenarios, we assume that object emissions maintain a standard lighting level but are attenuated as they traverse the air during the rendering process. The Concealing Field thus compels NeRF to learn reasonable density and colour estimations for objects even in dimly lit situations. Similarly, the Concealing Field can mitigate over-exposed emissions during the rendering stage. Furthermore, we present a comprehensive multi-view dataset captured under challenging illumination conditions for evaluation. Our code and dataset are available at https://github.com/cuiziteng/Aleth-NeRF

* AAAI 2024, code available at https://github.com/cuiziteng/Aleth-NeRF ; modified version of previous paper arXiv:2303.05807
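
The attenuation idea in the abstract above can be sketched for a single ray. This is a simplified reading, not the formulation from the paper or its repository: it assumes a per-sample concealing factor in [0, 1] that multiplies the accumulated transmittance, uses scalar radiance instead of RGB, and the function name `render_ray` is hypothetical.

```python
def render_ray(alphas, colours, conceal=None):
    """Composite samples front to back along one ray.

    alphas:  per-sample opacity in [0, 1].
    colours: per-sample scalar radiance.
    conceal: optional per-sample factor in [0, 1] that attenuates the light
             reaching the camera (assumed form of a concealing field).
    """
    transmittance = 1.0
    pixel = 0.0
    for i, (a, c) in enumerate(zip(alphas, colours)):
        if conceal is not None:
            transmittance *= conceal[i]   # extra attenuation by the "air"
        pixel += transmittance * a * c    # standard volume-rendering term
        transmittance *= (1.0 - a)        # light blocked by this sample
    return pixel

alphas = [0.5, 1.0]
colours = [1.0, 1.0]
dark = render_ray(alphas, colours, conceal=[0.5, 0.5])  # attenuated rendering
lit = render_ray(alphas, colours)                       # concealing removed
```

In this toy setup the same densities and colours yield a darker pixel when the concealing factors are applied and a brighter one when they are dropped, which mirrors the abstract's description of objects keeping a standard lighting level while the air attenuates their emission.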

Aleth-NeRF: Low-light Condition View Synthesis with Concealing Fields

Mar 10, 2023

Ziteng Cui, Lin Gu, Xiao Sun, Yu Qiao, Tatsuya Harada

Abstract:Commonly captured low-light scenes are challenging for most computer vision techniques, including Neural Radiance Fields (NeRF). Vanilla NeRF is viewer-centred: it simplifies the rendering process to light emission from 3D locations in the viewing direction, and thus fails to model the darkness induced by low illumination. Inspired by the ancient Greek emission theory, which holds that visual perception is accomplished by rays cast from the eyes, we make slight modifications to vanilla NeRF so that it trains on multiple views of a low-light scene and can render the well-lit scene in an unsupervised manner. We introduce a surrogate concept, Concealing Fields, that reduces the transport of light during the volume rendering stage. Specifically, our proposed method, Aleth-NeRF, learns directly from the dark images to understand the volumetric object representation and the concealing field under priors. By simply eliminating Concealing Fields, we can render single- or multi-view well-lit images and gain superior performance over other 2D low-light enhancement methods. Additionally, we collect the first paired LOw-light and normal-light Multi-view (LOM) dataset for future research.

* website page: https://cuiziteng.github.io/Aleth_NeRF_web/

Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer

Dec 07, 2022

Shenghan Su, Lin Gu, Ziteng Cui, Yue Yang, Jingjing Shen, Hiroaki Yamane, Zenghui Zhang, Tatsuya Harada

Abstract:The long-standing theory that a colour-naming system evolves under the dual pressure of efficient communication and perceptual mechanism is supported by more and more linguistic studies, including the analysis of four decades' diachronic data from the Nafaanra language. This inspires us to explore whether artificial intelligence could evolve and discover a similar colour-naming system by optimising the communication efficiency represented by high-level recognition performance. Here, we propose a novel colour quantisation transformer, CQFormer, that quantises colour space while maintaining the accuracy of machine recognition on the quantised images. Given an RGB image, the Annotation Branch maps it into an index map before generating the quantised image with a colour palette, while the Palette Branch uses a key-point detection approach to find proper colours for the palette within the whole colour space. By interacting with colour annotation, CQFormer is able to balance machine vision accuracy against colour perceptual structure, such as a distinct and stable colour distribution for the discovered colour system. Interestingly, we even observe a consistent evolution pattern between our artificial colour system and basic colour terms across human languages. In addition, our colour quantisation method effectively compresses image storage while maintaining high performance in high-level recognition tasks such as classification and detection. Extensive experiments demonstrate the superior performance of our method with extremely low bit-rate colours. We will release the source code soon.
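
The index map and palette mentioned in the abstract above can be illustrated with a minimal sketch. It is not CQFormer: a nearest-colour rule stands in for the learned Annotation Branch, the palette is given by hand rather than discovered, and the names `quantise` and `bits_per_pixel` are hypothetical.

```python
def quantise(image, palette):
    """Quantise an image against a fixed palette.

    image:   list of rows, each a list of (r, g, b) tuples.
    palette: list of (r, g, b) tuples.
    Returns (index_map, quantised_image).
    """
    def nearest(px):
        # Index of the palette colour with the smallest squared distance.
        return min(range(len(palette)),
                   key=lambda k: sum((a - b) ** 2
                                     for a, b in zip(px, palette[k])))

    index_map = [[nearest(px) for px in row] for row in image]
    quantised = [[palette[k] for k in row] for row in index_map]
    return index_map, quantised

def bits_per_pixel(palette_size):
    """Bits needed to store one palette index."""
    return max(1, (palette_size - 1).bit_length())

palette = [(0, 0, 0), (255, 255, 255)]
image = [[(10, 10, 10), (250, 240, 245)]]
index_map, quantised = quantise(image, palette)
```

Storing the index map plus the palette instead of full RGB values is what makes the low bit-rate setting in the abstract possible: a two-colour palette needs one bit per pixel, compared with 24 bits for 8-bit RGB.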

Improving Fairness in Image Classification via Sketching

Oct 31, 2022

Ruichen Yao, Ziteng Cui, Xiaoxiao Li, Lin Gu

Abstract:Fairness is a fundamental requirement for trustworthy and human-centered Artificial Intelligence (AI) systems. However, deep neural networks (DNNs) tend to make unfair predictions when the training data are collected from different sub-populations with different attributes (e.g., color, sex, age), leading to biased DNN predictions. We notice that such a troubling phenomenon is often caused by the data itself, which means that bias information is encoded into the DNN along with the useful information (e.g., class information, semantic information). Therefore, we propose to use sketching to handle this phenomenon. Without losing the utility of data, we explore image-to-sketching methods that can maintain useful semantic information for the target classification while filtering out the useless bias information. In addition, we design a fair loss to further improve the model fairness. We evaluate our method through extensive experiments on both a general scene dataset and a medical scene dataset. Our results show that the desired image-to-sketching method improves model fairness and achieves satisfactory results compared with state-of-the-art methods.

* 8 pages, 2 figures. To appear in 2022 Trustworthy and Socially Responsible Machine Learning (TSRML 2022) co-located with NeurIPS 2022
