Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhiqiang Tian

ERetinex: Event Camera Meets Retinex Theory for Low-Light Image Enhancement

Mar 04, 2025

Xuejian Guo, Zhiqiang Tian, Yuehang Wang, Siqi Li, Yu Jiang, Shaoyi Du, Yue Gao

Abstract:Low-light image enhancement aims to restore the under-exposure image captured in dark scenarios. Under such scenarios, traditional frame-based cameras may fail to capture the structure and color information due to the exposure time limitation. Event cameras are bio-inspired vision sensors that respond to pixel-wise brightness changes asynchronously. Event cameras' high dynamic range is pivotal for visual perception in extreme low-light scenarios, surpassing traditional cameras and enabling applications in challenging dark environments. In this paper, inspired by the success of the retinex theory for traditional frame-based low-light image restoration, we introduce the first methods that combine the retinex theory with event cameras and propose a novel retinex-based low-light image restoration framework named ERetinex. Among our contributions, the first is developing a new approach that leverages the high temporal resolution data from event cameras with traditional image information to estimate scene illumination accurately. This method outperforms traditional image-only techniques, especially in low-light environments, by providing more precise lighting information. Additionally, we propose an effective fusion strategy that combines the high dynamic range data from event cameras with the color information of traditional images to enhance image quality. Through this fusion, we can generate clearer and more detail-rich images, maintaining the integrity of visual information even under extreme lighting conditions. The experimental results indicate that our proposed method outperforms state-of-the-art (SOTA) methods, achieving a gain of 1.0613 dB in PSNR while reducing FLOPS by \textbf{84.28}\%.

* Accepted to ICRA 2025

Via

Access Paper or Ask Questions

SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field

Mar 21, 2024

Lizhe Liu, Bohua Wang, Hongwei Xie, Daqi Liu, Li Liu, Zhiqiang Tian, Kuiyuan Yang, Bing Wang

Abstract:Vision-centric 3D environment understanding is both vital and challenging for autonomous driving systems. Recently, object-free methods have attracted considerable attention. Such methods perceive the world by predicting the semantics of discrete voxel grids but fail to construct continuous and accurate obstacle surfaces. To this end, in this paper, we propose SurroundSDF to implicitly predict the signed distance field (SDF) and semantic field for the continuous perception from surround images. Specifically, we introduce a query-based approach and utilize SDF constrained by the Eikonal formulation to accurately describe the surfaces of obstacles. Furthermore, considering the absence of precise SDF ground truth, we propose a novel weakly supervised paradigm for SDF, referred to as the Sandwich Eikonal formulation, which emphasizes applying correct and dense constraints on both sides of the surface, thereby enhancing the perceptual accuracy of the surface. Experiments suggest that our method achieves SOTA for both occupancy prediction and 3D scene reconstruction tasks on the nuScenes dataset.

Via

Access Paper or Ask Questions

Watch Your Head: Assembling Projection Heads to Save the Reliability of Federated Models

Feb 26, 2024

Jinqian Chen, Jihua Zhu, Qinghai Zheng, Zhongyu Li, Zhiqiang Tian

Abstract:Federated learning encounters substantial challenges with heterogeneous data, leading to performance degradation and convergence issues. While considerable progress has been achieved in mitigating such an impact, the reliability aspect of federated models has been largely disregarded. In this study, we conduct extensive experiments to investigate the reliability of both generic and personalized federated models. Our exploration uncovers a significant finding: \textbf{federated models exhibit unreliability when faced with heterogeneous data}, demonstrating poor calibration on in-distribution test data and low uncertainty levels on out-of-distribution data. This unreliability is primarily attributed to the presence of biased projection heads, which introduce miscalibration into the federated models. Inspired by this observation, we propose the "Assembled Projection Heads" (APH) method for enhancing the reliability of federated models. By treating the existing projection head parameters as priors, APH randomly samples multiple initialized parameters of projection heads from the prior and further performs targeted fine-tuning on locally available data under varying learning rates. Such a head ensemble introduces parameter diversity into the deterministic model, eliminating the bias and producing reliable predictions via head averaging. We evaluate the effectiveness of the proposed APH method across three prominent federated benchmarks. Experimental results validate the efficacy of APH in model calibration and uncertainty estimation. Notably, APH can be seamlessly integrated into various federated approaches but only requires less than 30\% additional computation cost for 100$\times$ inferences within large models.

* Accepted in AAAI-24

Via

Access Paper or Ask Questions

Contrastive Label Enhancement

May 16, 2023

Yifei Wang, Yiyang Zhou, Jihua Zhu, Xinyuan Liu, Wenbiao Yan, Zhiqiang Tian

Abstract:Label distribution learning (LDL) is a new machine learning paradigm for solving label ambiguity. Since it is difficult to directly obtain label distributions, many studies are focusing on how to recover label distributions from logical labels, dubbed label enhancement (LE). Existing LE methods estimate label distributions by simply building a mapping relationship between features and label distributions under the supervision of logical labels. They typically overlook the fact that both features and logical labels are descriptions of the instance from different views. Therefore, we propose a novel method called Contrastive Label Enhancement (ConLE) which integrates features and logical labels into the unified projection space to generate high-level features by contrastive learning strategy. In this approach, features and logical labels belonging to the same sample are pulled closer, while those of different samples are projected farther away from each other in the projection space. Subsequently, we leverage the obtained high-level features to gain label distributions through a welldesigned training strategy that considers the consistency of label attributes. Extensive experiments on LDL benchmark datasets demonstrate the effectiveness and superiority of our method.

* 9 pages, 4 figures, published to IJCAI2023

Via

Access Paper or Ask Questions

HD2Reg: Hierarchical Descriptors and Detectors for Point Cloud Registration

May 05, 2023

Canhui Tang, Yiheng Li, Shaoyi Du, Guofa Wang, Zhiqiang Tian

Abstract:Feature Descriptors and Detectors are two main components of feature-based point cloud registration. However, little attention has been drawn to the explicit representation of local and global semantics in the learning of descriptors and detectors. In this paper, we present a framework that explicitly extracts dual-level descriptors and detectors and performs coarse-to-fine matching with them. First, to explicitly learn local and global semantics, we propose a hierarchical contrastive learning strategy, training the robust matching ability of high-level descriptors, and refining the local feature space using low-level descriptors. Furthermore, we propose to learn dual-level saliency maps that extract two groups of keypoints in two different senses. To overcome the weak supervision of binary matchability labels, we propose a ranking strategy to label the significance ranking of keypoints, and thus provide more fine-grained supervision signals. Finally, we propose a global-to-local matching scheme to obtain robust and accurate correspondences by leveraging the complementary dual-level features.Quantitative experiments on 3DMatch and KITTI odometry datasets show that our method achieves robust and accurate point cloud registration and outperforms recent keypoint-based methods.

* Accepted by IEEE Intelligent Vehicles Symposium 2023 (IV 2023)

Via

Access Paper or Ask Questions

3DMNDT:3D multi-view registration method based on the normal distributions transform

Mar 20, 2021

Jihua Zhu, Di Wang, Jiaxi Mu, Huimin Lu, Zhiqiang Tian, Zhongyu Li

Figure 1 for 3DMNDT:3D multi-view registration method based on the normal distributions transform

Figure 2 for 3DMNDT:3D multi-view registration method based on the normal distributions transform

Figure 3 for 3DMNDT:3D multi-view registration method based on the normal distributions transform

Figure 4 for 3DMNDT:3D multi-view registration method based on the normal distributions transform

Abstract:The normal distributions transform (NDT) is an effective paradigm for the point set registration. This method is originally designed for pair-wise registration and it will suffer from great challenges when applied to multi-view registration. Under the NDT framework, this paper proposes a novel multi-view registration method, named 3D multi-view registration based on the normal distributions transform (3DMNDT), which integrates the K-means clustering and Lie algebra solver to achieve multi-view registration. More specifically, the multi-view registration is cast into the problem of maximum likelihood estimation. Then, the K-means algorithm is utilized to divide all data points into different clusters, where a normal distribution is computed to locally models the probability of measuring a data point in each cluster. Subsequently, the registration problem is formulated by the NDT-based likelihood function. To maximize this likelihood function, the Lie algebra solver is developed to sequentially optimize each rigid transformation. The proposed method alternately implements data point clustering, NDT computing, and likelihood maximization until desired registration results are obtained. Experimental results tested on benchmark data sets illustrate that the proposed method can achieve state-of-the-art performance for multi-view registration.

Via

Access Paper or Ask Questions

Effective multi-view registration of point sets based on student's t mixture model

Dec 13, 2020

Yanlin Ma, Jihua Zhu, Zhongyu Li, Zhiqiang Tian, Yaochen Li

Figure 1 for Effective multi-view registration of point sets based on student's t mixture model

Figure 2 for Effective multi-view registration of point sets based on student's t mixture model

Figure 3 for Effective multi-view registration of point sets based on student's t mixture model

Figure 4 for Effective multi-view registration of point sets based on student's t mixture model

Abstract:Recently, Expectation-maximization (EM) algorithm has been introduced as an effective means to solve multi-view registration problem. Most of the previous methods assume that each data point is drawn from the Gaussian Mixture Model (GMM), which is difficult to deal with the noise with heavy-tail or outliers. Accordingly, this paper proposed an effective registration method based on Student's t Mixture Model (StMM). More specially, we assume that each data point is drawn from one unique StMM, where its nearest neighbors (NNs) in other point sets are regarded as the t-distribution centroids with equal covariances, membership probabilities, and fixed degrees of freedom. Based on this assumption, the multi-view registration problem is formulated into the maximization of the likelihood function including all rigid transformations. Subsequently, the EM algorithm is utilized to optimize rigid transformations as well as the only t-distribution covariance for multi-view registration. Since only a few model parameters require to be optimized, the proposed method is more likely to obtain the desired registration results. Besides, all t-distribution centroids can be obtained by the NN search method, it is very efficient to achieve multi-view registration. What's more, the t-distribution takes the noise with heavy-tail into consideration, which makes the proposed method be inherently robust to noises and outliers. Experimental results tested on benchmark data sets illustrate its superior performance on robustness and accuracy over state-of-the-art methods.

* 11pages, 6 figures

Via

Access Paper or Ask Questions

Multi-view Subspace Clustering Networks with Local and Global Graph Information

Oct 24, 2020

Qinghai Zheng, Jihua Zhu, Yuanyuan Ma, Zhongyu Li, Zhiqiang Tian

Figure 1 for Multi-view Subspace Clustering Networks with Local and Global Graph Information

Figure 2 for Multi-view Subspace Clustering Networks with Local and Global Graph Information

Figure 3 for Multi-view Subspace Clustering Networks with Local and Global Graph Information

Figure 4 for Multi-view Subspace Clustering Networks with Local and Global Graph Information

Abstract:This study investigates the problem of multi-view subspace clustering, the goal of which is to explore the underlying grouping structure of data collected from different fields or measurements. Since data do not always comply with the linear subspace models in many real-world applications, most existing multi-view subspace clustering methods that based on the shallow linear subspace models may fail in practice. Furthermore, underlying graph information of multi-view data is always ignored in most existing multi-view subspace clustering methods. To address aforementioned limitations, we proposed the novel multi-view subspace clustering networks with local and global graph information, termed MSCNLG, in this paper. Specifically, autoencoder networks are employed on multiple views to achieve latent smooth representations that are suitable for the linear assumption. Simultaneously, by integrating fused multi-view graph information into self-expressive layers, the proposed MSCNLG obtains the common shared multi-view subspace representation, which can be used to get clustering results by employing the standard spectral clustering algorithm. As an end-to-end trainable framework, the proposed method fully investigates the valuable information of multiple views. Comprehensive experiments on six benchmark datasets validate the effectiveness and superiority of the proposed MSCNLG.

Via

Access Paper or Ask Questions

REGNet: REgion-based Grasp Network for Single-shot Grasp Detection in Point Clouds

Mar 03, 2020

Binglei Zhao, Hanbo Zhang, Xuguang Lan, Haoyu Wang, Zhiqiang Tian, Nanning Zheng

Figure 1 for REGNet: REgion-based Grasp Network for Single-shot Grasp Detection in Point Clouds

Figure 2 for REGNet: REgion-based Grasp Network for Single-shot Grasp Detection in Point Clouds

Figure 3 for REGNet: REgion-based Grasp Network for Single-shot Grasp Detection in Point Clouds

Figure 4 for REGNet: REgion-based Grasp Network for Single-shot Grasp Detection in Point Clouds

Abstract:Learning a robust representation of robotic grasping from point clouds is a crucial but challenging task. In this paper, we propose an end-to-end single-shot grasp detection network taking one single-view point cloud as input for parallel grippers. Our network includes three stages: Score Network (SN), Grasp Region Network (GRN) and Refine Network (RN). Specifically, SN is designed to select positive points with high grasp confidence. GRN coarsely generates a set of grasp proposals on selected positive points. Finally, RN refines the detected grasps based on local grasp features. To further improve the performance, we propose a grasp anchor mechanism, in which grasp anchors are introduced to generate grasp proposal. Moreover, we contribute a large-scale grasp dataset without manual annotation based on the YCB dataset. Experiments show that our method significantly outperforms several successful point-cloud based grasp detection methods including GPD, PointnetGPD, as well as S$^4$G.

Via

Access Paper or Ask Questions

Visual Space Optimization for Zero-shot Learning

Jun 30, 2019

Xinsheng Wang, Shanmin Pang, Jihua Zhu, Zhongyu Li, Zhiqiang Tian, Yaochen Li

Figure 1 for Visual Space Optimization for Zero-shot Learning

Figure 2 for Visual Space Optimization for Zero-shot Learning

Figure 3 for Visual Space Optimization for Zero-shot Learning

Figure 4 for Visual Space Optimization for Zero-shot Learning

Abstract:Zero-shot learning, which aims to recognize new categories that are not included in the training set, has gained popularity owing to its potential ability in the real-word applications. Zero-shot learning models rely on learning an embedding space, where both semantic descriptions of classes and visual features of instances can be embedded for nearest neighbor search. Recently, most of the existing works consider the visual space formulated by deep visual features as an ideal choice of the embedding space. However, the discrete distribution of instances in the visual space makes the data structure unremarkable. We argue that optimizing the visual space is crucial as it allows semantic vectors to be embedded into the visual space more effectively. In this work, we propose two strategies to accomplish this purpose. One is the visual prototype based method, which learns a visual prototype for each visual class, so that, in the visual space, a class can be represented by a prototype feature instead of a series of discrete visual features. The other is to optimize the visual feature structure in an intermediate embedding space, and in this method we successfully devise a multilayer perceptron framework based algorithm that is able to learn the common intermediate embedding space and meanwhile to make the visual data structure more distinctive. Through extensive experimental evaluation on four benchmark datasets, we demonstrate that optimizing visual space is beneficial for zero-shot learning. Besides, the proposed prototype based method achieves the new state-of-the-art performance.

Via

Access Paper or Ask Questions