Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zongben Xu

GMapLatent: Geometric Mapping in Latent Space

Mar 30, 2025

Wei Zeng, Xuebin Chang, Jianghao Su, Xiang Gu, Jian Sun, Zongben Xu

Abstract:Cross-domain generative models based on encoder-decoder AI architectures have attracted much attention in generating realistic images, where domain alignment is crucial for generation accuracy. Domain alignment methods usually deal directly with the initial distribution; however, mismatched or mixed clusters can lead to mode collapse and mixture problems in the decoder, compromising model generalization capabilities. In this work, we innovate a cross-domain alignment and generation model that introduces a canonical latent space representation based on geometric mapping to align the cross-domain latent spaces in a rigorous and precise manner, thus avoiding mode collapse and mixture in the encoder-decoder generation architectures. We name this model GMapLatent. The core of the method is to seamlessly align latent spaces with strict cluster correspondence constraints using the canonical parameterizations of cluster-decorated latent spaces. We first (1) transform the latent space to a canonical parameter domain by composing barycenter translation, optimal transport merging and constrained harmonic mapping, and then (2) compute geometric registration with cluster constraints over the canonical parameter domains. This process realizes a bijective (one-to-one and onto) mapping between newly transformed latent spaces and generates a precise alignment of cluster pairs. Cross-domain generation is then achieved through the aligned latent spaces embedded in the encoder-decoder pipeline. Experiments on gray-scale and color images validate the efficiency, efficacy and applicability of GMapLatent, and demonstrate that the proposed model has superior performance over existing models.

Via

Access Paper or Ask Questions

TelOps: AI-driven Operations and Maintenance for Telecommunication Networks

Dec 06, 2024

Yuqian Yang, Shusen Yang, Cong Zhao, Zongben Xu

Abstract:Telecommunication Networks (TNs) have become the most important infrastructure for data communications over the last century. Operations and maintenance (O&M) is extremely important to ensure the availability, effectiveness, and efficiency of TN communications. Different from the popular O&M technique for IT systems (e.g., the cloud), artificial intelligence for IT Operations (AIOps), O&M for TNs meets the following three fundamental challenges: topological dependence of network components, highly heterogeneous software, and restricted failure data. This article presents TelOps, the first AI-driven O&M framework for TNs, systematically enhanced with mechanism, data, and empirical knowledge. We provide a comprehensive comparison between TelOps and AIOps, and conduct a proof-of-concept case study on a typical O&M task (failure diagnosis) for a real industrial TN. As the first systematic AI-driven O&M framework for TNs, TelOps opens a new door to applying AI techniques to TN automation.

* IEEE Communications Magazine ( Volume: 62, Issue: 4, April 2024)
* 7 pages, 4 figures, magazine

Via

Access Paper or Ask Questions

Review of Mathematical Optimization in Federated Learning

Dec 02, 2024

Shusen Yang, Fangyuan Zhao, Zihao Zhou, Liang Shi, Xuebin Ren, Zongben Xu

Abstract:Federated Learning (FL) has been becoming a popular interdisciplinary research area in both applied mathematics and information sciences. Mathematically, FL aims to collaboratively optimize aggregate objective functions over distributed datasets while satisfying a variety of privacy and system constraints.Different from conventional distributed optimization methods, FL needs to address several specific issues (e.g., non-i.i.d. data distributions and differential private noises), which pose a set of new challenges in the problem formulation, algorithm design, and convergence analysis. In this paper, we will systematically review existing FL optimization research including their assumptions, formulations, methods, and theoretical results. Potential future directions are also discussed.

* To appear in CSIAM Transactions on Applied Mathematics (CSIAM-AM)

Via

Access Paper or Ask Questions

Adversarial Reweighting with $α$-Power Maximization for Domain Adaptation

Apr 26, 2024

Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu

Abstract:The practical Domain Adaptation (DA) tasks, e.g., Partial DA (PDA), open-set DA, universal DA, and test-time adaptation, have gained increasing attention in the machine learning community. In this paper, we propose a novel approach, dubbed Adversarial Reweighting with $\alpha$-Power Maximization (ARPM), for PDA where the source domain contains private classes absent in target domain. In ARPM, we propose a novel adversarial reweighting model that adversarially learns to reweight source domain data to identify source-private class samples by assigning smaller weights to them, for mitigating potential negative transfer. Based on the adversarial reweighting, we train the transferable recognition model on the reweighted source distribution to be able to classify common class data. To reduce the prediction uncertainty of the recognition model on the target domain for PDA, we present an $\alpha$-power maximization mechanism in ARPM, which enriches the family of losses for reducing the prediction uncertainty for PDA. Extensive experimental results on five PDA benchmarks, i.e., Office-31, Office-Home, VisDA-2017, ImageNet-Caltech, and DomainNet, show that our method is superior to recent PDA methods. Ablation studies also confirm the effectiveness of components in our approach. To theoretically analyze our method, we deduce an upper bound of target domain expected error for PDA, which is approximately minimized in our approach. We further extend ARPM to open-set DA, universal DA, and test time adaptation, and verify the usefulness through experiments.

* To appear in IJCV

Via

Access Paper or Ask Questions

Learning-based Multi-continuum Model for Multiscale Flow Problems

Mar 21, 2024

Fan Wang, Yating Wang, Wing Tat Leung, Zongben Xu

Abstract:Multiscale problems can usually be approximated through numerical homogenization by an equation with some effective parameters that can capture the macroscopic behavior of the original system on the coarse grid to speed up the simulation. However, this approach usually assumes scale separation and that the heterogeneity of the solution can be approximated by the solution average in each coarse block. For complex multiscale problems, the computed single effective properties/continuum might be inadequate. In this paper, we propose a novel learning-based multi-continuum model to enrich the homogenized equation and improve the accuracy of the single continuum model for multiscale problems with some given data. Without loss of generalization, we consider a two-continuum case. The first flow equation keeps the information of the original homogenized equation with an additional interaction term. The second continuum is newly introduced, and the effective permeability in the second flow equation is determined by a neural network. The interaction term between the two continua aligns with that used in the Dual-porosity model but with a learnable coefficient determined by another neural network. The new model with neural network terms is then optimized using trusted data. We discuss both direct back-propagation and the adjoint method for the PDE-constraint optimization problem. Our proposed learning-based multi-continuum model can resolve multiple interacted media within each coarse grid block and describe the mass transfer among them, and it has been demonstrated to significantly improve the simulation results through numerical experiments involving both linear and nonlinear flow equations.

Via

Access Paper or Ask Questions

TRG-Net: An Interpretable and Controllable Rain Generator

Mar 15, 2024

Zhiqiang Pang, Hong Wang, Qi Xie, Deyu Meng, Zongben Xu

Abstract:Exploring and modeling rain generation mechanism is critical for augmenting paired data to ease training of rainy image processing models. Against this task, this study proposes a novel deep learning based rain generator, which fully takes the physical generation mechanism underlying rains into consideration and well encodes the learning of the fundamental rain factors (i.e., shape, orientation, length, width and sparsity) explicitly into the deep network. Its significance lies in that the generator not only elaborately design essential elements of the rain to simulate expected rains, like conventional artificial strategies, but also finely adapt to complicated and diverse practical rainy images, like deep learning methods. By rationally adopting filter parameterization technique, we first time achieve a deep network that is finely controllable with respect to rain factors and able to learn the distribution of these factors purely from data. Our unpaired generation experiments demonstrate that the rain generated by the proposed rain generator is not only of higher quality, but also more effective for deraining and downstream tasks compared to current state-of-the-art rain generation methods. Besides, the paired data augmentation experiments, including both in-distribution and out-of-distribution (OOD), further validate the diversity of samples generated by our model for in-distribution deraining and OOD generalization tasks.

Via

Access Paper or Ask Questions

Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration

Dec 25, 2023

Jiahong Fu, Qi Xie, Deyu Meng, Zongben Xu

Figure 1 for Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration

Figure 2 for Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration

Figure 3 for Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration

Figure 4 for Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration

Abstract:The deep unfolding approach has attracted significant attention in computer vision tasks, which well connects conventional image processing modeling manners with more recent deep learning techniques. Specifically, by establishing a direct correspondence between algorithm operators at each implementation step and network modules within each layer, one can rationally construct an almost ``white box'' network architecture with high interpretability. In this architecture, only the predefined component of the proximal operator, known as a proximal network, needs manual configuration, enabling the network to automatically extract intrinsic image priors in a data-driven manner. In current deep unfolding methods, such a proximal network is generally designed as a CNN architecture, whose necessity has been proven by a recent theory. That is, CNN structure substantially delivers the translational invariant image prior, which is the most universally possessed structural prior across various types of images. However, standard CNN-based proximal networks have essential limitations in capturing the rotation symmetry prior, another universal structural prior underlying general images. This leaves a large room for further performance improvement in deep unfolding approaches. To address this issue, this study makes efforts to suggest a high-accuracy rotation equivariant proximal network that effectively embeds rotation symmetry priors into the deep unfolding framework. Especially, we deduce, for the first time, the theoretical equivariant error for such a designed proximal network with arbitrary layers under arbitrary rotation degrees. This analysis should be the most refined theoretical conclusion for such error evaluation to date and is also indispensable for supporting the rationale behind such networks with intrinsic interpretability requirements.

Via

Access Paper or Ask Questions

Optimal Transport-Guided Conditional Score-Based Diffusion Models

Nov 02, 2023

Xiang Gu, Liwei Yang, Jian Sun, Zongben Xu

Figure 1 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Figure 2 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Figure 3 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Figure 4 for Optimal Transport-Guided Conditional Score-Based Diffusion Models

Abstract:Conditional score-based diffusion model (SBDM) is for conditional generation of target data with paired data as condition, and has achieved great success in image translation. However, it requires the paired data as condition, and there would be insufficient paired data provided in real-world applications. To tackle the applications with partially paired or even unpaired dataset, we propose a novel Optimal Transport-guided Conditional Score-based diffusion model (OTCS) in this paper. We build the coupling relationship for the unpaired or partially paired dataset based on $L_2$-regularized unsupervised or semi-supervised optimal transport, respectively. Based on the coupling relationship, we develop the objective for training the conditional score-based model for unpaired or partially paired settings, which is based on a reformulation and generalization of the conditional SBDM for paired setting. With the estimated coupling relationship, we effectively train the conditional score-based model by designing a ``resampling-by-compatibility'' strategy to choose the sampled data with high compatibility as guidance. Extensive experiments on unpaired super-resolution and semi-paired image-to-image translation demonstrated the effectiveness of the proposed OTCS model. From the viewpoint of optimal transport, OTCS provides an approach to transport data across distributions, which is a challenge for OT on large-scale datasets. We theoretically prove that OTCS realizes the data transport in OT with a theoretical bound. Code is available at \url{https://github.com/XJTU-XGU/OTCS}.

* Accepted in NeurIPS 2023

Via

Access Paper or Ask Questions

Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Aug 31, 2023

Cuican Yu, Guansong Lu, Yihan Zeng, Jian Sun, Xiaodan Liang, Huibin Li, Zongben Xu, Songcen Xu, Wei Zhang, Hang Xu

Figure 1 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Figure 2 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Figure 3 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Figure 4 for Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

Abstract:Generating 3D faces from textual descriptions has a multitude of applications, such as gaming, movie, and robotics. Recent progresses have demonstrated the success of unconditional 3D face generation and text-to-3D shape generation. However, due to the limited text-3D face data pairs, text-driven 3D face generation remains an open problem. In this paper, we propose a text-guided 3D faces generation method, refer as TG-3DFace, for generating realistic 3D faces using text guidance. Specifically, we adopt an unconditional 3D face generation framework and equip it with text conditions, which learns the text-guided 3D face generation with only text-2D face data. On top of that, we propose two text-to-face cross-modal alignment techniques, including the global contrastive learning and the fine-grained alignment module, to facilitate high semantic consistency between generated 3D faces and input texts. Besides, we present directional classifier guidance during the inference process, which encourages creativity for out-of-domain generations. Compared to the existing methods, TG-3DFace creates more realistic and aesthetically pleasing 3D faces, boosting 9% multi-view consistency (MVIC) over Latent3D. The rendered face images generated by TG-3DFace achieve higher FID and CLIP score than text-to-2D face/image generation models, demonstrating our superiority in generating realistic and semantic-consistent textures.

* accepted by ICCV 2023

Via

Access Paper or Ask Questions

Multichannel Frequency Estimation in Challenging Scenarios via Structured Matrix Embedding and Recovery (StruMER)

Jul 15, 2023

Xunmeng Wu, Zai Yang, Zongben Xu

Abstract:Multichannel frequency estimation with incomplete data and miscellaneous noises arises in array signal processing, modal analysis, wireless communications, and so on. In this paper, we consider maximum-likelihood(-like) optimization methods for frequency estimation in which proper objective functions are adopted subject to observed data patterns and noise types. We propose a universal signal-domain approach to solve the optimization problems by embedding the noiseless multichannel signal of interest into a series of low-rank positive-semidefinite block matrices of Hankel and Toeplitz submatrices and formulating the original parameter-domain optimization problems as equivalent structured matrix recovery problems. The alternating direction method of multipliers (ADMM) is applied to solve the resulting matrix recovery problems in which both subproblems of ADMM are solved in (nearly) closed form. The proposed approach is termed as structured matrix embedding and recovery (StruMER). Extensive numerical simulations are provided to demonstrate that StruMER has improved threshold performances in various challenging scenarios, e.g., limited data, low signal-to-noise ratio, impulsive noise, and closely spaced frequencies, as compared with state-of-the-art methods.

Via

Access Paper or Ask Questions