Abstract:In this work, we address the limitations of denoising diffusion models (DDMs) in image restoration tasks, particularly the shape and color distortions that can compromise image quality. While DDMs have demonstrated a promising performance in many applications such as text-to-image synthesis, their effectiveness in image restoration is often hindered by shape and color distortions. We observe that these issues arise from inconsistencies between the training and testing data used by DDMs. Based on our observation, we propose a novel training method, named data-consistent training, which allows the DDMs to access images with accumulated errors during training, thereby ensuring the model to learn to correct these errors. Experimental results show that, across five image restoration tasks, our method has significant improvements over state-of-the-art methods while effectively minimizing distortions and preserving image fidelity.
Abstract:Federated learning is a distributed learning framework which enables clients to train models individually and to upload their model updates for aggregation. The local training process heavily relies on distributed gradient descent techniques. In the situation where gradient information is not available, the gradients need to be estimated from zeroth-order information, which typically involves computing finite-differences along isotropic random directions. This method suffers from high estimation errors, as the geometric features of the objective landscape may be overlooked during the isotropic sampling. In this work, we propose a non-isotropic sampling method to improve the gradient estimation procedure. Gradients in our method are estimated in a subspace spanned by historical trajectories of solutions, aiming to encourage the exploration of promising regions and hence improve the convergence. We implement this method in zeroth-order federated settings, and show that the convergence rate aligns with existing ones while introducing no significant overheads in communication or local computation. The effectiveness of our proposal is verified on several numerical experiments in comparison to several commonly-used zeroth-order federated optimization algorithms.
Abstract:Machine Translation of Culture-Specific Items (CSIs) poses significant challenges. Recent work on CSI translation has shown some success using Large Language Models (LLMs) to adapt to different languages and cultures; however, a deeper analysis is needed to examine the benefits and pitfalls of each method. In this paper, we introduce the ChineseMenuCSI dataset, the largest for Chinese-English menu corpora, annotated with CSI vs Non-CSI labels and a fine-grained test set. We define three levels of CSI figurativeness for a more nuanced analysis and develop a novel methodology for automatic CSI identification, which outperforms GPT-based prompts in most categories. Importantly, we are the first to integrate human translation theories into LLM-driven translation processes, significantly improving translation accuracy, with COMET scores increasing by up to 7 points.
Abstract:The global positioning system (GPS) has become an indispensable navigation method for field operations with unmanned surface vehicles (USVs) in marine environments. However, GPS may not always be available outdoors because it is vulnerable to natural interference and malicious jamming attacks. Thus, an alternative navigation system is required when the use of GPS is restricted or prohibited. To this end, we present a novel method that utilizes an Unmanned Aerial Vehicle (UAV) to assist in localizing USVs in GNSS-restricted marine environments. In our approach, the UAV flies along the shoreline at a consistent altitude, continuously tracking and detecting the USV using a deep learning-based approach on camera images. Subsequently, triangulation techniques are applied to estimate the USV's position relative to the UAV, utilizing geometric information and datalink range from the UAV. We propose adjusting the UAV's camera angle based on the pixel error between the USV and the image center throughout the localization process to enhance accuracy. Additionally, visual measurements are integrated into an Extended Kalman Filter (EKF) for robust state estimation. To validate our proposed method, we utilize a USV equipped with onboard sensors and a UAV equipped with a camera. A heterogeneous robotic interface is established to facilitate communication between the USV and UAV. We demonstrate the efficacy of our approach through a series of experiments conducted during the ``Muhammad Bin Zayed International Robotic Challenge (MBZIRC-2024)'' in real marine environments, incorporating noisy measurements and ocean disturbances. The successful outcomes indicate the potential of our method to complement GPS for USV navigation.
Abstract:We present a benchmark for methods in causal learning. Specifically, we consider training a rich class of causal models from time-series data, and we suggest the use of the Krebs cycle and models of metabolism more broadly.
Abstract:Clustering of time series is a well-studied problem, with applications ranging from quantitative, personalized models of metabolism obtained from metabolite concentrations to state discrimination in quantum information theory. We consider a variant, where given a set of trajectories and a number of parts, we jointly partition the set of trajectories and learn linear dynamical system (LDS) models for each part, so as to minimize the maximum error across all the models. We present globally convergent methods and EM heuristics, accompanied by promising computational results.
Abstract:This work concerns the evolutionary approaches to distributed stochastic black-box optimization, in which each worker can individually solve an approximation of the problem with nature-inspired algorithms. We propose a distributed evolution strategy (DES) algorithm grounded on a proper modification to evolution strategies, a family of classic evolutionary algorithms, as well as a careful combination with existing distributed frameworks. On smooth and nonconvex landscapes, DES has a convergence rate competitive to existing zeroth-order methods, and can exploit the sparsity, if applicable, to match the rate of first-order methods. The DES method uses a Gaussian probability model to guide the search and avoids the numerical issue resulted from finite-difference techniques in existing zeroth-order methods. The DES method is also fully adaptive to the problem landscape, as its convergence is guaranteed with any parameter setting. We further propose two alternative sampling schemes which significantly improve the sampling efficiency while leading to similar performance. Simulation studies on several machine learning problems suggest that the proposed methods show much promise in reducing the convergence time and improving the robustness to parameter settings.
Abstract:This work provides an efficient sampling method for the covariance matrix adaptation evolution strategy (CMA-ES) in large-scale settings. In contract to the Gaussian sampling in CMA-ES, the proposed method generates mutation vectors from a mixture model, which facilitates exploiting the rich variable correlations of the problem landscape within a limited time budget. We analyze the probability distribution of this mixture model and show that it approximates the Gaussian distribution of CMA-ES with a controllable accuracy. We use this sampling method, coupled with a novel method for mutation strength adaptation, to formulate the mixture model based evolution strategy (MMES) -- a CMA-ES variant for large-scale optimization. The numerical simulations show that, while significantly reducing the time complexity of CMA-ES, MMES preserves the rotational invariance, is scalable to high dimensional problems, and is competitive against the state-of-the-arts in performing global optimization.
Abstract:Federated learning allows multiple participants to collaboratively train an efficient model without exposing data privacy. However, this distributed machine learning training method is prone to attacks from Byzantine clients, which interfere with the training of the global model by modifying the model or uploading the false gradient. In this paper, we propose a novel serverless federated learning framework Committee Mechanism based Federated Learning (CMFL), which can ensure the robustness of the algorithm with convergence guarantee. In CMFL, a committee system is set up to screen the uploaded local gradients. The committee system selects the local gradients rated by the elected members for the aggregation procedure through the selection strategy, and replaces the committee member through the election strategy. Based on the different considerations of model performance and defense, two opposite selection strategies are designed for the sake of both accuracy and robustness. Extensive experiments illustrate that CMFL achieves faster convergence and better accuracy than the typical Federated Learning, in the meanwhile obtaining better robustness than the traditional Byzantine-tolerant algorithms, in the manner of a decentralized approach. In addition, we theoretically analyze and prove the convergence of CMFL under different election and selection strategies, which coincides with the experimental results.
Abstract:Unwanted nonlinear gamma distortion frequently occurs in a great diversity of images during the procedures of image acquisition, processing, and/or display. And the gamma distortion often varies with capture setup change and luminance variation. Blind inverse gamma correction, which automatically determines a proper restoration gamma value from a given image, is of paramount importance to attenuate the distortion. For blind inverse gamma correction, an adaptive gamma transformation method (AGT-ME) is proposed directly from a maximized differential entropy model. And the corresponding optimization has a mathematical concise closed-form solution, resulting in efficient implementation and accurate gamma restoration of AGT-ME. Considering the human eye has a non-linear perception sensitivity, a modified version AGT-ME-VISUAL is also proposed to achieve better visual performance. Tested on variable datasets, AGT-ME could obtain an accurate estimation of a large range of gamma distortion (0.1 to 3.0), outperforming the state-of-the-art methods. Besides, the proposed AGT-ME and AGT-ME-VISUAL were applied to three typical applications, including automatic gamma adjustment, natural/medical image contrast enhancement, and fringe projection profilometry image restoration. Furthermore, the AGT-ME/ AGT-ME-VISUAL is general and can be seamlessly extended to the masked image, multi-channel (color or spectrum) image, or multi-frame video, and free of the arbitrary tuning parameter. Besides, the corresponding Python code (https://github.com/yongleex/AGT-ME) is also provided for interested users.