Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tianyou Chai

CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering

May 02, 2025

Zhe Zhang, Mingxiu Cai, Hanxiao Wang, Gaochang Wu, Tianyou Chai, Xiatian Zhu

Figure 1 for CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering

Figure 2 for CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering

Figure 3 for CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering

Figure 4 for CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering

Abstract:Unsupervised anomaly detection (UAD) seeks to localize the anomaly mask of an input image with respect to normal samples. Either by reconstructing normal counterparts (reconstruction-based) or by learning an image feature embedding space (embedding-based), existing approaches fundamentally rely on image-level or feature-level matching to derive anomaly scores. Often, such a matching process is inaccurate yet overlooked, leading to sub-optimal detection. To address this issue, we introduce the concept of cost filtering, borrowed from classical matching tasks, such as depth and flow estimation, into the UAD problem. We call this approach {\em CostFilter-AD}. Specifically, we first construct a matching cost volume between the input and normal samples, comprising two spatial dimensions and one matching dimension that encodes potential matches. To refine this, we propose a cost volume filtering network, guided by the input observation as an attention query across multiple feature layers, which effectively suppresses matching noise while preserving edge structures and capturing subtle anomalies. Designed as a generic post-processing plug-in, CostFilter-AD can be integrated with either reconstruction-based or embedding-based methods. Extensive experiments on MVTec-AD and VisA benchmarks validate the generic benefits of CostFilter-AD for both single- and multi-class UAD tasks. Code and models will be released at https://github.com/ZHE-SAPI/CostFilter-AD.

* 20 pages, 11 figures, 10 tables, accepted by Forty-Second International Conference on Machine Learning ( ICML 2025 )

Via

Access Paper or Ask Questions

One-Point Sampling for Distributed Bandit Convex Optimization with Time-Varying Constraints

Apr 24, 2025

Kunpeng Zhang, Lei Xu, Xinlei Yi, Guanghui Wen, Lihua Xie, Tianyou Chai, Tao Yang

Abstract:This paper considers the distributed bandit convex optimization problem with time-varying constraints. In this problem, the global loss function is the average of all the local convex loss functions, which are unknown beforehand. Each agent iteratively makes its own decision subject to time-varying inequality constraints which can be violated but are fulfilled in the long run. For a uniformly jointly strongly connected time-varying directed graph, a distributed bandit online primal-dual projection algorithm with one-point sampling is proposed. We show that sublinear dynamic network regret and network cumulative constraint violation are achieved if the path-length of the benchmark also increases in a sublinear manner. In addition, an $\mathcal{O}({T^{3/4 + g}})$ static network regret bound and an $\mathcal{O}( {{T^{1 - {g}/2}}} )$ network cumulative constraint violation bound are established, where $T$ is the total number of iterations and $g \in ( {0,1/4} )$ is a trade-off parameter. Moreover, a reduced static network regret bound $\mathcal{O}( {{T^{2/3 + 4g /3}}} )$ is established for strongly convex local loss functions. Finally, a numerical example is presented to validate the theoretical results.

* 15 pages, 3 figures

Via

Access Paper or Ask Questions

Cross-Modal Learning for Anomaly Detection in Fused Magnesium Smelting Process: Methodology and Benchmark

Jun 13, 2024

Gaochang Wu, Yapeng Zhang, Lan Deng, Jingxin Zhang, Tianyou Chai

Abstract:Fused Magnesium Furnace (FMF) is a crucial industrial equipment in the production of magnesia, and anomaly detection plays a pivotal role in ensuring its efficient, stable, and secure operation. Existing anomaly detection methods primarily focus on analyzing dominant anomalies using the process variables (such as arc current) or constructing neural networks based on abnormal visual features, while overlooking the intrinsic correlation of cross-modal information. This paper proposes a cross-modal Transformer (dubbed FmFormer), designed to facilitate anomaly detection in fused magnesium smelting processes by exploring the correlation between visual features (video) and process variables (current). Our approach introduces a novel tokenization paradigm to effectively bridge the substantial dimensionality gap between the 3D video modality and the 1D current modality in a multiscale manner, enabling a hierarchical reconstruction of pixel-level anomaly detection. Subsequently, the FmFormer leverages self-attention to learn internal features within each modality and bidirectional cross-attention to capture correlations across modalities. To validate the effectiveness of the proposed method, we also present a pioneering cross-modal benchmark of the fused magnesium smelting process, featuring synchronously acquired video and current data for over 2.2 million samples. Leveraging cross-modal learning, the proposed FmFormer achieves state-of-the-art performance in detecting anomalies, particularly under extreme interferences such as current fluctuations and visual occlusion caused by heavy water mist. The presented methodology and benchmark may be applicable to other industrial applications with some amendments. The benchmark will be released at https://github.com/GaochangWu/FMF-Benchmark.

* 14 pages, 6 figures, 5 tables. Submitted to IEEE

Via

Access Paper or Ask Questions

DA-STC: Domain Adaptive Video Semantic Segmentation via Spatio-Temporal Consistency

Nov 22, 2023

Zhe Zhang, Gaochang Wu, Jing Zhang, Chunhua Shen, Dacheng Tao, Tianyou Chai

Figure 1 for DA-STC: Domain Adaptive Video Semantic Segmentation via Spatio-Temporal Consistency

Figure 2 for DA-STC: Domain Adaptive Video Semantic Segmentation via Spatio-Temporal Consistency

Figure 3 for DA-STC: Domain Adaptive Video Semantic Segmentation via Spatio-Temporal Consistency

Figure 4 for DA-STC: Domain Adaptive Video Semantic Segmentation via Spatio-Temporal Consistency

Abstract:Video semantic segmentation is a pivotal aspect of video representation learning. However, significant domain shifts present a challenge in effectively learning invariant spatio-temporal features across the labeled source domain and unlabeled target domain for video semantic segmentation. To solve the challenge, we propose a novel DA-STC method for domain adaptive video semantic segmentation, which incorporates a bidirectional multi-level spatio-temporal fusion module and a category-aware spatio-temporal feature alignment module to facilitate consistent learning for domain-invariant features. Firstly, we perform bidirectional spatio-temporal fusion at the image sequence level and shallow feature level, leading to the construction of two fused intermediate video domains. This prompts the video semantic segmentation model to consistently learn spatio-temporal features of shared patch sequences which are influenced by domain-specific contexts, thereby mitigating the feature gap between the source and target domain. Secondly, we propose a category-aware feature alignment module to promote the consistency of spatio-temporal features, facilitating adaptation to the target domain. Specifically, we adaptively aggregate the domain-specific deep features of each category along spatio-temporal dimensions, which are further constrained to achieve cross-domain intra-class feature alignment and inter-class feature separation. Extensive experiments demonstrate the effectiveness of our method, which achieves state-of-the-art mIOUs on multiple challenging benchmarks. Furthermore, we extend the proposed DA-STC to the image domain, where it also exhibits superior performance for domain adaptive semantic segmentation. The source code and models will be made available at \url{https://github.com/ZHE-SAPI/DA-STC}.

* 18 pages,9 figures

Via

Access Paper or Ask Questions

Distributed Online Convex Optimization with Adversarial Constraints: Reduced Cumulative Constraint Violation Bounds under Slater's Condition

May 31, 2023

Xinlei Yi, Xiuxian Li, Tao Yang, Lihua Xie, Yiguang Hong, Tianyou Chai, Karl H. Johansson

Figure 1 for Distributed Online Convex Optimization with Adversarial Constraints: Reduced Cumulative Constraint Violation Bounds under Slater's Condition

Figure 2 for Distributed Online Convex Optimization with Adversarial Constraints: Reduced Cumulative Constraint Violation Bounds under Slater's Condition

Figure 3 for Distributed Online Convex Optimization with Adversarial Constraints: Reduced Cumulative Constraint Violation Bounds under Slater's Condition

Figure 4 for Distributed Online Convex Optimization with Adversarial Constraints: Reduced Cumulative Constraint Violation Bounds under Slater's Condition

Abstract:This paper considers distributed online convex optimization with adversarial constraints. In this setting, a network of agents makes decisions at each round, and then only a portion of the loss function and a coordinate block of the constraint function are privately revealed to each agent. The loss and constraint functions are convex and can vary arbitrarily across rounds. The agents collaborate to minimize network regret and cumulative constraint violation. A novel distributed online algorithm is proposed and it achieves an $\mathcal{O}(T^{\max\{c,1-c\}})$ network regret bound and an $\mathcal{O}(T^{1-c/2})$ network cumulative constraint violation bound, where $T$ is the number of rounds and $c\in(0,1)$ is a user-defined trade-off parameter. When Slater's condition holds (i.e, there is a point that strictly satisfies the inequality constraints), the network cumulative constraint violation bound is reduced to $\mathcal{O}(T^{1-c})$. Moreover, if the loss functions are strongly convex, then the network regret bound is reduced to $\mathcal{O}(\log(T))$, and the network cumulative constraint violation bound is reduced to $\mathcal{O}(\sqrt{\log(T)T})$ and $\mathcal{O}(\log(T))$ without and with Slater's condition, respectively. To the best of our knowledge, this paper is the first to achieve reduced (network) cumulative constraint violation bounds for (distributed) online convex optimization with adversarial constraints under Slater's condition. Finally, the theoretical results are verified through numerical simulations.

Via

Access Paper or Ask Questions

Data-Driven Inverse Reinforcement Learning for Expert-Learner Zero-Sum Games

Jan 05, 2023

Wenqian Xue, Bosen Lian, Jialu Fan, Tianyou Chai, Frank L. Lewis

Figure 1 for Data-Driven Inverse Reinforcement Learning for Expert-Learner Zero-Sum Games

Figure 2 for Data-Driven Inverse Reinforcement Learning for Expert-Learner Zero-Sum Games

Figure 3 for Data-Driven Inverse Reinforcement Learning for Expert-Learner Zero-Sum Games

Figure 4 for Data-Driven Inverse Reinforcement Learning for Expert-Learner Zero-Sum Games

Abstract:In this paper, we formulate inverse reinforcement learning (IRL) as an expert-learner interaction whereby the optimal performance intent of an expert or target agent is unknown to a learner agent. The learner observes the states and controls of the expert and hence seeks to reconstruct the expert's cost function intent and thus mimics the expert's optimal response. Next, we add non-cooperative disturbances that seek to disrupt the learning and stability of the learner agent. This leads to the formulation of a new interaction we call zero-sum game IRL. We develop a framework to solve the zero-sum game IRL problem that is a modified extension of RL policy iteration (PI) to allow unknown expert performance intentions to be computed and non-cooperative disturbances to be rejected. The framework has two parts: a value function and control action update based on an extension of PI, and a cost function update based on standard inverse optimal control. Then, we eventually develop an off-policy IRL algorithm that does not require knowledge of the expert and learner agent dynamics and performs single-loop learning. Rigorous proofs and analyses are given. Finally, simulation experiments are presented to show the effectiveness of the new approach.

* 9 pages, 3 figures

Via

Access Paper or Ask Questions

Geo-NI: Geometry-aware Neural Interpolation for Light Field Rendering

Jun 20, 2022

Gaochang Wu, Yuemei Zhou, Yebin Liu, Lu Fang, Tianyou Chai

Figure 1 for Geo-NI: Geometry-aware Neural Interpolation for Light Field Rendering

Figure 2 for Geo-NI: Geometry-aware Neural Interpolation for Light Field Rendering

Figure 3 for Geo-NI: Geometry-aware Neural Interpolation for Light Field Rendering

Figure 4 for Geo-NI: Geometry-aware Neural Interpolation for Light Field Rendering

Abstract:In this paper, we present a Geometry-aware Neural Interpolation (Geo-NI) framework for light field rendering. Previous learning-based approaches either rely on the capability of neural networks to perform direct interpolation, which we dubbed Neural Interpolation (NI), or explore scene geometry for novel view synthesis, also known as Depth Image-Based Rendering (DIBR). Instead, we incorporate the ideas behind these two kinds of approaches by launching the NI with a novel DIBR pipeline. Specifically, the proposed Geo-NI first performs NI using input light field sheared by a set of depth hypotheses. Then the DIBR is implemented by assigning the sheared light fields with a novel reconstruction cost volume according to the reconstruction quality under different depth hypotheses. The reconstruction cost is interpreted as a blending weight to render the final output light field by blending the reconstructed light fields along the dimension of depth hypothesis. By combining the superiorities of NI and DIBR, the proposed Geo-NI is able to render views with large disparity with the help of scene geometry while also reconstruct non-Lambertian effect when depth is prone to be ambiguous. Extensive experiments on various datasets demonstrate the superior performance of the proposed geometry-aware light field rendering framework.

* 13 pages, 8 figures, 4 tables

Via

Access Paper or Ask Questions

Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints

Jun 09, 2021

Xinlei Yi, Xiuxian Li, Tao Yang, Lihua Xie, Tianyou Chai, Karl H. Johansson

Figure 1 for Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints

Figure 2 for Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints

Figure 3 for Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints

Figure 4 for Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints

Abstract:This paper considers online convex optimization with long term constraints, where constraints can be violated in intermediate rounds, but need to be satisfied in the long run. The cumulative constraint violation is used as the metric to measure constraint violations, which excludes the situation that strictly feasible constraints can compensate the effects of violated constraints. A novel algorithm is first proposed and it achieves an $\mathcal{O}(T^{\max\{c,1-c\}})$ bound for static regret and an $\mathcal{O}(T^{(1-c)/2})$ bound for cumulative constraint violation, where $c\in(0,1)$ is a user-defined trade-off parameter, and thus has improved performance compared with existing results. Both static regret and cumulative constraint violation bounds are reduced to $\mathcal{O}(\log(T))$ when the loss functions are strongly convex, which also improves existing results. %In order to bound the regret with respect to any comparator sequence, In order to achieve the optimal regret with respect to any comparator sequence, another algorithm is then proposed and it achieves the optimal $\mathcal{O}(\sqrt{T(1+P_T)})$ regret and an $\mathcal{O}(\sqrt{T})$ cumulative constraint violation, where $P_T$ is the path-length of the comparator sequence. Finally, numerical simulations are provided to illustrate the effectiveness of the theoretical results.

Via

Access Paper or Ask Questions

Regret and Cumulative Constraint Violation Analysis for Distributed Online Constrained Convex Optimization

May 01, 2021

Xinlei Yi, Xiuxian Li, Tao Yang, Lihua Xie, Tianyou Chai, Karl H. Johansson

Figure 1 for Regret and Cumulative Constraint Violation Analysis for Distributed Online Constrained Convex Optimization

Figure 2 for Regret and Cumulative Constraint Violation Analysis for Distributed Online Constrained Convex Optimization

Figure 3 for Regret and Cumulative Constraint Violation Analysis for Distributed Online Constrained Convex Optimization

Abstract:This paper considers the distributed online convex optimization problem with time-varying constraints over a network of agents. This is a sequential decision making problem with two sequences of arbitrarily varying convex loss and constraint functions. At each round, each agent selects a decision from the decision set, and then only a portion of the loss function and a coordinate block of the constraint function at this round are privately revealed to this agent. The goal of the network is to minimize network regret and constraint violation. Two distributed online algorithms with full-information and bandit feedback are proposed. Both dynamic and static network regret bounds are analyzed for the proposed algorithms, and network cumulative constraint violation is used to measure constraint violation, which excludes the situation that strictly feasible constraints can compensate the effects of violated constraints. In particular, we show that the proposed algorithms achieve $\mathcal{O}(T^{\max\{\kappa,1-\kappa\}})$ static network regret and $\mathcal{O}(T^{1-\kappa/2})$ network cumulative constraint violation, where $T$ is the total number of rounds and $\kappa\in(0,1)$ is a user-defined trade-off parameter. Moreover, if the loss functions are strongly convex, then the static network regret bound can be reduced to $\mathcal{O}(T^{\kappa})$. Finally, numerical simulations are provided to illustrate the effectiveness of the theoretical results.

Via

Access Paper or Ask Questions

Revisiting Light Field Rendering with Deep Anti-Aliasing Neural Network

Apr 28, 2021

Gaochang Wu, Yebin Liu, Lu Fang, Tianyou Chai

Figure 1 for Revisiting Light Field Rendering with Deep Anti-Aliasing Neural Network

Figure 2 for Revisiting Light Field Rendering with Deep Anti-Aliasing Neural Network

Figure 3 for Revisiting Light Field Rendering with Deep Anti-Aliasing Neural Network

Figure 4 for Revisiting Light Field Rendering with Deep Anti-Aliasing Neural Network

Abstract:The light field (LF) reconstruction is mainly confronted with two challenges, large disparity and the non-Lambertian effect. Typical approaches either address the large disparity challenge using depth estimation followed by view synthesis or eschew explicit depth information to enable non-Lambertian rendering, but rarely solve both challenges in a unified framework. In this paper, we revisit the classic LF rendering framework to address both challenges by incorporating it with advanced deep learning techniques. First, we analytically show that the essential issue behind the large disparity and non-Lambertian challenges is the aliasing problem. Classic LF rendering approaches typically mitigate the aliasing with a reconstruction filter in the Fourier domain, which is, however, intractable to implement within a deep learning pipeline. Instead, we introduce an alternative framework to perform anti-aliasing reconstruction in the image domain and analytically show comparable efficacy on the aliasing issue. To explore the full potential, we then embed the anti-aliasing framework into a deep neural network through the design of an integrated architecture and trainable parameters. The network is trained through end-to-end optimization using a peculiar training set, including regular LFs and unstructured LFs. The proposed deep learning pipeline shows a substantial superiority in solving both the large disparity and the non-Lambertian challenges compared with other state-of-the-art approaches. In addition to the view interpolation for an LF, we also show that the proposed pipeline also benefits light field view extrapolation.

* IEEE TPAMI, 2021
* 15 pages, 12 figures. Accepted by IEEE TPAMI

Via

Access Paper or Ask Questions