Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kaixuan Zhang

High Dynamic Range Novel View Synthesis with Single Exposure

May 02, 2025

Kaixuan Zhang, Hu Wang, Minxian Li, Mingwu Ren, Mao Ye, Xiatian Zhu

Abstract:High Dynamic Range Novel View Synthesis (HDR-NVS) aims to establish a 3D scene HDR model from Low Dynamic Range (LDR) imagery. Typically, multiple-exposure LDR images are employed to capture a wider range of brightness levels in a scene, as a single LDR image cannot represent both the brightest and darkest regions simultaneously. While effective, this multiple-exposure HDR-NVS approach has significant limitations, including susceptibility to motion artifacts (e.g., ghosting and blurring), high capture and storage costs. To overcome these challenges, we introduce, for the first time, the single-exposure HDR-NVS problem, where only single exposure LDR images are available during training. We further introduce a novel approach, Mono-HDR-3D, featuring two dedicated modules formulated by the LDR image formation principles, one for converting LDR colors to HDR counterparts, and the other for transforming HDR images to LDR format so that unsupervised learning is enabled in a closed loop. Designed as a meta-algorithm, our approach can be seamlessly integrated with existing NVS models. Extensive experiments show that Mono-HDR-3D significantly outperforms previous methods. Source code will be released.

* It has been accepted by ICML 2025

Via

Access Paper or Ask Questions

Operator Learning with Domain Decomposition for Geometry Generalization in PDE Solving

Apr 01, 2025

Jianing Huang, Kaixuan Zhang, Youjia Wu, Ze Cheng

Abstract:Neural operators have become increasingly popular in solving \textit{partial differential equations} (PDEs) due to their superior capability to capture intricate mappings between function spaces over complex domains. However, the data-hungry nature of operator learning inevitably poses a bottleneck for their widespread applications. At the core of the challenge lies the absence of transferability of neural operators to new geometries. To tackle this issue, we propose operator learning with domain decomposition, a local-to-global framework to solve PDEs on arbitrary geometries. Under this framework, we devise an iterative scheme \textit{Schwarz Neural Inference} (SNI). This scheme allows for partitioning of the problem domain into smaller subdomains, on which local problems can be solved with neural operators, and stitching local solutions to construct a global solution. Additionally, we provide a theoretical analysis of the convergence rate and error bound. We conduct extensive experiments on several representative PDEs with diverse boundary conditions and achieve remarkable geometry generalization compared to alternative methods. These analysis and experiments demonstrate the proposed framework's potential in addressing challenges related to geometry generalization and data efficiency.

Via

Access Paper or Ask Questions

Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

Jun 21, 2023

Boyan Li, Luziwei Leng, Ran Cheng, Shuaijie Shen, Kaixuan Zhang, Jianguo Zhang, Jianxing Liao

Figure 1 for Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

Figure 2 for Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

Figure 3 for Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

Figure 4 for Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference

Abstract:Advancements in adapting deep convolution architectures for Spiking Neural Networks (SNNs) have significantly enhanced image classification performance and reduced computational burdens. However, the inability of Multiplication-Free Inference (MFI) to harmonize with attention and transformer mechanisms, which are critical to superior performance on high-resolution vision tasks, imposes limitations on these gains. To address this, our research explores a new pathway, drawing inspiration from the progress made in Multi-Layer Perceptrons (MLPs). We propose an innovative spiking MLP architecture that uses batch normalization to retain MFI compatibility and introduces a spiking patch encoding layer to reinforce local feature extraction capabilities. As a result, we establish an efficient multi-stage spiking MLP network that effectively blends global receptive fields with local feature extraction for comprehensive spike-based computation. Without relying on pre-training or sophisticated SNN training techniques, our network secures a top-1 accuracy of 66.39% on the ImageNet-1K dataset, surpassing the directly trained spiking ResNet-34 by 2.67%. Furthermore, we curtail computational costs, model capacity, and simulation steps. An expanded version of our network challenges the performance of the spiking VGG-16 network with a 71.64% top-1 accuracy, all while operating with a model capacity 2.1 times smaller. Our findings accentuate the potential of our deep SNN architecture in seamlessly integrating global and local learning abilities. Interestingly, the trained receptive field in our network mirrors the activity patterns of cortical cells.

* 11 pages, 6 figures

Via

Access Paper or Ask Questions

A Large Dataset of Historical Japanese Documents with Complex Layouts

Apr 18, 2020

Zejiang Shen, Kaixuan Zhang, Melissa Dell

Figure 1 for A Large Dataset of Historical Japanese Documents with Complex Layouts

Figure 2 for A Large Dataset of Historical Japanese Documents with Complex Layouts

Figure 3 for A Large Dataset of Historical Japanese Documents with Complex Layouts

Figure 4 for A Large Dataset of Historical Japanese Documents with Complex Layouts

Abstract:Deep learning-based approaches for automatic document layout analysis and content extraction have the potential to unlock rich information trapped in historical documents on a large scale. One major hurdle is the lack of large datasets for training robust models. In particular, little training data exist for Asian languages. To this end, we present HJDataset, a Large Dataset of Historical Japanese Documents with Complex Layouts. It contains over 250,000 layout element annotations of seven types. In addition to bounding boxes and masks of the content regions, it also includes the hierarchical structures and reading orders for layout elements. The dataset is constructed using a combination of human and machine efforts. A semi-rule based method is developed to extract the layout elements, and the results are checked by human inspectors. The resulting large-scale dataset is used to provide baseline performance analyses for text region detection using state-of-the-art deep learning models. And we demonstrate the usefulness of the dataset on real-world document digitization tasks. The dataset is available at https://dell-research-harvard.github.io/HJDataset/.

* 8 pages, 8 figures, accepted at CVPR2020 Workshop on Text and Documents in the Deep Learning Era

Via

Access Paper or Ask Questions

Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata

Nov 12, 2019

Qinglong Wang, Kaixuan Zhang, Xue Liu, C. Lee Giles

Figure 1 for Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata

Figure 2 for Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata

Figure 3 for Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata

Figure 4 for Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata

Abstract:We propose an approach that connects recurrent networks with different orders of hidden interaction with regular grammars of different levels of complexity. We argue that the correspondence between recurrent networks and formal computational models gives understanding to the analysis of the complicated behaviors of recurrent networks. We introduce an entropy value that categorizes all regular grammars into three classes with different levels of complexity, and show that several existing recurrent networks match grammars from either all or partial classes. As such, the differences between regular grammars reveal the different properties of these models. We also provide a unification of all investigated recurrent networks. Our evaluation shows that the unified recurrent network has improved performance in learning grammars, and demonstrates comparable performance on a real-world dataset with more complicated models.

Via

Access Paper or Ask Questions

Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

Oct 15, 2019

Kaixuan Zhang, Qinglong Wang, Xue Liu, C. Lee Giles

Figure 1 for Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

Figure 2 for Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

Figure 3 for Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

Figure 4 for Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

Abstract:Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (iid). Recent research has demonstrated that this assumption can be problematic as it simplifies the manifold of structured data. This has motivated different research areas such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley Homology framework, which provides a quantitative metric for the influence of a sample of the homology of a simplicial complex. By interpreting the influence as a probability measure, we further define an entropy which reflects the complexity of the data manifold. Our empirical studies show that when using the 0-dimensional homology, on neighboring graphs, samples with higher influence scores have more impact on the accuracy of neural networks for determining the graph connectivity and on several regular grammars whose higher entropy values imply more difficulty in being learned.

Via

Access Paper or Ask Questions

Verification of Recurrent Neural Networks Through Rule Extraction

Nov 14, 2018

Qinglong Wang, Kaixuan Zhang, Xue Liu, C. Lee Giles

Figure 1 for Verification of Recurrent Neural Networks Through Rule Extraction

Figure 2 for Verification of Recurrent Neural Networks Through Rule Extraction

Figure 3 for Verification of Recurrent Neural Networks Through Rule Extraction

Figure 4 for Verification of Recurrent Neural Networks Through Rule Extraction

Abstract:The verification problem for neural networks is verifying whether a neural network will suffer from adversarial samples, or approximating the maximal allowed scale of adversarial perturbation that can be endured. While most prior work contributes to verifying feed-forward networks, little has been explored for verifying recurrent networks. This is due to the existence of a more rigorous constraint on the perturbation space for sequential data, and the lack of a proper metric for measuring the perturbation. In this work, we address these challenges by proposing a metric which measures the distance between strings, and use deterministic finite automata (DFA) to represent a rigorous oracle which examines if the generated adversarial samples violate certain constraints on a perturbation. More specifically, we empirically show that certain recurrent networks allow relatively stable DFA extraction. As such, DFAs extracted from these recurrent networks can serve as a surrogate oracle for when the ground truth DFA is unknown. We apply our verification mechanism to several widely used recurrent networks on a set of the Tomita grammars. The results demonstrate that only a few models remain robust against adversarial samples. In addition, we show that for grammars with different levels of complexity, there is also a difference in the difficulty of robust learning of these grammars.

Via

Access Paper or Ask Questions

A Comparison of Rule Extraction for Different Recurrent Neural Network Models and Grammatical Complexity

Jan 16, 2018

Qinglong Wang, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles

Figure 1 for A Comparison of Rule Extraction for Different Recurrent Neural Network Models and Grammatical Complexity

Figure 2 for A Comparison of Rule Extraction for Different Recurrent Neural Network Models and Grammatical Complexity

Figure 3 for A Comparison of Rule Extraction for Different Recurrent Neural Network Models and Grammatical Complexity

Figure 4 for A Comparison of Rule Extraction for Different Recurrent Neural Network Models and Grammatical Complexity

Abstract:It has been shown that rules can be extracted from highly non-linear, recursive models such as recurrent neural networks (RNNs). The RNN models mostly investigated include both Elman networks and second-order recurrent networks. Recently, new types of RNNs have demonstrated superior power in handling many machine learning tasks, especially when structural data is involved such as language modeling. Here, we empirically evaluate different recurrent models on the task of learning deterministic finite automata (DFA), the seven Tomita grammars. We are interested in the capability of recurrent models with different architectures in learning and expressing regular grammars, which can be the building blocks for many applications dealing with structural data. Our experiments show that a second-order RNN provides the best and stablest performance of extracting DFA over all Tomita grammars and that other RNN models are greatly influenced by different Tomita grammars. To better understand these results, we provide a theoretical analysis of the "complexity" of different grammars, by introducing the entropy and the averaged edit distance of regular grammars defined in this paper. Through our analysis, we categorize all Tomita grammars into different classes, which explains the inconsistency in the performance of extraction observed across all RNN models.

Via

Access Paper or Ask Questions

An Empirical Evaluation of Rule Extraction from Recurrent Neural Networks

Nov 28, 2017

Qinglong Wang, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles

Abstract:Rule extraction from black-box models is critical in domains that require model validation before implementation, as can be the case in credit scoring and medical diagnosis. Though already a challenging problem in statistical learning in general, the difficulty is even greater when highly non-linear, recursive models, such as recurrent neural networks (RNNs), are fit to data. Here, we study the extraction of rules from second order recurrent neural networks trained to recognize the Tomita grammars. We show that production rules can be stably extracted from trained RNNs and that in certain cases the rules outperform the trained RNNs.

Via

Access Paper or Ask Questions

Learning Adversary-Resistant Deep Neural Networks

Aug 19, 2017

Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles

Figure 1 for Learning Adversary-Resistant Deep Neural Networks

Figure 2 for Learning Adversary-Resistant Deep Neural Networks

Figure 3 for Learning Adversary-Resistant Deep Neural Networks

Figure 4 for Learning Adversary-Resistant Deep Neural Networks

Abstract:Deep neural networks (DNNs) have proven to be quite effective in a vast array of machine learning tasks, with recent examples in cyber security and autonomous vehicles. Despite the superior performance of DNNs in these applications, it has been recently shown that these models are susceptible to a particular type of attack that exploits a fundamental flaw in their design. This attack consists of generating particular synthetic examples referred to as adversarial samples. These samples are constructed by slightly manipulating real data-points in order to "fool" the original DNN model, forcing it to mis-classify previously correctly classified samples with high confidence. Addressing this flaw in the model is essential if DNNs are to be used in critical applications such as those in cyber security. Previous work has provided various learning algorithms to enhance the robustness of DNN models, and they all fall into the tactic of "security through obscurity". This means security can be guaranteed only if one can obscure the learning algorithms from adversaries. Once the learning technique is disclosed, DNNs protected by these defense mechanisms are still susceptible to adversarial samples. In this work, we investigate this issue shared across previous research work and propose a generic approach to escalate a DNN's resistance to adversarial samples. More specifically, our approach integrates a data transformation module with a DNN, making it robust even if we reveal the underlying learning algorithm. To demonstrate the generality of our proposed approach and its potential for handling cyber security applications, we evaluate our method and several other existing solutions on datasets publicly available. Our results indicate that our approach typically provides superior classification performance and resistance in comparison with state-of-art solutions.

Via

Access Paper or Ask Questions