Yong Deng

Institute of Fundamental and Frontier Science, University of Electronic Science and Technology of China, Chengdu, China

BANet: Bilateral Aggregation Network for Mobile Stereo Matching

Mar 05, 2025

Gangwei Xu, Jiaxin Liu, Xianqi Wang, Junda Cheng, Yong Deng, Jinliang Zang, Yurui Chen, Xin Yang

Abstract: State-of-the-art stereo matching methods typically use costly 3D convolutions to aggregate a full cost volume, but their computational demands make mobile deployment challenging. Directly applying 2D convolutions for cost aggregation often results in edge blurring, detail loss, and mismatches in textureless regions. Some complex operations, like deformable convolutions and iterative warping, can partially alleviate this issue; however, they are not mobile-friendly, limiting their deployment on mobile devices. In this paper, we present a novel bilateral aggregation network (BANet) for mobile stereo matching that produces high-quality results with sharp edges and fine details using only 2D convolutions. Specifically, we first separate the full cost volume into detailed and smooth volumes using a spatial attention map, then perform detailed and smooth aggregations accordingly, ultimately fusing both to obtain the final disparity map. Additionally, to accurately identify high-frequency detailed regions and low-frequency smooth/textureless regions, we propose a new scale-aware spatial attention module. Experimental results demonstrate that our BANet-2D significantly outperforms other mobile-friendly methods, achieving 35.3% higher accuracy on the KITTI 2015 leaderboard than MobileStereoNet-2D, with faster runtime on mobile devices. The extended 3D version, BANet-3D, achieves the highest accuracy among all real-time methods on high-end GPUs. Code: https://github.com/gangweiX/BANet.
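The separation step described in the abstract can be pictured with a small sketch. It assumes the attention map simply weights the cost volume, with the complement going to the smooth branch; the abstract does not state the exact formulation, and the function name `split_cost_volume` and the nested-list layout are hypothetical.

```python
def split_cost_volume(cost, attention):
    """Split a cost volume [D][H][W] into detailed and smooth parts.

    attention is an [H][W] map with values in [0, 1]; values near 1 mark
    high-frequency (detailed) pixels. The complementary weighting used
    here is an assumption, not the paper's confirmed formulation.
    """
    detailed = [[[a * c for a, c in zip(a_row, c_row)]
                 for a_row, c_row in zip(attention, plane)]
                for plane in cost]
    smooth = [[[(1.0 - a) * c for a, c in zip(a_row, c_row)]
               for a_row, c_row in zip(attention, plane)]
              for plane in cost]
    return detailed, smooth


# Two disparity planes over a 1x2 image; the first pixel is fully "detailed".
cost = [[[1.0, 2.0]], [[3.0, 4.0]]]
attention = [[1.0, 0.25]]
detailed, smooth = split_cost_volume(cost, attention)
```

Under this weighting the two volumes sum back to the original cost volume, so each branch can be aggregated separately before the results are fused.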

* 12 pages

MonSter: Marry Monodepth to Stereo Unleashes Power

Jan 15, 2025

Junda Cheng, Longliang Liu, Gangwei Xu, Xianqi Wang, Zhaoxing Zhang, Yong Deng, Jinliang Zang, Yurui Chen, Zhipeng Cai, Xin Yang

Abstract: Stereo matching recovers depth from image correspondences. Existing methods struggle to handle ill-posed regions with limited matching cues, such as occlusions and textureless areas. To address this, we propose MonSter, a novel method that leverages the complementary strengths of monocular depth estimation and stereo matching. MonSter integrates monocular depth and stereo matching into a dual-branch architecture in which the two branches iteratively improve each other. Confidence-based guidance adaptively selects reliable stereo cues for monodepth scale-shift recovery. The refined monodepth in turn guides stereo effectively at ill-posed regions. Such iterative mutual enhancement enables MonSter to evolve monodepth priors from coarse object-level structures to pixel-level geometry, fully unlocking the potential of stereo matching. As shown in Fig. 1, MonSter ranks 1st across the five most commonly used leaderboards (SceneFlow, KITTI 2012, KITTI 2015, Middlebury, and ETH3D), achieving up to a 49.5% improvement (Bad 1.0 on ETH3D) over the previous best method. Comprehensive analysis verifies the effectiveness of MonSter in ill-posed regions. In terms of zero-shot generalization, MonSter significantly and consistently outperforms the state of the art across the board. The code is publicly available at: https://github.com/Junda24/MonSter.
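Scale-shift recovery can be illustrated as a confidence-weighted least-squares fit of `scale * mono + shift` to the stereo disparity. This is a minimal sketch under assumptions: the threshold rule, the use of confidence as the weight, and the name `fit_scale_shift` are illustrative and not taken from the MonSter code.

```python
def fit_scale_shift(mono, stereo, confidence, threshold=0.5):
    """Fit scale and shift so that scale * mono + shift approximates stereo.

    Pixels whose confidence falls below the threshold are ignored; the rest
    are weighted by their confidence. Both choices are assumptions made for
    illustration.
    """
    sw = sm = sd = smm = smd = 0.0
    for m, d, c in zip(mono, stereo, confidence):
        w = c if c >= threshold else 0.0
        sw += w
        sm += w * m
        sd += w * d
        smm += w * m * m
        smd += w * m * d
    det = sw * smm - sm * sm
    if abs(det) < 1e-12:
        raise ValueError("not enough reliable pixels to fit scale and shift")
    scale = (sw * smd - sm * sd) / det
    shift = (smm * sd - sm * smd) / det
    return scale, shift


# The last stereo value is an outlier with low confidence, so it is ignored.
scale, shift = fit_scale_shift(
    mono=[0.0, 1.0, 2.0, 3.0],
    stereo=[1.0, 3.0, 5.0, 100.0],
    confidence=[0.9, 0.9, 0.9, 0.1],
)
```

With the outlier excluded, the remaining points lie on the line `2 * mono + 1`, which is what the fit recovers.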

StereoGen: High-quality Stereo Image Generation from a Single Image

Jan 15, 2025

Xianqi Wang, Hao Yang, Gangwei Xu, Junda Cheng, Min Lin, Yong Deng, Jinliang Zang, Yurui Chen, Xin Yang

Abstract: State-of-the-art supervised stereo matching methods have achieved amazing results on various benchmarks. However, these data-driven methods generalize poorly to real-world scenarios due to the lack of real-world annotated data. In this paper, we propose StereoGen, a novel pipeline for high-quality stereo image generation. This pipeline utilizes arbitrary single images as left images and pseudo disparities generated by a monocular depth estimation model to synthesize high-quality corresponding right images. Unlike previous methods that fill the occluded area in warped right images using random backgrounds or using convolutions to take nearby pixels selectively, we fine-tune a diffusion inpainting model to recover the background. Images generated by our model possess better details and undamaged semantic structures. In addition, we propose Training-free Confidence Generation and Adaptive Disparity Selection. The former suppresses the negative effect of harmful pseudo ground truth during stereo training, while the latter helps generate a wider disparity distribution and better synthetic images. Experiments show that models trained under our pipeline achieve state-of-the-art zero-shot generalization results among all published methods. The code will be available upon publication of the paper.

Reliability Assessment of Information Sources Based on Random Permutation Set

Oct 30, 2024

Juntao Xu, Tianxiang Zhan, Yong Deng

Abstract: In pattern recognition, handling uncertainty is a critical challenge that significantly affects decision-making and classification accuracy. Dempster-Shafer Theory (DST) is an effective reasoning framework for addressing uncertainty, and the Random Permutation Set (RPS) extends DST by additionally considering the internal order of elements, forming a more ordered extension of DST. However, there is a lack of a transformation method based on permutation order between RPS and DST, as well as a sequence-based probability transformation method for RPS. Moreover, the reliability of RPS sources remains an issue that requires attention. To address these challenges, this paper proposes an RPS transformation approach and a probability transformation method tailored for RPS. On this basis, a reliability computation method for RPS sources, based on the RPS probability transformation, is introduced and applied to pattern recognition. Experimental results demonstrate that the proposed approach effectively bridges the gap between DST and RPS and achieves superior recognition accuracy in classification problems.

* 10 pages

Evaluating Evidential Reliability In Pattern Recognition Based On Intuitionistic Fuzzy Sets

Oct 30, 2024

Juntao Xu, Tianxiang Zhan, Yong Deng

Abstract: Determining the reliability of evidence sources is a crucial topic in Dempster-Shafer theory (DST). Previous approaches have addressed high conflicts between evidence sources using discounting methods, but these methods may not ensure the high efficiency of classification models. In this paper, we consider the combination of DST and Intuitionistic Fuzzy Sets (IFS) and propose an algorithm for quantifying the reliability of evidence sources, called Fuzzy Reliability Index (FRI). The FRI algorithm is based on decision quantification rules derived from IFS, defining the contribution of different BPAs to correct decisions and deriving the evidential reliability from these contributions. The proposed method effectively enhances the rationality of reliability estimation for evidence sources, making it particularly suitable for classification decision problems in complex scenarios. Subsequent comparisons with DST-based algorithms and classical machine learning algorithms demonstrate the superiority and generalizability of the FRI algorithm. The FRI algorithm provides a new perspective for future decision probability conversion and reliability analysis of evidence sources.
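For readers new to DST, the discounting methods mentioned in the abstract can be sketched alongside Dempster's rule of combination. The code below shows classical Shafer discounting with a given reliability `alpha`; it does not implement the FRI algorithm, and representing a BPA as a dict keyed by `frozenset` is a choice made only for this sketch.

```python
def discount(m, alpha, frame):
    """Shafer discounting: keep a fraction alpha of each mass and move
    the remainder to the whole frame of discernment."""
    out = {A: alpha * v for A, v in m.items() if A != frame}
    out[frame] = 1.0 - alpha + alpha * m.get(frame, 0.0)
    return out


def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses, intersect focal sets, and
    renormalize by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for A, a in m1.items():
        for B, b in m2.items():
            C = A & B
            if C:
                combined[C] = combined.get(C, 0.0) + a * b
            else:
                conflict += a * b  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {C: v / (1.0 - conflict) for C, v in combined.items()}


frame = frozenset("ab")
a, b = frozenset("a"), frozenset("b")
m1 = {a: 0.8, frame: 0.2}
m2 = {b: 0.6, frame: 0.4}
fused = dempster_combine(m1, m2)
weakened = discount(m1, 0.5, frame)
```

A reliability estimate such as the one FRI produces would take the place of the hand-picked `alpha` before sources are combined.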

* 35 pages

Disambiguating Monocular Reconstruction of 3D Clothed Human with Spatial-Temporal Transformer

Oct 21, 2024

Yong Deng, Baoxing Li, Xu Zhao

Abstract: Reconstructing 3D clothed humans from monocular camera data is highly challenging due to viewpoint limitations and image ambiguity. While implicit function-based approaches, combined with prior knowledge from parametric models, have made significant progress, there are still two notable problems. Firstly, the back details of human models are ambiguous due to viewpoint invisibility. The quality of the back details depends on the back normal map predicted by a convolutional neural network (CNN). However, the CNN lacks global information awareness for comprehending the back texture, resulting in excessively smooth back details. Secondly, a single image suffers from local ambiguity due to lighting conditions and body movement. However, implicit functions are highly sensitive to pixel variations in ambiguous regions. To address these ambiguities, we propose the Spatial-Temporal Transformer (STT) network for 3D clothed human reconstruction. A spatial transformer is employed to extract global information for normal map prediction. The establishment of global correlations facilitates the network in comprehending the holistic texture and shape of the human body. Simultaneously, to compensate for local ambiguity in images, a temporal transformer is utilized to extract temporal features from adjacent frames. The incorporation of temporal features can enhance the accuracy of input features in implicit networks. Furthermore, to obtain more accurate temporal features, joint tokens are employed to establish local correspondences between frames. Experimental results on the Adobe and MonoPerfCap datasets have shown that our method outperforms state-of-the-art methods and maintains robust generalization even under low-light outdoor conditions.

Transferable Belief Model on Quantum Circuits

Oct 11, 2024

Qianli Zhou, Hao Luo, Lipeng Pan, Yong Deng, Eloi Bosse

Abstract: The transferable belief model, as a semantic interpretation of Dempster-Shafer theory, enables agents to perform reasoning and decision making in imprecise and incomplete environments. The model offers distinct semantics for handling unreliable testimonies, allowing for a more reasonable and general process of belief transfer compared to the Bayesian approach. However, because both the belief masses and the structure of focal sets must be considered when updating belief functions, which leads to extra computational complexity during reasoning, the transferable belief model has gradually lost favor among researchers in recent developments. In this paper, we implement the transferable belief model on quantum circuits and demonstrate that belief functions offer a more concise and effective alternative to Bayesian approaches within the quantum computing framework. Furthermore, leveraging the unique characteristics of quantum computing, we propose several novel belief transfer approaches. More broadly, this paper introduces a new perspective on basic information representation for quantum AI models, suggesting that belief functions are more suitable than the Bayesian approach for handling uncertainty on quantum circuits.

Time Evidence Fusion Network: Multi-source View in Long-Term Time Series Forecasting

May 10, 2024

Tianxiang Zhan, Yuanpeng He, Zhen Li, Yong Deng

Abstract: In real-world scenarios, time series forecasting often demands timeliness, making research on model backbones a perennially hot topic. To meet these performance demands, we propose a novel backbone from the perspective of information fusion. Introducing the Basic Probability Assignment (BPA) Module and the Time Evidence Fusion Network (TEFN), based on evidence theory, allows us to achieve superior performance. In addition, the perspective of multi-source information fusion effectively improves the accuracy of forecasting. Because BPA is generated by fuzzy theory, TEFN also has considerable interpretability. In experiments on real data, TEFN partially achieves state-of-the-art results, with low errors comparable to PatchTST and operating efficiency surpassing that of performance-oriented models such as Dlinear. Meanwhile, TEFN is highly robust and shows small error fluctuations under random hyperparameter selection. TEFN is not a model that achieves the ultimate in any single aspect, but a model that balances performance, accuracy, stability, and interpretability.

Isopignistic Canonical Decomposition via Belief Evolution Network

May 04, 2024

Qianli Zhou, Tianxiang Zhan, Yong Deng

Abstract: Developing a general information processing model in uncertain environments is fundamental for the advancement of explainable artificial intelligence. Dempster-Shafer theory of evidence is a well-known and effective reasoning method for representing epistemic uncertainty, which is closely related to subjective probability theory and possibility theory. Although they can be transformed into each other under some particular belief structures, there remains a lack of a clear and interpretable transformation process, as well as a unified approach for information processing. In this paper, we aim to address these issues from the perspectives of isopignistic belief functions and the hyper-cautious transferable belief model. Firstly, we propose an isopignistic transformation based on the belief evolution network. This transformation allows for the adjustment of the information granule while retaining the potential decision outcome. The isopignistic transformation is integrated with a hyper-cautious transferable belief model to establish a new canonical decomposition. This decomposition offers a reverse path between the possibility distribution and its isopignistic mass functions. The result of the canonical decomposition, called the isopignistic function, is an identical information content distribution that reflects the propensity and relative commitment degree of the BPA. Furthermore, this paper introduces a method to reconstruct the basic belief assignment by adjusting the isopignistic function. It explores the advantages of this approach in modeling and handling uncertainty within the hyper-cautious transferable belief model. More generally, this paper establishes a theoretical basis for building general models of artificial intelligence based on probability theory, Dempster-Shafer theory, and possibility theory.
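The standard pignistic probability transformation helps make the word "isopignistic" concrete. The sketch below reads it as "sharing one pignistic probability", which is an interpretation of the abstract rather than a definition quoted from the paper; the belief evolution network itself is not implemented, and the dict-of-`frozenset` representation is an assumption of this example.

```python
def pignistic(m):
    """Pignistic probability: BetP(x) is the sum, over focal sets A that
    contain x, of m(A) / (|A| * (1 - m(empty set)))."""
    norm = 1.0 - m.get(frozenset(), 0.0)
    if norm <= 0.0:
        raise ValueError("all mass lies on the empty set")
    betp = {}
    for A, mass in m.items():
        for x in A:  # the empty set contributes nothing
            betp[x] = betp.get(x, 0.0) + mass / (len(A) * norm)
    return betp


# A Bayesian mass function and the vacuous one over the same frame.
bayesian = {frozenset("a"): 0.5, frozenset("b"): 0.5}
vacuous = {frozenset("ab"): 1.0}
betp_bayesian = pignistic(bayesian)
betp_vacuous = pignistic(vacuous)
```

Both mass functions map to the same pignistic probability even though they commit belief very differently, which is the kind of freedom an adjustment of the information granule can exploit without changing the decision outcome.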

Random Walk in Random Permutation Set Theory

Apr 05, 2024

Jiefeng Zhou, Zhen Li, Yong Deng

Abstract: Random walk is an explainable approach for modeling natural processes at the molecular level. The Random Permutation Set Theory (RPST) serves as a framework for uncertainty reasoning, extending the applicability of Dempster-Shafer Theory. Recent explorations indicate a promising link between RPST and random walk. In this study, we conduct an analysis and construct a random walk model based on the properties of RPST, with Monte Carlo simulations of such random walk. Our findings reveal that the random walk generated through RPST exhibits characteristics similar to those of a Gaussian random walk and can be transformed into a Wiener process through a specific limiting scaling procedure. This investigation establishes a novel connection between RPST and random walk theory, thereby not only expanding the applicability of RPST, but also demonstrating the potential for combining the strengths of both approaches to improve problem-solving abilities.
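The limiting scaling mentioned in the abstract can be illustrated with an ordinary Gaussian random walk. This sketch does not reproduce the RPST-generated walk; it only shows, by Monte Carlo simulation, that dividing the endpoint of an n-step walk by sqrt(n) gives a variance near 1, which matches a Wiener process at time t = 1. The function names and the fixed seed are choices made for this example.

```python
import random


def scaled_endpoint(n_steps, rng):
    """Return S_n / sqrt(n) for a walk with i.i.d. standard Gaussian steps."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n_steps))
    return total / n_steps ** 0.5


def endpoint_variance(n_walks, n_steps, seed=0):
    """Monte Carlo estimate of the variance of the scaled endpoint."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    ends = [scaled_endpoint(n_steps, rng) for _ in range(n_walks)]
    mean = sum(ends) / n_walks
    return sum((e - mean) ** 2 for e in ends) / (n_walks - 1)


variance = endpoint_variance(n_walks=2000, n_steps=100, seed=1)
```

The estimate stays close to 1 whatever the number of steps, which is the property the sqrt(n) scaling is meant to preserve.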

* 27 pages, 8 figures
