Yuxiang Zhao

UAV-Based Remote Sensing of Soil Moisture Across Diverse Land Covers: Validation and Bayesian Uncertainty Characterization

Jun 05, 2025

Runze Zhang, Ishfaq Aziz, Derek Houtz, Yuxiang Zhao, Trent W. Ford, Adam C. Watts, Mohamad Alipour

Abstract: High-resolution soil moisture (SM) observations are critical for agricultural monitoring, forestry management, and hazard prediction, yet current satellite passive microwave missions cannot directly provide retrievals at tens-of-meter spatial scales. Unmanned aerial vehicle (UAV) mounted microwave radiometry presents a promising alternative, but most evaluations to date have focused on agricultural settings, with limited exploration across other land covers and few efforts to quantify retrieval uncertainty. This study addresses both gaps by evaluating SM retrievals from a drone-based Portable L-band Radiometer (PoLRa) across shrubland, bare soil, and forest strips in Central Illinois, U.S., using a 10-day field campaign in 2024. Controlled UAV flights at altitudes of 10 m, 20 m, and 30 m were performed to generate brightness temperatures (TB) at spatial resolutions of 7 m, 14 m, and 21 m. SM retrievals were carried out using multiple tau-omega-based algorithms, including the single channel algorithm (SCA), dual channel algorithm (DCA), and multi-temporal dual channel algorithm (MTDCA). A Bayesian inference framework was then applied to provide probabilistic uncertainty characterization for both SM and vegetation optical depth (VOD). Results show that the gridded TB distributions consistently capture dry-wet gradients associated with vegetation density variations, and spatial correlations between polarized observations are largely maintained across scales. Validation against in situ measurements indicates that PoLRa-derived SM retrievals from the SCAV and MTDCA algorithms achieve unbiased root-mean-square errors (ubRMSE) generally below 0.04 m³/m³ across different land covers. Bayesian posterior analyses confirm that reference SM values largely fall within the derived uncertainty intervals, with mean uncertainty ranges around 0.02 m³/m³ and 0.11 m³/m³ for SCA- and DCA-related retrievals, respectively.
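
The ubRMSE metric cited in this abstract is the RMSE that remains after the mean bias between retrievals and reference values is removed. The sketch below uses the standard definition, ubRMSE² = RMSE² − bias²; the function name and the sample values are hypothetical and are not taken from the paper.

```python
import math


def ubrmse(retrieved, reference):
    """Unbiased RMSE: RMSE of the errors after subtracting their mean (the bias)."""
    if len(retrieved) != len(reference) or not retrieved:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [r - t for r, t in zip(retrieved, reference)]
    bias = sum(errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    # max() guards against a tiny negative difference caused by rounding
    return math.sqrt(max(mse - bias * bias, 0.0))


if __name__ == "__main__":
    # Hypothetical soil moisture values (volumetric), for illustration only
    retrieved = [0.21, 0.25, 0.30, 0.18]
    reference = [0.20, 0.23, 0.27, 0.18]
    print(round(ubrmse(retrieved, reference), 4))
```

A retrieval that is offset from the reference by a constant has a ubRMSE of zero even though its RMSE is not, which is why the metric is usually reported alongside bias.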

Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception

Dec 18, 2024

Yanpeng Sun, Jing Hao, Ke Zhu, Jiang-Jiang Liu, Yuxiang Zhao, Xiaofan Li, Gang Zhang, Zechao Li, Jingdong Wang

Abstract: Training Large Multimodality Models (LMMs) relies on descriptive image captions that connect image and language. Existing methods either distill captions from LMMs or construct them from internet images or human annotation. We propose to leverage off-the-shelf visual specialists, which were originally trained on annotated images for tasks other than image captioning, to enhance the image caption. Our approach, named DCE, explores objects' low-level and fine-grained attributes (e.g., depth, emotion, and fine-grained categories) and object relations (e.g., relative location and human-object interaction (HOI)), and combines these attributes into the descriptive caption. Experiments demonstrate that such visual specialists improve performance on visual understanding tasks as well as on reasoning that benefits from more accurate visual understanding. We will release the source code and the pipeline so that other visual specialists can be easily combined into the pipeline. The complete source code of the DCE pipeline and the datasets will be available at https://github.com/syp2ysy/DCE.

* An open-source data engine for generating detailed image captions

Effective Rank and the Staircase Phenomenon: New Insights into Neural Network Training Dynamics

Dec 06, 2024

Yang Jiang, Yuxiang Zhao, Quanhui Zhu

Abstract: In recent years, deep learning, powered by neural networks, has achieved widespread success in solving high-dimensional problems, particularly those with low-dimensional feature structures. This success stems from the ability of neural networks to identify and learn low-dimensional features tailored to the problems. Understanding how neural networks extract such features during training remains a fundamental question in deep learning theory. In this work, we propose a novel perspective by interpreting the neurons in the last hidden layer of a neural network as basis functions that represent essential features. To explore the linear independence of these basis functions throughout the deep learning dynamics, we introduce the concept of 'effective rank'. Our extensive numerical experiments reveal a notable phenomenon: the effective rank increases progressively during the learning process, exhibiting a staircase-like pattern, while the loss function concurrently decreases as the effective rank rises. We refer to this observation as the 'staircase phenomenon'. Specifically, for deep neural networks, we rigorously prove the negative correlation between the loss function and effective rank, demonstrating that the lower bound of the loss function decreases with increasing effective rank. Therefore, to achieve a rapid descent of the loss function, it is critical to promote the swift growth of effective rank. Finally, we evaluate existing advanced learning methodologies and find that these approaches can quickly achieve a higher effective rank, thereby avoiding redundant staircase processes and accelerating the decline of the loss function.
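
The abstract does not state how 'effective rank' is defined, so the sketch below substitutes a common entropy-based definition (due to Roy and Vetterli) purely as an illustration: the exponential of the Shannon entropy of the normalized singular values. This is an assumption and may differ from the paper's definition. The function name is hypothetical, and the singular values are taken as given (for instance, from an SVD of the matrix of last-hidden-layer activations).

```python
import math


def effective_rank(singular_values):
    """Entropy-based effective rank: exp(H(p)), with p_i = s_i / sum(s).

    Assumed stand-in definition; the paper's own definition may differ.
    """
    if any(s < 0 for s in singular_values):
        raise ValueError("singular values must be non-negative")
    total = sum(singular_values)
    if total <= 0:
        raise ValueError("at least one singular value must be positive")
    # Zero singular values contribute nothing to the entropy
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


if __name__ == "__main__":
    # Hypothetical spectra: nearly dependent basis functions vs. independent ones
    print(effective_rank([10.0, 0.1, 0.1, 0.1]))
    print(effective_rank([1.0, 1.0, 1.0, 1.0]))
```

Under this definition, a spectrum dominated by one singular value gives a value close to 1, and a flat spectrum of n equal values gives n, so a staircase-like rise corresponds to new directions becoming significant one group at a time.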

PDNNet: PDN-Aware GNN-CNN Heterogeneous Network for Dynamic IR Drop Prediction

Mar 27, 2024

Yuxiang Zhao, Zhuomin Chai, Xun Jiang, Yibo Lin, Runsheng Wang, Ru Huang

Abstract: IR drop on the power delivery network (PDN) is closely related to the PDN's configuration and cell current consumption. As integrated circuit (IC) designs grow larger, dynamic IR drop simulation becomes computationally unaffordable, and machine-learning-based IR drop prediction has been explored as a promising solution. Although CNN-based methods have been adapted to the IR drop prediction task in several works, the shortcoming of overlooking the PDN configuration is non-negligible. In this paper, we consider not only how to properly represent the cell-PDN relation, but also how to model IR drop following its physical nature in the feature aggregation procedure. Thus, we propose a novel graph structure, PDNGraph, to unify the representations of the PDN structure and the fine-grained cell-PDN relation. We further propose a dual-branch heterogeneous network, PDNNet, incorporating two parallel GNN-CNN branches to capture the above features during the learning process. Several key designs are presented to make the dynamic IR drop prediction highly effective and interpretable. This is the first work to apply a graph structure to deep-learning-based dynamic IR drop prediction. Experiments show that PDNNet outperforms the state-of-the-art CNN-based methods by up to 39.3% reduction in prediction error and achieves a 545x speedup compared to the commercial tool, which demonstrates the superiority of our method.

UAS-based Automated Structural Inspection Path Planning via Visual Data Analytics and Optimization

Dec 22, 2023

Yuxiang Zhao, Benhao Lu, Mohamad Alipour

Abstract: Unmanned Aerial Systems (UAS) have gained significant traction for their application in infrastructure inspections. However, considering the enormous scale and complex nature of infrastructure, automation is essential for improving the efficiency and quality of inspection operations. One of the core problems in this regard is selecting an optimal automated flight path that can achieve the mission objectives while minimizing flight time. This paper presents an effective formulation for the path planning problem in the context of structural inspections. Coverage is guaranteed as a constraint to ensure damage detectability, and path length is minimized as an objective, thus maximizing efficiency while ensuring inspection quality. A two-stage algorithm is then devised to solve the path planning problem, composed of a genetic algorithm for determining the positions of viewpoints and a greedy algorithm for calculating the poses. A comprehensive sensitivity analysis is conducted to demonstrate the proposed algorithm's effectiveness and range of applicability. Applied examples of the algorithm, including partial space inspection with no-fly zones and focused inspection, are also presented, demonstrating the flexibility of the proposed method to meet real-world structural inspection requirements. In conclusion, the results of this study highlight the feasibility of the proposed approach and establish the groundwork for incorporating automation into UAS-based structural inspection mission planning.

ROMO: Retrieval-enhanced Offline Model-based Optimization

Oct 19, 2023

Mingcheng Chen, Haoran Zhao, Yuxiang Zhao, Hulei Fan, Hongqiao Gao, Yong Yu, Zheng Tian

Abstract: Data-driven black-box model-based optimization (MBO) problems arise in a great number of practical application scenarios, where the goal is to find a design over the whole space maximizing a black-box target function based on a static offline dataset. In this work, we consider a more general but challenging MBO setting, named constrained MBO (CoMBO), where only part of the design space can be optimized while the rest is constrained by the environment. A new challenge arising from CoMBO is that most observed designs that satisfy the constraints are mediocre in evaluation. Therefore, we focus on optimizing these mediocre designs in the offline dataset while maintaining the given constraints, rather than further boosting the best observed design as in the traditional MBO setting. We propose retrieval-enhanced offline model-based optimization (ROMO), a new derivable forward approach that retrieves the offline dataset and aggregates relevant samples to provide a trusted prediction, and uses it for gradient-based optimization. ROMO is simple to implement and outperforms state-of-the-art approaches in the CoMBO setting. Empirically, we conduct experiments on a synthetic Hartmann (3D) function dataset, an industrial CIO dataset, and a suite of modified tasks in the Design-Bench benchmark. Results show that ROMO performs well in a wide range of constrained optimization tasks.
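
To make the retrieve-then-aggregate idea concrete, here is a minimal sketch that predicts a design's score from its nearest neighbours in an offline dataset using inverse-distance weighting. It is a simple stand-in, not ROMO itself: the abstract describes a derivable forward model suitable for gradient-based optimization, whereas hard nearest-neighbour selection is not differentiable. All names, the weighting scheme, and the data are hypothetical.

```python
import math


def retrieve_and_aggregate(query, designs, scores, k=3):
    """Predict a score for `query` from its k nearest offline designs.

    Stand-in illustration only: inverse-distance weighting over hard k-NN.
    """
    if len(designs) != len(scores) or not designs:
        raise ValueError("designs and scores must be non-empty and aligned")
    # Retrieval step: rank offline samples by Euclidean distance to the query
    ranked = sorted((math.dist(query, d), s) for d, s in zip(designs, scores))
    nearest = ranked[:k]
    # Aggregation step: closer samples get larger weights (epsilon avoids 1/0)
    weights = [1.0 / (dist + 1e-8) for dist, _ in nearest]
    weighted_sum = sum(w * s for w, (_, s) in zip(weights, nearest))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Hypothetical 2-D designs with observed scores
    designs = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
    scores = [1.0, 3.0, 2.0]
    print(retrieve_and_aggregate((0.5, 0.5), designs, scores, k=2))
```

Anchoring the prediction to observed samples in this way keeps it close to the data, which is the intuition behind calling the prediction "trusted".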

* 15 pages, 9 figures

HybridNet: Dual-Branch Fusion of Geometrical and Topological Views for VLSI Congestion Prediction

May 07, 2023

Yuxiang Zhao, Zhuomin Chai, Yibo Lin, Runsheng Wang, Ru Huang

Abstract: Accurate early congestion prediction can prevent unpleasant surprises at the routing stage, playing a crucial role in helping designers iterate faster in VLSI design cycles. In this paper, we introduce a novel strategy to fully incorporate topological and geometrical features of circuits by making several key designs in our network architecture. To be more specific, we construct two individual graphs (geometry-graph, topology-graph) with distinct edge construction schemes according to their unique properties. We then propose a dual-branch network with different encoder layers in each pathway and aggregate representations with a sophisticated fusion strategy. Our network, named HybridNet, not only provides a simple yet effective way to capture the geometric interactions of cells, but also preserves the original topological relationships in the netlist. Experimental results on the ISPD2015 benchmarks show that we achieve an improvement of 10.9% compared to previous methods.

* 2023 IEEE International Symposium of EDA

CircuitNet: An Open-Source Dataset for Machine Learning Applications in Electronic Design Automation

Aug 04, 2022

Zhuomin Chai, Yuxiang Zhao, Yibo Lin, Wei Liu, Runsheng Wang, Ru Huang

Abstract: The electronic design automation (EDA) community has been actively exploring machine learning for very-large-scale-integrated computer-aided design (VLSI CAD). Many studies have explored learning-based techniques for cross-stage prediction tasks in the design flow to achieve faster design convergence. Although building machine learning (ML) models usually requires a large amount of data, most studies can only generate small internal datasets for validation due to the lack of large public datasets. In this paper, we present CircuitNet, the first open-source dataset for machine learning tasks in VLSI CAD. The dataset consists of more than 10K samples extracted from versatile runs of commercial design tools based on 6 open-source RISC-V designs.

Good Practices and A Strong Baseline for Traffic Anomaly Detection

Jun 04, 2021

Yuxiang Zhao, Wenhao Wu, Yue He, Yingying Li, Xiao Tan, Shifeng Chen

Abstract: The detection of traffic anomalies is a critical component of the intelligent city transportation management system. Previous works have proposed a variety of notable insights and taken a step forward in this field; however, dealing with the complex traffic environment remains a challenge. Moreover, the lack of high-quality data and the complexity of the traffic scene motivate us to study this problem from a hand-crafted perspective. In this paper, we propose a straightforward and efficient framework that includes pre-processing, a dynamic tracking module, and post-processing. With video stabilization, background modeling, and vehicle detection, the pre-processing phase aims to generate candidate anomalies. The dynamic tracking module seeks and locates the start time of anomalies by utilizing vehicle motion patterns and spatiotemporal status. Finally, we use post-processing to fine-tune the temporal boundary of anomalies. Our proposed framework was ranked 1st in the NVIDIA AI CITY 2021 leaderboard for traffic anomaly detection. The code is available at: https://github.com/Endeavour10020/AICity2021-Anomaly-Detection .

* We rank 1st in the CVPR 2021 NVIDIA AI CITY Challenge for Traffic Anomaly Detection

DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning

May 25, 2021

Wenhao Wu, Yuxiang Zhao, Yanwu Xu, Xiao Tan, Dongliang He, Zhikang Zou, Jin Ye, Yingying Li, Mingde Yao, Zichao Dong (+1 more)

Abstract: Long-range and short-range temporal modeling are two complementary and crucial aspects of video recognition. Most state-of-the-art methods focus on short-range spatio-temporal modeling and then average multiple snippet-level predictions to yield the final video-level prediction. Thus, their video-level prediction does not consider spatio-temporal features of how the video evolves along the temporal dimension. In this paper, we introduce a novel Dynamic Segment Aggregation (DSA) module to capture relationships among snippets. To be more specific, we attempt to generate a dynamic kernel for a convolutional operation to adaptively aggregate long-range temporal information among adjacent snippets. The DSA module is an efficient plug-and-play module and can be combined with off-the-shelf clip-based models (e.g., TSM, I3D) to perform powerful long-range modeling with minimal overhead. We call the final video architecture DSANet. We conduct extensive experiments on several video recognition benchmarks (i.e., Mini-Kinetics-200, Kinetics-400, Something-Something V1, and ActivityNet) to show its superiority. Our proposed DSA module is shown to benefit various video recognition models significantly. For example, equipped with DSA modules, the top-1 accuracy of I3D ResNet-50 is improved from 74.9% to 78.2% on Kinetics-400. Codes will be available.

* Technical Report
