Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gang Yang

YUNet: Improved YOLOv11 Network for Skyline Detection

Feb 18, 2025

Gang Yang, Miao Wang, Quan Zhou, Jiangchuan Li

Abstract:Skyline detection plays an important role in geolocalizaion, flight control, visual navigation, port security, etc. The appearance of the sky and non-sky areas are variable, because of different weather or illumination environment, which brings challenges to skyline detection. In this research, we proposed the YUNet algorithm, which improved the YOLOv11 architecture to segment the sky region and extract the skyline in complicated and variable circumstances. To improve the ability of multi-scale and large range contextual feature fusion, the YOLOv11 architecture is extended as an UNet-like architecture, consisting of an encoder, neck and decoder submodule. The encoder extracts the multi-scale features from the given images. The neck makes fusion of these multi-scale features. The decoder applies the fused features to complete the prediction rebuilding. To validate the proposed approach, the YUNet was tested on Skyfinder and CH1 datasets for segmentation and skyline detection respectively. Our test shows that the IoU of YUnet segmentation can reach 0.9858, and the average error of YUnet skyline detection is just 1.36 pixels. The implementation is published at https://github.com/kuazhangxiaoai/SkylineDet-YOLOv11Seg.git.

Via

Access Paper or Ask Questions

Advancing Single- and Multi-task Text Classification through Large Language Model Fine-tuning

Dec 11, 2024

Hang Zhao, Qile P. Chen, Yijing Barry Zhang, Gang Yang

Abstract:Both encoder-only models (e.g., BERT, RoBERTa) and large language models (LLMs, e.g., Llama3) have been widely used for text classification tasks. However, there is a lack of systematic studies comparing the performance of encoder-based models and LLMs in text classification, particularly when fine-tuning is involved. This study employed a diverse range of models and methods, varying in size and architecture, and including both fine-tuned and pre-trained approaches. We first assessed the performances of these LLMs on the 20 Newsgroups (20NG) and MASSIVE datasets, comparing them to encoder-only RoBERTa models. Additionally, we explored the multi-task capabilities of both model types by combining multiple classification tasks, including intent detection and slot-filling, into a single model using data from both datasets. Our results indicate that fully fine-tuned Llama3-70B models outperform RoBERTa-large and other decoder LLMs across various classification tasks and datasets. Moreover, the consolidated multi-task fine-tuned LLMs matched the performance of dual-model setups in both tasks across both datasets. Overall, our study provides a comprehensive benchmark of encoder-only and LLM models on text classification tasks and demonstrates a method to combine two or more fully fine-tuned decoder LLMs for reduced latency and equivalent performance.

* 9 pages, 3 tables

Via

Access Paper or Ask Questions

Stable Object Placement Under Geometric Uncertainty via Differentiable Contact Dynamics

Sep 26, 2024

Linfeng Li, Gang Yang, Lin Shao, David Hsu

Abstract:From serving a cup of coffee to carefully rearranging delicate items, stable object placement is a crucial skill for future robots. This skill is challenging due to the required accuracy, which is difficult to achieve under geometric uncertainty. We leverage differentiable contact dynamics to develop a principled method for stable object placement under geometric uncertainty. We estimate the geometric uncertainty by minimizing the discrepancy between the force-torque sensor readings and the model predictions through gradient descent. We further keep track of a belief over multiple possible geometric parameters to mitigate the gradient-based method's sensitivity to the initialization. We verify our approach in the real world on various geometric uncertainties, including the in-hand pose uncertainty of the grasped object, the object's shape uncertainty, and the environment's shape uncertainty.

Via

Access Paper or Ask Questions

Low-Complexity Joint Azimuth-Range-Velocity Estimation for Integrated Sensing and Communication with OFDM Waveform

May 15, 2024

Jun Zhang, Gang Yang, Qibin Ye, Yixuan Huang, Su Hu

Abstract:Integrated sensing and communication (ISAC) is a main application scenario of the sixth-generation mobile communication systems. Due to the fast-growing number of antennas and subcarriers in cellular systems, the computational complexity of joint azimuth-range-velocity estimation (JARVE) in ISAC systems is extremely high. This paper studies the JARVE problem for a monostatic ISAC system with orthogonal frequency division multiplexing (OFDM) waveform, in which a base station receives the echos of its transmitted cellular OFDM signals to sense multiple targets. The Cramer-Rao bounds are first derived for JARVE. A low-complexity algorithm is further designed for super-resolution JARVE, which utilizes the proposed iterative subspace update scheme and Levenberg-Marquardt optimization method to replace the exhaustive search of spatial spectrum in multiple-signal-classification (MUSIC) algorithm. Finally, with the practical parameters of 5G New Radio, simulation results verify that the proposed algorithm can reduce the computational complexity by three orders of magnitude and two orders of magnitude compared to the existing three-dimensional MUSIC algorithm and estimation-of-signal-parameters-using-rotational-invariance-techniques (ESPRIT) algorithm, respectively, and also improve the estimation performance.

* 16 pages, 12 figures, submitted to IEEE journal

Via

Access Paper or Ask Questions

ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots

May 11, 2024

Zhixuan Xu, Chongkai Gao, Zixuan Liu, Gang Yang, Chenrui Tie, Haozhuo Zheng, Haoyu Zhou, Weikun Peng, Debang Wang, Tianyi Chen(+2 more)

Figure 1 for ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots

Figure 2 for ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots

Figure 3 for ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots

Figure 4 for ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots

Abstract:To substantially enhance robot intelligence, there is a pressing need to develop a large model that enables general-purpose robots to proficiently undertake a broad spectrum of manipulation tasks, akin to the versatile task-planning ability exhibited by LLMs. The vast diversity in objects, robots, and manipulation tasks presents huge challenges. Our work introduces a comprehensive framework to develop a foundation model for general robotic manipulation that formalizes a manipulation task as contact synthesis. Specifically, our model takes as input object and robot manipulator point clouds, object physical attributes, target motions, and manipulation region masks. It outputs contact points on the object and associated contact forces or post-contact motions for robots to achieve the desired manipulation task. We perform extensive experiments both in the simulation and real-world settings, manipulating articulated rigid objects, rigid objects, and deformable objects that vary in dimensionality, ranging from one-dimensional objects like ropes to two-dimensional objects like cloth and extending to three-dimensional objects such as plasticine. Our model achieves average success rates of around 90\%. Supplementary materials and videos are available on our project website at https://manifoundationmodel.github.io/.

Via

Access Paper or Ask Questions

Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation

Mar 02, 2024

Fanzhe Yan, Gang Yang, Yu Li, Aiping Liu, Xun Chen

Figure 1 for Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation

Figure 2 for Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation

Figure 3 for Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation

Figure 4 for Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation

Abstract:Deep learning techniques have demonstrated great potential for accurately estimating brain age by analyzing Magnetic Resonance Imaging (MRI) data from healthy individuals. However, current methods for brain age estimation often directly utilize whole input images, overlooking two important considerations: 1) the heterogeneous nature of brain aging, where different brain regions may degenerate at different rates, and 2) the existence of age-independent redundancies in brain structure. To overcome these limitations, we propose a Dual Graph Attention based Disentanglement Multi-instance Learning (DGA-DMIL) framework for improving brain age estimation. Specifically, the 3D MRI data, treated as a bag of instances, is fed into a 2D convolutional neural network backbone, to capture the unique aging patterns in MRI. A dual graph attention aggregator is then proposed to learn the backbone features by exploiting the intra- and inter-instance relationships. Furthermore, a disentanglement branch is introduced to separate age-related features from age-independent structural representations to ameliorate the interference of redundant information on age prediction. To verify the effectiveness of the proposed framework, we evaluate it on two datasets, UK Biobank and ADNI, containing a total of 35,388 healthy individuals. Our proposed model demonstrates exceptional accuracy in estimating brain age, achieving a remarkable mean absolute error of 2.12 years in the UK Biobank. The results establish our approach as state-of-the-art compared to other competing brain age estimation models. In addition, the instance contribution scores identify the varied importance of brain areas for aging prediction, which provides deeper insights into the understanding of brain aging.

* 12 pages, 9 figures

Via

Access Paper or Ask Questions

RIS Empowered Near-Field Covert Communications

Jan 24, 2024

Jun Liu, Gang Yang, Yuanwei Liu, Xiangyun Zhou

Figure 1 for RIS Empowered Near-Field Covert Communications

Figure 2 for RIS Empowered Near-Field Covert Communications

Figure 3 for RIS Empowered Near-Field Covert Communications

Figure 4 for RIS Empowered Near-Field Covert Communications

Abstract:This paper studies an extremely large-scale reconfigurable intelligent surface (XL-RIS) empowered covert communication system in the near-field region. Alice covertly transmits messages to Bob with the assistance of the XL-RIS, while evading detection by Willie. To enhance the covert communication performance, we maximize the achievable covert rate by jointly optimizing the hybrid analog and digital beamformers at Alice, as well as the reflection coefficient matrix at the XL-RIS. An alternating optimization algorithm is proposed to solve the joint beamforming design problem. For the hybrid beamformer design, a semi-closed-form solution for fully digital beamformer is first obtained by a weighted minimum mean-square error based algorithm, then the baseband digital and analog beamformers at Alice are designed by approximating the fully digital beamformer via manifold optimization. For the XL-RIS's reflection coefficient matrix design, a low-complexity alternating direction method of multipliers based algorithm is proposed to address the challenge of large-scale variables and unit-modulus constraints. Numerical results unveil that i) the near-field communications can achieve a higher covert rate than the far-field covert communications in general, and still realize covert transmission even if Willie is located at the same direction as Bob and closer to the XL-RIS; ii) the proposed algorithm can enhance the covert rate significantly compared to the benchmark schemes; iii) the proposed algorithm leads to a beam diffraction pattern that can bypass Willie and achieve high-rate covert transmission to Bob.

* 15 pages, 8 figures, submitted to IEEE journal

Via

Access Paper or Ask Questions

Joint Range-Velocity-Azimuth Estimation for OFDM-Based Integrated Sensing and Communication

Dec 20, 2023

Zelin Hu, Qibin Ye, Yixuan Huang, Su Hu, Gang Yang

Figure 1 for Joint Range-Velocity-Azimuth Estimation for OFDM-Based Integrated Sensing and Communication

Figure 2 for Joint Range-Velocity-Azimuth Estimation for OFDM-Based Integrated Sensing and Communication

Figure 3 for Joint Range-Velocity-Azimuth Estimation for OFDM-Based Integrated Sensing and Communication

Figure 4 for Joint Range-Velocity-Azimuth Estimation for OFDM-Based Integrated Sensing and Communication

Abstract:Orthogonal frequency division multiplexing (OFDM)-based integrated sensing and communication (ISAC) is promising for future sixth-generation mobile communication systems. Existing works focus on the joint estimation of the targets' range and velocity for OFDM-based ISAC systems. In contrast, this paper studies the three-dimensional joint estimation (3DJE) of range, velocity, and azimuth for OFDM-based ISAC systems with multiple receive antennas. First, we establish the signal model and derive the Cramer-Rao bounds (CRBs) on the 3DJE. Furthermore, an auto-paired super-resolution 3DJE algorithm is proposed by exploiting the reconstructed observation sub-signal's translational invariance property in the time, frequency, and space domains. Finally, with the 5G New Radio parameter setup, simulation results show that the proposed algorithm achieves better estimation performance and its root mean square error is closer to the root of CRBs than existing methods.

* This manuscript has been submitted to the IEEE journal in 09-Aug-2023

Via

Access Paper or Ask Questions

SoftMAC: Differentiable Soft Body Simulation with Forecast-based Contact Model and Two-way Coupling with Articulated Rigid Bodies and Clothes

Dec 06, 2023

Min Liu, Gang Yang, Siyuan Luo, Chen Yu, Lin Shao

Figure 1 for SoftMAC: Differentiable Soft Body Simulation with Forecast-based Contact Model and Two-way Coupling with Articulated Rigid Bodies and Clothes

Figure 2 for SoftMAC: Differentiable Soft Body Simulation with Forecast-based Contact Model and Two-way Coupling with Articulated Rigid Bodies and Clothes

Figure 3 for SoftMAC: Differentiable Soft Body Simulation with Forecast-based Contact Model and Two-way Coupling with Articulated Rigid Bodies and Clothes

Figure 4 for SoftMAC: Differentiable Soft Body Simulation with Forecast-based Contact Model and Two-way Coupling with Articulated Rigid Bodies and Clothes

Abstract:Differentiable physics simulation provides an avenue for tackling previously intractable challenges through gradient-based optimization, thereby greatly improving the efficiency of solving robotics-related problems. To apply differentiable simulation in diverse robotic manipulation scenarios, a key challenge is to integrate various materials in a unified framework. We present SoftMAC, a differentiable simulation framework coupling soft bodies with articulated rigid bodies and clothes. SoftMAC simulates soft bodies with the continuum-mechanics-based Material Point Method (MPM). We provide a forecast-based contact model for MPM, which greatly reduces artifacts like penetration and unnatural rebound. To couple MPM particles with deformable and non-volumetric clothes meshes, we also propose a penetration tracing algorithm that reconstructs the signed distance field in local area. Based on simulators for each modality and the contact model, we develop a differentiable coupling mechanism to simulate the interactions between soft bodies and the other two types of materials. Comprehensive experiments are conducted to validate the effectiveness and accuracy of the proposed differentiable pipeline in downstream robotic manipulation applications. Supplementary materials and videos are available on our project website at https://sites.google.com/view/softmac.

Via

Access Paper or Ask Questions

Jade: A Differentiable Physics Engine for Articulated Rigid Bodies with Intersection-Free Frictional Contact

Sep 09, 2023

Gang Yang, Siyuan Luo, Lin Shao

Abstract:We present Jade, a differentiable physics engine for articulated rigid bodies. Jade models contacts as the Linear Complementarity Problem (LCP). Compared to existing differentiable simulations, Jade offers features including intersection-free collision simulation and stable LCP solutions for multiple frictional contacts. We use continuous collision detection to detect the time of impact and adopt the backtracking strategy to prevent intersection between bodies with complex geometry shapes. We derive the gradient calculation to ensure the whole simulation process is differentiable under the backtracking mechanism. We modify the popular Dantzig algorithm to get valid solutions under multiple frictional contacts. We conduct extensive experiments to demonstrate the effectiveness of our differentiable physics simulation over a variety of contact-rich tasks.

Via

Access Paper or Ask Questions