Jiahui Xu

Leveraging Temporal Contexts to Enhance Vehicle-Infrastructure Cooperative Perception

Aug 20, 2024

Jiaru Zhong, Haibao Yu, Tianyi Zhu, Jiahui Xu, Wenxian Yang, Zaiqing Nie, Chao Sun

Abstract: Infrastructure sensors installed at elevated positions offer a broader perception range and encounter fewer occlusions. Integrating both infrastructure and ego-vehicle data through V2X communication, known as vehicle-infrastructure cooperation, has shown considerable advantages in enhancing perception capabilities and addressing corner cases encountered in single-vehicle autonomous driving. However, cooperative perception still faces numerous challenges, including limited communication bandwidth and practical communication interruptions. In this paper, we propose CTCE, a novel framework for cooperative 3D object detection. This framework transmits queries enhanced with temporal contexts, effectively balancing transmission efficiency and performance to accommodate real-world communication conditions. Additionally, we propose a temporal-guided fusion module to further improve performance. The roadside temporal enhancement and vehicle-side spatial-temporal fusion together constitute a multi-level temporal context integration mechanism, fully leveraging temporal information to enhance performance. Furthermore, a motion-aware reconstruction module is introduced to recover roadside queries lost to communication interruptions. Experimental results on the V2X-Seq and V2X-Sim datasets demonstrate that CTCE outperforms the baseline QUEST, achieving improvements of 3.8% and 1.3% in mAP, respectively. Experiments under communication interruption conditions validate CTCE's robustness.

* Accepted by IEEE ITSC 2024

Unsupervised Mutual Learning of Dialogue Discourse Parsing and Topic Segmentation

Jun 03, 2024

Jiahui Xu, Feng Jiang, Anningzhe Gao, Haizhou Li

Abstract: The advancement of large language models (LLMs) has propelled the development of dialogue systems. Unlike the popular ChatGPT-like assistant model, which only satisfies the user's preferences, task-oriented dialogue systems have also faced new requirements and challenges in the broader business field. They are expected to provide correct responses at each dialogue turn and, at the same time, achieve the overall goal defined by the task. By understanding rhetorical structures and topic structures via topic segmentation and discourse parsing, a dialogue system may plan better to achieve both objectives. However, while both structures belong to discourse structure in linguistics, rhetorical structure and topic structure are mostly modeled separately or with one assisting the other in prior work. The interaction between these two structures has not been considered for joint modeling and mutual learning. Furthermore, unsupervised learning techniques to achieve the above are not well explored. To fill this gap, we propose an unsupervised mutual learning framework for the two structures that leverages the global and local connections between them. We extend the topic modeling between non-adjacent discourse units to ensure global structural relevance with rhetorical structures. We also incorporate rhetorical structures into the topic structure through a graph neural network model to ensure local coherence consistency. Finally, we utilize the similarity between the two fused structures for mutual learning. The experimental results demonstrate that our methods outperform all strong baselines on two dialogue rhetorical datasets (STAC and Molweni), as well as dialogue topic datasets (Doc2Dial and TIAGE). We provide our code at https://github.com/Jeff-Sue/URT.

Uncertainty-Aware Prediction and Application in Planning for Autonomous Driving: Definitions, Methods, and Comparison

Mar 04, 2024

Wenbo Shao, Jiahui Xu, Zhong Cao, Hong Wang, Jun Li

Abstract: Autonomous driving systems face the formidable challenge of navigating intricate and dynamic environments with uncertainty. This study presents a unified prediction and planning framework that concurrently models short-term aleatoric uncertainty (SAU), long-term aleatoric uncertainty (LAU), and epistemic uncertainty (EU) to predict and establish a robust foundation for planning in dynamic contexts. The framework uses Gaussian mixture models and deep ensemble methods to concurrently capture and assess SAU, LAU, and EU, whereas traditional methods do not integrate these uncertainties simultaneously. Additionally, uncertainty-aware planning is introduced, considering various uncertainties. The study's contributions include comparisons of uncertainty estimation, risk modeling, and planning methods against existing approaches. The proposed methods were rigorously evaluated using the CommonRoad benchmark and settings with limited perception. These experiments illuminated the advantages and roles of different uncertainty factors in autonomous driving processes. In addition, comparative assessments of various uncertainty modeling strategies underscore the benefits of modeling multiple types of uncertainties, thus enhancing planning accuracy and reliability. The proposed framework facilitates the development of methods for UAP and surpasses existing uncertainty-aware risk models, particularly when considering diverse traffic scenarios. Project page: https://swb19.github.io/UAP/.

* 14 pages, 7 figures
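
As a rough illustration of how a deep ensemble can separate the two broad kinds of uncertainty named in the abstract above, the sketch below applies the law of total variance to an equally weighted ensemble of Gaussian predictions. The average member variance is read as aleatoric uncertainty and the spread of the member means as epistemic uncertainty. The function name `decompose_uncertainty` and its inputs are hypothetical; the sketch does not separate short-term from long-term aleatoric uncertainty and is not the paper's implementation.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split the predictive variance of an equally weighted Gaussian ensemble.

    member_means[i] and member_variances[i] are the mean and variance that
    ensemble member i predicts for one scalar quantity (hypothetical inputs).
    Returns (aleatoric, epistemic, total).
    """
    if not member_means or len(member_means) != len(member_variances):
        raise ValueError("need one mean and one variance per ensemble member")
    # Aleatoric part: the noise each member expects, averaged over members.
    aleatoric = fmean(member_variances)
    # Epistemic part: disagreement between members about the mean.
    epistemic = pvariance(member_means)
    # Law of total variance for an equally weighted mixture.
    return aleatoric, epistemic, aleatoric + epistemic
```

For example, `decompose_uncertainty([1.0, 3.0], [0.5, 1.5])` reports an aleatoric variance of 1.0 (the mean of 0.5 and 1.5) and an epistemic variance of 1.0 (the population variance of the two means).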

Towards Safe and Reliable Autonomous Driving: Dynamic Occupancy Set Prediction

Feb 29, 2024

Wenbo Shao, Jiahui Xu, Wenhao Yu, Jun Li, Hong Wang

Abstract: In the rapidly evolving field of autonomous driving, accurate trajectory prediction is pivotal for vehicular safety. However, trajectory predictions often deviate from actual paths, particularly in complex and challenging environments, leading to significant errors. To address this issue, our study introduces a novel method for Dynamic Occupancy Set (DOS) prediction, enhancing trajectory prediction capabilities. This method effectively combines advanced trajectory prediction networks with a DOS prediction module, overcoming the shortcomings of existing models. It provides a comprehensive and adaptable framework for predicting the potential occupancy sets of traffic participants. The main contributions of this research include: 1) A novel DOS prediction model tailored for complex scenarios, augmenting traditional trajectory prediction; 2) The development of unique DOS representations and evaluation metrics; 3) Extensive validation through experiments, demonstrating enhanced performance and adaptability. This research contributes to the advancement of safer and more efficient intelligent vehicle and transportation systems.

* 9 pages, 5 figures

Pose2Gaze: Generating Realistic Human Gaze Behaviour from Full-body Poses using an Eye-body Coordination Model

Dec 19, 2023

Zhiming Hu, Jiahui Xu, Syn Schmitt, Andreas Bulling

Abstract: While generating realistic body movements, e.g., for avatars in virtual reality, is widely studied in computer vision and graphics, the generation of eye movements that exhibit realistic coordination with the body remains under-explored. We first report a comprehensive analysis of the coordination of human eye and full-body movements during everyday activities based on data from the MoGaze and GIMO datasets. We show that eye gaze correlates strongly with head directions and with full-body motions, and that there is a noticeable time delay between body and eye movements. Inspired by the analyses, we then present Pose2Gaze -- a novel eye-body coordination model that first uses a convolutional neural network and a spatio-temporal graph convolutional neural network to extract features from head directions and full-body poses, respectively, and then applies a convolutional neural network to generate realistic eye movements. We compare our method with state-of-the-art methods that predict eye gaze only from head movements for three different generation tasks and demonstrate that Pose2Gaze significantly outperforms these baselines on both datasets with an average improvement of 26.4% and 21.6% in mean angular error, respectively. Our findings underline the significant potential of cross-modal human gaze behaviour analysis and modelling.

Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation

May 23, 2023

Danqing Luo, Chen Zhang, Jiahui Xu, Bin Wang, Yiming Chen, Yan Zhang, Haizhou Li

Abstract: Training or finetuning large-scale language models (LLMs) such as GPT-3 requires substantial computational resources, motivating recent efforts to explore parameter-efficient adaptation to downstream tasks. One practical area of research is to treat these models as black boxes and interact with them through their inference APIs. In this paper, we investigate how to optimize few-shot text classification without accessing the gradients of the LLMs. To achieve this, we treat the black-box model as a feature extractor and train a classifier with the augmented text data. Data augmentation is performed using prompt-based finetuning on an auxiliary language model with a much smaller parameter size than the black-box model. Through extensive experiments on eight text classification datasets, we show that our approach, dubbed BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners and performs on par with methods that rely on full-model tuning.

Group-Aware Graph Neural Network for Nationwide City Air Quality Forecasting

Aug 27, 2021

Ling Chen, Jiahui Xu, Binqing Wu, Yuntao Qian, Zhenhong Du, Yansheng Li, Yongjun Zhang

Abstract: The problem of air pollution threatens public health. Air quality forecasting can provide the air quality index hours or even days ahead, which can help the public prepare for air pollution in advance. Previous works focus on citywide air quality forecasting and cannot solve the nationwide city forecasting problem, whose difficulties lie in capturing the latent dependencies between geographically distant but highly correlated cities. In this paper, we propose the group-aware graph neural network (GAGNN), a hierarchical model for nationwide city air quality forecasting. The model constructs a city graph and a city group graph to model the spatial and latent dependencies between cities, respectively. GAGNN introduces a differentiable grouping network to discover the latent dependencies among cities and generate city groups. Based on the generated city groups, a group correlation encoding module is introduced to learn the correlations between them, which can effectively capture the dependencies between city groups. After the graph construction, GAGNN implements a message passing mechanism to model the dependencies between cities and city groups. The evaluation experiments on a Chinese city air quality dataset indicate that our GAGNN outperforms existing forecasting models.

HighAir: A Hierarchical Graph Neural Network-Based Air Quality Forecasting Method

Jan 12, 2021

Jiahui Xu, Ling Chen, Mingqi Lv, Chaoqun Zhan, Sanjian Chen, Jian Chang

Abstract: Accurately forecasting air quality is critical to protecting the general public from lung and heart diseases. This is a challenging task due to the complicated interactions among distinct pollution sources and various other influencing factors. Existing air quality forecasting methods cannot effectively model the diffusion processes of air pollutants between cities and monitoring stations, which may suddenly deteriorate the air quality of a region. In this paper, we propose HighAir, i.e., a hierarchical graph neural network-based air quality forecasting method, which adopts an encoder-decoder architecture and considers complex air quality influencing factors, e.g., weather and land usage. Specifically, we construct a city-level graph and station-level graphs from a hierarchical perspective, which can consider city-level and station-level patterns, respectively. We design two strategies, i.e., upper delivery and lower updating, to implement the inter-level interactions, and introduce a message passing mechanism to implement the intra-level interactions. We dynamically adjust edge weights based on wind direction to model the correlations between dynamic factors and air quality. We compare HighAir with the state-of-the-art air quality forecasting methods on a dataset of the Yangtze River Delta city group, which covers 10 major cities within 61,500 km². The experimental results show that HighAir significantly outperforms other methods.

Recyclable Waste Identification Using CNN Image Recognition and Gaussian Clustering

Nov 02, 2020

Yuheng Wang, Wen Jie Zhao, Jiahui Xu, Raymond Hong

Abstract: Waste recycling is an important way of saving energy and materials in the production process. In general, recyclable objects are mixed with unrecyclable objects, which raises a need for identification and classification. This paper proposes a convolutional neural network (CNN) model to complete both tasks. The model uses transfer learning from a pretrained ResNet-50 CNN to complete feature extraction. A subsequent fully connected layer for classification was trained on the augmented TrashNet dataset [1]. In the application, a sliding window is used for image segmentation in the pre-classification stage. In the post-classification stage, the labelled sample points are integrated with Gaussian Clustering to locate the object. The resulting model achieves an overall detection rate of 48.4% in simulation and a final classification accuracy of 92.4%.
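
The sliding-window step mentioned in the abstract above can be pictured with a minimal sketch that enumerates square crops over an image grid; each crop would then be passed to the classifier. The function name `sliding_windows`, the square window, and the fixed stride are assumptions made for illustration, and the abstract does not state the window size or stride the authors used.

```python
def sliding_windows(width, height, window, stride):
    """Yield (left, top, right, bottom) boxes of a square window over an image.

    All arguments are pixel counts (hypothetical parameters). Windows that
    would run past the right or bottom edge are skipped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # Scan row by row, moving the window by `stride` pixels each step.
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield (left, top, left + window, top + window)
```

A smaller stride gives overlapping windows and therefore more labelled sample points for a later clustering step, at the cost of more classifier calls.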

Uncovering Sociological Effect Heterogeneity using Machine Learning

Sep 18, 2019

Jennie E. Brand, Jiahui Xu, Bernard Koch, Pablo Geraldo

Abstract: Individuals do not respond uniformly to treatments, events, or interventions. Sociologists routinely partition samples into subgroups to explore how the effects of treatments vary by covariates like race, gender, and socioeconomic status. In so doing, analysts determine the key subpopulations based on theoretical priors. Data-driven discoveries are also routine, yet the analyses by which sociologists typically go about them are problematic and seldom move us beyond our expectations, and biases, to explore new meaningful subgroups. Emerging machine learning methods allow researchers to explore sources of variation that they may not have previously considered, or envisaged. In this paper, we use causal trees to recursively partition the sample and uncover sources of treatment effect heterogeneity. We use honest estimation, splitting the sample into a training sample to grow the tree and an estimation sample to estimate leaf-specific effects. Assessing a central topic in the social inequality literature, college effects on wages, we compare what we learn from conventional approaches for exploring variation in effects to causal trees. Given our use of observational data, we use leaf-specific matching and sensitivity analyses to address confounding and offer interpretations of effects based on observed and unobserved heterogeneity. We encourage researchers to follow similar practices in their work on variation in sociological effects.
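
The honest estimation idea described in the abstract above can be sketched with a one-split "tree": the split point is chosen on a training sample only, and the leaf-specific effects are then estimated on a separate estimation sample. Everything here is a simplification for illustration: the names `choose_split` and `honest_leaf_effects` are hypothetical, the splitting rule (largest squared gap between leaf effects) stands in for a real causal tree criterion, and the difference in means assumes randomized treatment rather than the leaf-specific matching the authors use for observational data.

```python
from statistics import fmean


def diff_in_means(rows):
    """Treated-minus-control mean outcome; None if either arm is empty.

    Each row is a tuple (x, treated, outcome) with treated in {0, 1}.
    """
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    if not treated or not control:
        return None
    return fmean(treated) - fmean(control)


def choose_split(train_rows):
    """Pick the threshold on x that maximizes the squared gap in leaf effects."""
    best_threshold, best_gap = None, -1.0
    for threshold in sorted({x for x, _, _ in train_rows}):
        left = diff_in_means([r for r in train_rows if r[0] <= threshold])
        right = diff_in_means([r for r in train_rows if r[0] > threshold])
        if left is None or right is None:
            continue  # a leaf without both arms cannot yield an effect
        gap = (left - right) ** 2
        if gap > best_gap:
            best_threshold, best_gap = threshold, gap
    return best_threshold


def honest_leaf_effects(train_rows, estimation_rows):
    """Grow the split on train_rows, estimate effects on estimation_rows."""
    threshold = choose_split(train_rows)
    if threshold is None:
        raise ValueError("no valid split found in the training sample")
    left = diff_in_means([r for r in estimation_rows if r[0] <= threshold])
    right = diff_in_means([r for r in estimation_rows if r[0] > threshold])
    return threshold, left, right
```

Because the estimation sample played no part in choosing the split, the leaf effects it yields are not biased by the search over thresholds, which is the point of keeping the two samples separate.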
