Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Guohui Zhang

DEMO: A Dynamics-Enhanced Learning Model for Multi-Horizon Trajectory Prediction in Autonomous Vehicles

Dec 30, 2024

Chengyue Wang, Haicheng Liao, Kaiqun Zhu, Guohui Zhang, Zhenning Li

Abstract:Autonomous vehicles (AVs) rely on accurate trajectory prediction of surrounding vehicles to ensure the safety of both passengers and other road users. Trajectory prediction spans both short-term and long-term horizons, each requiring distinct considerations: short-term predictions rely on accurately capturing the vehicle's dynamics, while long-term predictions rely on accurately modeling the interaction patterns within the environment. However current approaches, either physics-based or learning-based models, always ignore these distinct considerations, making them struggle to find the optimal prediction for both short-term and long-term horizon. In this paper, we introduce the Dynamics-Enhanced Learning MOdel (DEMO), a novel approach that combines a physics-based Vehicle Dynamics Model with advanced deep learning algorithms. DEMO employs a two-stage architecture, featuring a Dynamics Learning Stage and an Interaction Learning Stage, where the former stage focuses on capturing vehicle motion dynamics and the latter focuses on modeling interaction. By capitalizing on the respective strengths of both methods, DEMO facilitates multi-horizon predictions for future trajectories. Experimental results on the Next Generation Simulation (NGSIM), Macau Connected Autonomous Driving (MoCAD), Highway Drone (HighD), and nuScenes datasets demonstrate that DEMO outperforms state-of-the-art (SOTA) baselines in both short-term and long-term prediction horizons.

* Accepted by Information Fusion

Via

Access Paper or Ask Questions

Real-time Accident Anticipation for Autonomous Driving Through Monocular Depth-Enhanced 3D Modeling

Sep 02, 2024

Haicheng Liao, Yongkang Li, Chengyue Wang, Songning Lai, Zhenning Li, Zilin Bian, Jaeyoung Lee, Zhiyong Cui, Guohui Zhang, Chengzhong Xu

Figure 1 for Real-time Accident Anticipation for Autonomous Driving Through Monocular Depth-Enhanced 3D Modeling

Figure 2 for Real-time Accident Anticipation for Autonomous Driving Through Monocular Depth-Enhanced 3D Modeling

Figure 3 for Real-time Accident Anticipation for Autonomous Driving Through Monocular Depth-Enhanced 3D Modeling

Figure 4 for Real-time Accident Anticipation for Autonomous Driving Through Monocular Depth-Enhanced 3D Modeling

Abstract:The primary goal of traffic accident anticipation is to foresee potential accidents in real time using dashcam videos, a task that is pivotal for enhancing the safety and reliability of autonomous driving technologies. In this study, we introduce an innovative framework, AccNet, which significantly advances the prediction capabilities beyond the current state-of-the-art (SOTA) 2D-based methods by incorporating monocular depth cues for sophisticated 3D scene modeling. Addressing the prevalent challenge of skewed data distribution in traffic accident datasets, we propose the Binary Adaptive Loss for Early Anticipation (BA-LEA). This novel loss function, together with a multi-task learning strategy, shifts the focus of the predictive model towards the critical moments preceding an accident. {We rigorously evaluate the performance of our framework on three benchmark datasets--Dashcam Accident Dataset (DAD), Car Crash Dataset (CCD), and AnAn Accident Detection (A3D), and DADA-2000 Dataset--demonstrating its superior predictive accuracy through key metrics such as Average Precision (AP) and mean Time-To-Accident (mTTA).

Via

Access Paper or Ask Questions

World Models for Autonomous Driving: An Initial Survey

Mar 05, 2024

Yanchen Guan, Haicheng Liao, Zhenning Li, Guohui Zhang, Chengzhong Xu

Abstract:In the rapidly evolving landscape of autonomous driving, the capability to accurately predict future events and assess their implications is paramount for both safety and efficiency, critically aiding the decision-making process. World models have emerged as a transformative approach, enabling autonomous driving systems to synthesize and interpret vast amounts of sensor data, thereby predicting potential future scenarios and compensating for information gaps. This paper provides an initial review of the current state and prospective advancements of world models in autonomous driving, spanning their theoretical underpinnings, practical applications, and the ongoing research efforts aimed at overcoming existing limitations. Highlighting the significant role of world models in advancing autonomous driving technologies, this survey aspires to serve as a foundational reference for the research community, facilitating swift access to and comprehension of this burgeoning field, and inspiring continued innovation and exploration.

Via

Access Paper or Ask Questions

A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization

Jul 13, 2021

Zhenning Li, Chengzhong Xu, Guohui Zhang

Figure 1 for A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization

Figure 2 for A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization

Figure 3 for A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization

Figure 4 for A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization

Abstract:Inefficient traffic signal control methods may cause numerous problems, such as traffic congestion and waste of energy. Reinforcement learning (RL) is a trending data-driven approach for adaptive traffic signal control in complex urban traffic networks. Although the development of deep neural networks (DNN) further enhances its learning capability, there are still some challenges in applying deep RLs to transportation networks with multiple signalized intersections, including non-stationarity environment, exploration-exploitation dilemma, multi-agent training schemes, continuous action spaces, etc. In order to address these issues, this paper first proposes a multi-agent deep deterministic policy gradient (MADDPG) method by extending the actor-critic policy gradient algorithms. MADDPG has a centralized learning and decentralized execution paradigm in which critics use additional information to streamline the training process, while actors act on their own local observations. The model is evaluated via simulation on the Simulation of Urban MObility (SUMO) platform. Model comparison results show the efficiency of the proposed algorithm in controlling traffic lights.

Via

Access Paper or Ask Questions

Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

Apr 20, 2021

Zhenning Li, Hao Yu, Guohui Zhang, Shangjia Dong, Cheng-Zhong Xu

Figure 1 for Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

Figure 2 for Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

Figure 3 for Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

Figure 4 for Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

Abstract:Inefficient traffic control may cause numerous problems such as traffic congestion and energy waste. This paper proposes a novel multi-agent reinforcement learning method, named KS-DDPG (Knowledge Sharing Deep Deterministic Policy Gradient) to achieve optimal control by enhancing the cooperation between traffic signals. By introducing the knowledge-sharing enabled communication protocol, each agent can access to the collective representation of the traffic environment collected by all agents. The proposed method is evaluated through two experiments respectively using synthetic and real-world datasets. The comparison with state-of-the-art reinforcement learning-based and conventional transportation methods demonstrate the proposed KS-DDPG has significant efficiency in controlling large-scale transportation networks and coping with fluctuations in traffic flow. In addition, the introduced communication mechanism has also been proven to speed up the convergence of the model without significantly increasing the computational burden.

* Transportation Research Part C: Emerging Technologies Volume 125, April 2021, 103059

Via

Access Paper or Ask Questions

Cascade Feature Aggregation for Human Pose Estimation

Apr 08, 2019

Zhihui Su, Ming Ye, Guohui Zhang, Lei Dai, Jianda Sheng

Figure 1 for Cascade Feature Aggregation for Human Pose Estimation

Figure 2 for Cascade Feature Aggregation for Human Pose Estimation

Figure 3 for Cascade Feature Aggregation for Human Pose Estimation

Figure 4 for Cascade Feature Aggregation for Human Pose Estimation

Abstract:Human pose estimation plays an important role in many computer vision tasks and has been studied for many decades. However, due to complex appearance variations from poses, illuminations, occlusions and low resolutions, it still remains a challenging problem. Taking the advantage of high-level semantic information from deep convolutional neural networks is an effective way to improve the accuracy of human pose estimation. In this paper, we propose a novel Cascade Feature Aggregation (CFA) method, which cascades several hourglass networks for robust human pose estimation. Features from different stages are aggregated to obtain abundant contextual information, leading to robustness to poses, partial occlusions and low resolution. Moreover, results from different stages are fused to further improve the localization accuracy. The extensive experiments on MPII datasets and LIP datasets demonstrate that our proposed CFA outperforms the state-of-the-art and achieves the best performance on the state-of-the-art benchmark MPII.

Via

Access Paper or Ask Questions

Cross-domain attribute representation based on convolutional neural network

May 17, 2018

Guohui Zhang, Gaoyuan Liang, Fang Su, Fanxin Qu, Jing-Yan Wang

Figure 1 for Cross-domain attribute representation based on convolutional neural network

Abstract:In the problem of domain transfer learning, we learn a model for the predic-tion in a target domain from the data of both some source domains and the target domain, where the target domain is in lack of labels while the source domain has sufficient labels. Besides the instances of the data, recently the attributes of data shared across domains are also explored and proven to be very helpful to leverage the information of different domains. In this paper, we propose a novel learning framework for domain-transfer learning based on both instances and attributes. We proposed to embed the attributes of dif-ferent domains by a shared convolutional neural network (CNN), learn a domain-independent CNN model to represent the information shared by dif-ferent domains by matching across domains, and a domain-specific CNN model to represent the information of each domain. The concatenation of the three CNN model outputs is used to predict the class label. An iterative algo-rithm based on gradient descent method is developed to learn the parameters of the model. The experiments over benchmark datasets show the advantage of the proposed model.

* arXiv admin note: substantial text overlap with arXiv:1803.09733

Via

Access Paper or Ask Questions

A novel image tag completion method based on convolutional neural network

Jun 03, 2017

Yanyan Geng, Guohui Zhang, Weizhi Li, Yi Gu, Ru-Ze Liang, Gaoyuan Liang, Jingbin Wang, Yanbin Wu, Nitin Patil, Jing-Yan Wang

Figure 1 for A novel image tag completion method based on convolutional neural network

Figure 2 for A novel image tag completion method based on convolutional neural network

Figure 3 for A novel image tag completion method based on convolutional neural network

Abstract:In the problems of image retrieval and annotation, complete textual tag lists of images play critical roles. However, in real-world applications, the image tags are usually incomplete, thus it is important to learn the complete tags for images. In this paper, we study the problem of image tag complete and proposed a novel method for this problem based on a popular image representation method, convolutional neural network (CNN). The method estimates the complete tags from the convolutional filtering outputs of images based on a linear predictor. The CNN parameters, linear predictor, and the complete tags are learned jointly by our method. We build a minimization problem to encourage the consistency between the complete tags and the available incomplete tags, reduce the estimation error, and reduce the model complexity. An iterative algorithm is developed to solve the minimization problem. Experiments over benchmark image data sets show its effectiveness.

Via

Access Paper or Ask Questions