Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shucheng Huang

SAP-CoPE: Social-Aware Planning using Cooperative Pose Estimation with Infrastructure Sensor Nodes

Apr 08, 2025

Minghao Ning, Yufeng Yang, Shucheng Huang, Jiaming Zhong, Keqi Shu, Chen Sun, Ehsan Hashemi, Amir Khajepour

Abstract:Autonomous driving systems must operate safely in human-populated indoor environments, where challenges such as limited perception and occlusion sensitivity arise when relying solely on onboard sensors. These factors generate difficulties in the accurate recognition of human intentions and the generation of comfortable, socially aware trajectories. To address these issues, we propose SAP-CoPE, a social-aware planning framework that integrates cooperative infrastructure with a novel 3D human pose estimation method and a model predictive control-based controller. This real-time framework formulates an optimization problem that accounts for uncertainty propagation in the camera projection matrix while ensuring human joint coherence. The proposed method is adaptable to single- or multi-camera configurations and can incorporate sparse LiDAR point-cloud data. To enhance safety and comfort in human environments, we integrate a human personal space field based on human pose into a model predictive controller, enabling the system to navigate while avoiding discomfort zones. Extensive evaluations in both simulated and real-world settings demonstrate the effectiveness of our approach in generating socially aware trajectories for autonomous systems.

* This paper has been submitted to the IEEE Transactions on Industrial Electronics

Via

Access Paper or Ask Questions

Enhancing Indoor Mobility with Connected Sensor Nodes: A Real-Time, Delay-Aware Cooperative Perception Approach

Nov 04, 2024

Minghao Ning, Yaodong Cui, Yufeng Yang, Shucheng Huang, Zhenan Liu, Ahmad Reza Alghooneh, Ehsan Hashemi, Amir Khajepour

Abstract:This paper presents a novel real-time, delay-aware cooperative perception system designed for intelligent mobility platforms operating in dynamic indoor environments. The system contains a network of multi-modal sensor nodes and a central node that collectively provide perception services to mobility platforms. The proposed Hierarchical Clustering Considering the Scanning Pattern and Ground Contacting Feature based Lidar Camera Fusion improve intra-node perception for crowded environment. The system also features delay-aware global perception to synchronize and aggregate data across nodes. To validate our approach, we introduced the Indoor Pedestrian Tracking dataset, compiled from data captured by two indoor sensor nodes. Our experiments, compared to baselines, demonstrate significant improvements in detection accuracy and robustness against delays. The dataset is available in the repository: https://github.com/NingMingHao/MVSLab-IndoorCooperativePerception

Via

Access Paper or Ask Questions

Intelligent Mobility System with Integrated Motion Planning and Control Utilizing Infrastructure Sensor Nodes

Oct 29, 2024

Yufeng Yang, Minghao Ning, Shucheng Huang, Ehsan Hashemi, Amir Khajepour

Figure 1 for Intelligent Mobility System with Integrated Motion Planning and Control Utilizing Infrastructure Sensor Nodes

Figure 2 for Intelligent Mobility System with Integrated Motion Planning and Control Utilizing Infrastructure Sensor Nodes

Figure 3 for Intelligent Mobility System with Integrated Motion Planning and Control Utilizing Infrastructure Sensor Nodes

Figure 4 for Intelligent Mobility System with Integrated Motion Planning and Control Utilizing Infrastructure Sensor Nodes

Abstract:This paper introduces a framework for an indoor autonomous mobility system that can perform patient transfers and materials handling. Unlike traditional systems that rely on onboard perception sensors, the proposed approach leverages a global perception and localization (PL) through Infrastructure Sensor Nodes (ISNs) and cloud computing technology. Using the global PL, an integrated Model Predictive Control (MPC)-based local planning and tracking controller augmented with Artificial Potential Field (APF) is developed, enabling reliable and efficient motion planning and obstacle avoidance ability while tracking predefined reference motions. Simulation results demonstrate the effectiveness of the proposed MPC controller in smoothly navigating around both static and dynamic obstacles. The proposed system has the potential to extend to intelligent connected autonomous vehicles, such as electric or cargo transport vehicles with four-wheel independent drive/steering (4WID-4WIS) configurations.

Via

Access Paper or Ask Questions

Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

Apr 12, 2022

Ziyue Wu, Junyu Gao, Shucheng Huang, Changsheng Xu

Figure 1 for Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

Figure 2 for Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

Figure 3 for Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

Figure 4 for Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

Abstract:Grounding temporal video segments described in natural language queries effectively and efficiently is a crucial capability needed in vision-and-language fields. In this paper, we deal with the fast video temporal grounding (FVTG) task, aiming at localizing the target segment with high speed and favorable accuracy. Most existing approaches adopt elaborately designed cross-modal interaction modules to improve the grounding performance, which suffer from the test-time bottleneck. Although several common space-based methods enjoy the high-speed merit during inference, they can hardly capture the comprehensive and explicit relations between visual and textual modalities. In this paper, to tackle the dilemma of speed-accuracy tradeoff, we propose a commonsense-aware cross-modal alignment (CCA) framework, which incorporates commonsense-guided visual and text representations into a complementary common space for fast video temporal grounding. Specifically, the commonsense concepts are explored and exploited by extracting the structural semantic information from a language corpus. Then, a commonsense-aware interaction module is designed to obtain bridged visual and text features by utilizing the learned commonsense concepts. Finally, to maintain the original semantic information of textual queries, a cross-modal complementary common space is optimized to obtain matching scores for performing FVTG. Extensive results on two challenging benchmarks show that our CCA method performs favorably against state-of-the-arts while running at high speed. Our code is available at https://github.com/ZiyueWu59/CCA.

* Submitted to IEEE TMM; Code is available at https://github.com/ZiyueWu59/CCA

Via

Access Paper or Ask Questions