Lei Zheng

M-MRE: Extending the Mutual Reinforcement Effect to Multimodal Information Extraction

Apr 24, 2025

Chengguang Gan, Sunbowen Lee, Zhixi Cai, Yanbin Wei, Lei Zheng, Yunhao Liang, Shiwen Ni, Tatsunori Mori

Abstract: Mutual Reinforcement Effect (MRE) is an emerging subfield at the intersection of information extraction and model interpretability. MRE aims to leverage the mutual understanding between tasks of different granularities, enhancing the performance of both coarse-grained and fine-grained tasks through joint modeling. While MRE has been explored and validated in the textual domain, its applicability to visual and multimodal domains remains unexplored. In this work, we extend MRE to the multimodal information extraction domain for the first time. Specifically, we introduce a new task: Multimodal Mutual Reinforcement Effect (M-MRE), and construct a corresponding dataset to support this task. To address the challenges posed by M-MRE, we further propose a Prompt Format Adapter (PFA) that is fully compatible with various Large Vision-Language Models (LVLMs). Experimental results demonstrate that MRE can also be observed in the M-MRE task, a multimodal text-image understanding scenario. This provides strong evidence that MRE facilitates mutual gains across three interrelated tasks, confirming its generalizability beyond the textual domain.

Occlusion-Aware Consistent Model Predictive Control for Robot Navigation in Occluded Obstacle-Dense Environments

Mar 06, 2025

Minzhe Zheng, Lei Zheng, Lei Zhu, Jun Ma

Abstract: Ensuring safety and motion consistency for robot navigation in occluded, obstacle-dense environments is a critical challenge. In this context, this study presents an occlusion-aware Consistent Model Predictive Control (CMPC) strategy. To account for occluded obstacles, it incorporates adjustable risk regions that represent their potential future locations. Dynamic risk boundary constraints are then developed online to ensure safety. The CMPC constructs multiple locally optimal trajectory branches (each tailored to a different risk region) to balance exploitation and exploration. A shared consensus trunk is generated to ensure smooth transitions between branches without significant velocity fluctuations, further preserving motion consistency. To achieve high computational efficiency and ensure coordination across local trajectories, we use the alternating direction method of multipliers (ADMM) to decompose the CMPC into manageable sub-problems for parallel solving. The proposed strategy is validated through simulation and real-world experiments on an Ackermann-steering robot platform. The results demonstrate the effectiveness of the proposed CMPC strategy through comparisons with baseline approaches in occluded, obstacle-dense environments.
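
The idea of several trajectory branches that share a consensus trunk and are coordinated by ADMM can be illustrated with a toy problem. The sketch below is not the paper's CMPC formulation: the quadratic tracking cost, the function name `consensus_admm`, and the parameters `trunk_len`, `rho`, and `iters` are all assumptions chosen for illustration. Each branch tracks its own target trajectory, while the first `trunk_len` samples of every branch are forced to agree.

```python
def consensus_admm(targets, trunk_len, rho=1.0, iters=200):
    """Toy consensus ADMM (illustrative assumption, not the paper's solver).

    Branch i minimises 0.5 * ||x_i - targets[i]||^2 subject to
    x_i[:trunk_len] == z, where z is the shared trunk.
    """
    n = len(targets)
    x = [list(t) for t in targets]              # branch trajectories
    z = [0.0] * trunk_len                       # shared consensus trunk
    u = [[0.0] * trunk_len for _ in range(n)]   # scaled dual variables

    for _ in range(iters):
        # x-update: closed form for the quadratic cost, independent per
        # branch, so these updates could run in parallel.
        for i in range(n):
            for k in range(trunk_len):
                x[i][k] = (targets[i][k] + rho * (z[k] - u[i][k])) / (1.0 + rho)
            # Samples beyond the trunk are unconstrained and stay at the target.

        # z-update: average of the branch trunks plus their duals.
        for k in range(trunk_len):
            z[k] = sum(x[i][k] + u[i][k] for i in range(n)) / n

        # Dual update: accumulate the remaining disagreement with the trunk.
        for i in range(n):
            for k in range(trunk_len):
                u[i][k] += x[i][k] - z[k]
    return x, z


# Two branches that disagree on the first two samples.
branches, trunk = consensus_admm(
    [[0.0, 0.0, 1.0, 2.0], [2.0, 4.0, 5.0, 6.0]], trunk_len=2
)
print(trunk)
```

For this quadratic toy cost the trunk approaches the average of the branch targets over the shared samples, while the remaining samples of each branch stay at their own targets.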

Occlusion-Aware Contingency Safety-Critical Planning for Autonomous Vehicles

Feb 10, 2025

Lei Zheng, Rui Yang, Minzhe Zheng, Zengqi Peng, Michael Yu Wang, Jun Ma

Abstract: Ensuring safe driving while maintaining travel efficiency for autonomous vehicles in dynamic and occluded environments is a critical challenge. This paper proposes an occlusion-aware contingency safety-critical planning approach for real-time autonomous driving in such environments. Leveraging reachability analysis for risk assessment, forward reachable sets of occluded phantom vehicles are computed to quantify dynamic velocity boundaries. These velocity boundaries are incorporated into a biconvex nonlinear programming (NLP) formulation, enabling simultaneous optimization of exploration and fallback trajectories within a receding horizon planning framework. To facilitate real-time optimization and ensure coordination between trajectories, we employ the consensus alternating direction method of multipliers (ADMM) to decompose the biconvex NLP problem into low-dimensional convex subproblems. The effectiveness of the proposed approach is validated through simulation studies and real-world experiments in occluded intersections. Experimental results demonstrate enhanced safety and improved travel efficiency, enabling real-time safe trajectory generation in dynamic occluded intersections under varying obstacle conditions. A video showcasing the experimental results is available at https://youtu.be/CHayG7NChqM.

Bilevel Multi-Armed Bandit-Based Hierarchical Reinforcement Learning for Interaction-Aware Self-Driving at Unsignalized Intersections

Feb 06, 2025

Zengqi Peng, Yubin Wang, Lei Zheng, Jun Ma

Abstract: In this work, we present BiM-ACPPO, a bilevel multi-armed bandit-based hierarchical reinforcement learning framework for interaction-aware decision-making and planning at unsignalized intersections. It proactively accounts for the uncertainties associated with surrounding vehicles (SVs), including those stemming from drivers' intentions, interactive behaviors, and the varying number of SVs. Intermediate decision variables are introduced to enable the high-level RL policy to provide an interaction-aware reference for guiding low-level model predictive control (MPC) and further enhancing the generalization ability of the proposed framework. By leveraging the structured nature of self-driving at unsignalized intersections, the training problem of the RL policy is modeled as a bilevel curriculum learning task, which is addressed by the proposed Exp3.S-based BiMAB algorithm. Notably, the training curricula are dynamically adjusted, thereby improving the sample efficiency of the RL training process. Comparative experiments are conducted in the high-fidelity CARLA simulator, and the results indicate that our approach achieves superior performance compared to all baseline methods. Furthermore, experimental results in two new urban driving scenarios clearly demonstrate the commendable generalization performance of the proposed method.
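
To make the bandit-based curriculum selection more concrete, the following is a minimal sketch of the standard single-level Exp3.S update, in which each arm stands for one training curriculum. It is not the paper's bilevel BiMAB algorithm. The helper names (`exp3s_probs`, `exp3s_update`, `run_curriculum_bandit`), the reward function, and the values of `gamma` and `alpha` are assumptions for illustration, and weights are renormalised after every step purely for numerical stability.

```python
import math
import random


def exp3s_probs(weights, gamma):
    """Mix the normalised weights with uniform exploration."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3s_update(weights, arm, reward, gamma, alpha):
    """One Exp3.S step for a reward in [0, 1] observed on the pulled arm."""
    k = len(weights)
    probs = exp3s_probs(weights, gamma)
    share = math.e * alpha / k * sum(weights)   # weight-sharing term
    updated = []
    for j, w in enumerate(weights):
        # Importance-weighted reward estimate: non-zero only for the pulled arm.
        estimate = reward / probs[j] if j == arm else 0.0
        updated.append(w * math.exp(gamma * estimate / k) + share)
    norm = sum(updated)
    return [w / norm for w in updated]          # rescaling keeps the ratios


def run_curriculum_bandit(reward_fn, k, rounds, gamma=0.1, alpha=0.01, seed=0):
    """Repeatedly sample a curriculum index and update its weight."""
    rng = random.Random(seed)
    weights = [1.0 / k] * k
    for _ in range(rounds):
        probs = exp3s_probs(weights, gamma)
        arm = rng.choices(range(k), weights=probs)[0]
        weights = exp3s_update(weights, arm, reward_fn(arm), gamma, alpha)
    return exp3s_probs(weights, gamma)


# Hypothetical reward: only curriculum 2 currently yields learning progress.
probs = run_curriculum_bandit(lambda arm: 1.0 if arm == 2 else 0.0, k=3, rounds=2000)
print(probs)
```

The sharing term keeps every curriculum's weight bounded away from zero, which lets the sampler react if a different curriculum becomes the most useful one later in training.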

* This paper has been accepted by IEEE Transactions on Vehicular Technology

LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models

Jan 09, 2025

Zengqi Peng, Yubin Wang, Xu Han, Lei Zheng, Jun Ma

Abstract: Recent advancements in reinforcement learning (RL) demonstrate significant potential for autonomous driving. Despite this promise, challenges such as the manual design of reward functions and low sample efficiency in complex environments continue to impede the development of safe and effective driving policies. To tackle these issues, we introduce LearningFlow, an innovative automated policy learning workflow tailored to urban driving. This framework leverages the collaboration of multiple large language model (LLM) agents throughout the RL training process. LearningFlow includes a curriculum sequence generation process and a reward generation process, which work in tandem to guide the RL policy by generating tailored training curricula and reward functions. In particular, each process is supported by an analysis agent that evaluates training progress and provides critical insights to the generation agent. Through the collaborative efforts of these LLM agents, LearningFlow automates policy learning across a series of complex driving tasks, and it significantly reduces the reliance on manual reward function design while enhancing sample efficiency. Comprehensive experiments are conducted in the high-fidelity CARLA simulator, along with comparisons with other existing methods, to demonstrate the efficacy of our proposed approach. The results demonstrate that LearningFlow excels in generating rewards and curricula. It also achieves superior performance and robust generalization across various driving tasks, as well as commendable adaptation to different RL algorithms.

Synergizing Decision Making and Trajectory Planning Using Two-Stage Optimization for Autonomous Vehicles

Nov 28, 2024

Wenru Liu, Haichao Liu, Lei Zheng, Zhenmin Huang, Jun Ma

Abstract: This paper introduces a local planner that synergizes the decision making and trajectory planning modules for autonomous driving. The decision making and trajectory planning tasks are jointly formulated as a nonlinear programming problem with an integrated objective function. However, integrating the discrete decision variables into the continuous trajectory optimization leads to a mixed-integer programming (MIP) problem with inherent nonlinearity and nonconvexity. To address the challenge of solving this problem, it is decomposed into two sub-stages, and a two-stage optimization (TSO) based approach is presented to ensure coherent outcomes across the two stages. The optimization problem in the first stage determines the optimal decision sequence that acts as an informed initialization. With the outputs from the first stage, the second stage necessitates the use of a high-fidelity vehicle model and strict enforcement of the collision avoidance constraints as part of the trajectory planning problem. We evaluate the effectiveness of our proposed planner across diverse multi-lane scenarios. The results demonstrate that the proposed planner simultaneously generates a sequence of optimal decisions and the corresponding trajectory, significantly improving driving safety and traveling efficiency compared to alternative methods. Additionally, we implement a closed-loop simulation in CARLA, and the results showcase the ability of the proposed planner to adapt to changing driving situations with high computational efficiency.

Safe and Real-Time Consistent Planning for Autonomous Vehicles in Partially Observed Environments via Parallel Consensus Optimization

Sep 16, 2024

Lei Zheng, Rui Yang, Minzhe Zheng, Michael Yu Wang, Jun Ma

Abstract: Ensuring safety and driving consistency is a significant challenge for autonomous vehicles operating in partially observed environments. This work introduces a consistent parallel trajectory optimization (CPTO) approach to enable safe and consistent driving in dense obstacle environments with perception uncertainties. Utilizing discrete-time barrier function theory, we develop a consensus safety barrier module that ensures reliable safety coverage within the spatiotemporal trajectory space across potential obstacle configurations. Following this, a bi-convex parallel trajectory optimization problem is derived that facilitates decomposition into a series of low-dimensional quadratic programming problems to accelerate computation. By leveraging the consensus alternating direction method of multipliers (ADMM) for parallel optimization, each generated candidate trajectory corresponds to a possible environment configuration while sharing a common consensus trajectory segment. This ensures driving safety and consistency when executing the consensus trajectory segment for the ego vehicle in real time. We validate our CPTO framework through extensive comparisons with state-of-the-art baselines across multiple driving tasks in partially observable environments. Our results demonstrate improved safety and consistency using both synthetic and real-world traffic datasets.
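
The discrete-time barrier function idea can be illustrated with a one-dimensional example. The sketch below assumes a commonly used form of the condition, h(x_{k+1}) >= (1 - gamma) * h(x_k) with 0 < gamma <= 1, which may differ from the exact formulation in the paper. The point-mass model, the function names, and all numeric values are hypothetical. A vehicle moving toward a static obstacle has its desired velocity clipped so that the barrier value never becomes negative.

```python
def barrier(x, x_obs, d_safe):
    """h(x) >= 0 means the vehicle is at least d_safe short of the obstacle."""
    return x_obs - x - d_safe


def safe_velocity(x, v_des, x_obs, d_safe, dt, gamma):
    """Largest velocity not exceeding v_des that satisfies the barrier condition.

    With x_next = x + v * dt, h(x_next) >= (1 - gamma) * h(x) reduces to
    v <= gamma * h(x) / dt.
    """
    v_max = gamma * barrier(x, x_obs, d_safe) / dt
    return min(v_des, v_max)


def simulate(x0=0.0, v_des=2.0, x_obs=10.0, d_safe=1.0, dt=0.1, gamma=0.2, steps=100):
    """Roll out the filtered motion and record the barrier value at each step."""
    x = x0
    history = [barrier(x, x_obs, d_safe)]
    for _ in range(steps):
        v = safe_velocity(x, v_des, x_obs, d_safe, dt, gamma)
        x += v * dt
        history.append(barrier(x, x_obs, d_safe))
    return history


h_values = simulate()
print(min(h_values) >= 0.0)
```

In this toy setting the vehicle drives at the desired speed while far from the obstacle, then slows geometrically as the barrier value shrinks, so the safety margin is approached but never crossed.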

Look into the Future: Deep Contextualized Sequential Recommendation

May 23, 2024

Lei Zheng, Ning Li, Yanhuan Huang, Ruiwen Xu, Weinan Zhang, Yong Yu

Abstract: Sequential recommendation focuses on mining useful patterns from a user's behavior history to better estimate their preference for candidate items. Previous solutions adopt recurrent networks or retrieval methods to obtain the user's profile representation so as to perform the preference estimation. In this paper, we propose a novel framework for sequential recommendation called Look into the Future (LIFT), which builds and leverages the contexts of sequential recommendation. The context in LIFT refers to a user's current profile, which can be represented based on both past and future behaviors. As such, the learned context will be more effective in predicting the user's behaviors in sequential recommendation. Since it is impossible to use real future information to predict the current behavior, we propose a novel retrieval-based framework that uses the future information of the most similar interaction as the future context of the target interaction, without data leakage. Furthermore, in order to exploit the intrinsic information embedded within the context itself, we introduce an innovative pretraining methodology incorporating behavior masking. This approach is designed to facilitate the efficient acquisition of context representations. We demonstrate that finding relevant contexts from the global user pool via retrieval methods greatly improves preference estimation performance. In our extensive experiments over real-world datasets, LIFT demonstrates significant performance improvement on click-through rate prediction tasks in sequential recommendation over strong baselines.

* arXiv admin note: text overlap with arXiv:2404.18304 by other authors

Retrieval and Distill: A Temporal Data Shift-Free Paradigm for Online Recommendation System

Apr 26, 2024

Lei Zheng, Ning Li, Weinan Zhang, Yong Yu

Abstract: Current recommendation systems are significantly affected by a serious issue of temporal data shift, which is the inconsistency between the distribution of historical data and that of online data. Most existing models focus on utilizing updated data, overlooking the transferable, temporal data shift-free information that can be learned from shifting data. We propose the Temporal Invariance of Association theorem, which suggests that, given a fixed search space, the relationship between the data and the data in the search space remains invariant over time. Leveraging this principle, we designed a retrieval-based recommendation system framework that can train a data shift-free relevance network using shifting data, significantly enhancing the predictive performance of the original model in the recommendation system. However, retrieval-based recommendation models face substantial inference time costs when deployed online. To address this, we further designed a distillation framework that can distill information from the relevance network into a parameterized module using shifting data. The distilled model can be deployed online alongside the original model, with only a minimal increase in inference time. Extensive experiments on multiple real datasets demonstrate that our framework significantly improves the performance of the original model by utilizing shifting data.

Reward-Driven Automated Curriculum Learning for Interaction-Aware Self-Driving at Unsignalized Intersections

Mar 20, 2024

Zengqi Peng, Xiao Zhou, Lei Zheng, Yubin Wang, Jun Ma

Abstract: In this work, we present a reward-driven automated curriculum reinforcement learning approach for interaction-aware self-driving at unsignalized intersections, taking into account the uncertainties associated with surrounding vehicles (SVs). These uncertainties encompass both the driving intentions of SVs and the number of SVs. To deal with this problem, the curriculum set is specifically designed to accommodate a progressively increasing number of SVs. By implementing an automated curriculum selection mechanism, the importance weights are rationally allocated across various curricula, thereby facilitating improved sample efficiency and training outcomes. Furthermore, the reward function is meticulously designed to guide the agent towards effective policy exploration. Thus, the proposed framework can proactively address the above uncertainties at unsignalized intersections by employing the automated curriculum learning technique that progressively increases task difficulty, and this ensures safe self-driving through effective interaction with SVs. Comparative experiments are conducted in Highway_Env, and the results indicate that our approach achieves the highest task success rate, attains strong robustness to initialization parameters of the curriculum selection module, and exhibits superior adaptability to diverse situational configurations at unsignalized intersections. Furthermore, the effectiveness of the proposed method is validated using the high-fidelity CARLA simulator.

* 8 pages, 6 figures
