Chao Wei

Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning

Jun 06, 2025

Sheng Chen, Peiyu He, Jiaxin Hu, Ziyang Liu, Yansheng Wang, Tao Xu, Chi Zhang, Chongchong Zhang, Chao An, Shiyu Cai(+60 more)

Abstract: Modern robot navigation systems encounter difficulties in diverse and complex indoor environments. Traditional approaches rely on multiple modules with small models or rule-based systems and thus lack adaptability to new environments. To address this, we developed Astra, a comprehensive dual-model architecture, Astra-Global and Astra-Local, for mobile robot navigation. Astra-Global, a multimodal LLM, processes vision and language inputs to perform self and goal localization using a hybrid topological-semantic graph as the global map, and outperforms traditional visual place recognition methods. Astra-Local, a multitask network, handles local path planning and odometry estimation. Its 4D spatial-temporal encoder, trained through self-supervised learning, generates robust 4D features for downstream tasks. The planning head uses flow matching and a novel masked ESDF loss to minimize collision risks when generating local trajectories, and the odometry head integrates multi-sensor inputs via a transformer encoder to predict the relative pose of the robot. Deployed on real in-house mobile robots, Astra achieves a high end-to-end mission success rate across diverse indoor environments.

* Astra Technical Report

Low-Complex Waveform, Modulation and Coding Designs for 3GPP Ambient IoT

Jan 15, 2025

Mingxi Yin, Chao Wei, Kazuki Takeda, Yinhua Jia, Changlong Xu, Chengjin Zhang, Hao Xu

Abstract: This paper presents a comprehensive study on low-complexity waveform, modulation and coding (WMC) designs for the 3rd Generation Partnership Project (3GPP) Ambient Internet of Things (A-IoT). A-IoT is a low-cost, low-power IoT system inspired by Ultra High Frequency (UHF) Radio Frequency Identification (RFID) and aims to leverage existing cellular network infrastructure for efficient RF tag management. The paper compares the physical layer (PHY) design challenges and requirements of RFID and A-IoT, particularly focusing on backscatter communications. An overview of the standardization for PHY designs in Release 19 A-IoT is provided, along with detailed schemes of the proposed low-complexity WMC designs. The performance of device-to-reader link designs is validated through simulations, which demonstrate a 6 dB improvement of the proposed baseband waveform with coherent receivers over RFID line coding-based solutions with non-coherent receivers when channel coding is adopted.

* This work has been submitted to the IEEE (IEEE Communications Standards Magazine, Special Issue for Ambient IoT) for possible publication

Preference as Reward, Maximum Preference Optimization with Importance Sampling

Jan 08, 2024

Zaifan Jiang, Xing Huang, Chao Wei

Abstract: Preference learning is a key technology for aligning language models with human values. Reinforcement Learning from Human Feedback (RLHF) is a model-based algorithm for preference learning: it first fits a reward model to preference scores, then optimizes the generating policy with the on-policy PPO algorithm to maximize the reward. The RLHF pipeline is complex, time-consuming, and unstable. The Direct Preference Optimization (DPO) algorithm uses an off-policy algorithm to optimize the generating policy directly, eliminating the need for a reward model, which makes it data-efficient and stable. DPO uses the Bradley-Terry model and a log-loss, which leads to over-fitting to the preference data at the expense of ignoring the KL-regularization term when preferences are deterministic. IPO uses a root-finding MSE loss to solve the problem of the ignored KL-regularization term. In this paper, we show that although IPO fixes the problem when preferences are deterministic, both DPO and IPO fail to make the KL-regularization term effective because the support of the preference distribution is not equal to that of the reference distribution. We then design a simple and intuitive off-policy preference optimization algorithm from an importance sampling view, which we call Maximum Preference Optimization (MPO), and add off-policy KL-regularization terms that make KL-regularization truly effective. The objective of MPO resembles RLHF's objective, and, like IPO, MPO is off-policy. MPO thus attains the best of both worlds. To simplify the learning process and save memory, MPO eliminates the need for both a reward model and a reference policy.
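
The contrast between the DPO log-loss and the IPO squared loss mentioned in this abstract can be illustrated with a minimal sketch. It uses the commonly published per-pair forms of the two losses, not the MPO objective, which the abstract does not spell out. The function names and the parameters `beta` and `tau` are illustrative assumptions, and the inputs are assumed to be sequence log-probabilities under the policy and the reference model.

```python
import math

def log_ratio_margin(logp_w, logp_l, ref_logp_w, ref_logp_l):
    """Policy-vs-reference log-ratio of the preferred response (w)
    minus that of the rejected response (l)."""
    return (logp_w - ref_logp_w) - (logp_l - ref_logp_l)

def dpo_loss(margin, beta=0.1):
    """Bradley-Terry log-loss: -log(sigmoid(beta * margin)).
    Computed as a numerically stable softplus(-beta * margin)."""
    x = beta * margin
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))

def ipo_loss(margin, tau=0.1):
    """Squared loss that pulls the margin toward the finite target 1 / (2 * tau)."""
    return (margin - 1.0 / (2.0 * tau)) ** 2

if __name__ == "__main__":
    # Hypothetical log-probabilities, chosen only for illustration.
    m = log_ratio_margin(logp_w=-12.0, logp_l=-15.0, ref_logp_w=-13.0, ref_logp_l=-14.0)
    print("margin:", m)
    print("DPO loss:", dpo_loss(m))
    print("IPO loss:", ipo_loss(m))
```

In this sketch the DPO loss keeps shrinking as the margin grows, so nothing stops the policy from drifting arbitrarily far from the reference on deterministic preferences, whereas the IPO loss has its minimum at a finite margin.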

Framework for Quality Evaluation of Smart Roadside Infrastructure Sensors for Automated Driving Applications

Apr 16, 2023

Laurent Kloeker, Chenghua Liu, Chao Wei, Lutz Eckstein

Abstract: The use of smart roadside infrastructure sensors is highly relevant for future applications of connected and automated vehicles. External sensor technology in the form of intelligent transportation system stations (ITS-Ss) can provide safety-critical real-time information about road users in the form of a digital twin. The choice of sensor setups has a major influence on the downstream function as well as the data quality. To date, there is insufficient research on which sensor setups result in which levels of ITS-S data quality. We present a novel approach to perform detailed quality assessment for smart roadside infrastructure sensors. Our framework is multimodal across different sensor types and is evaluated on the DAIR-V2X dataset. We analyze the composition of different lidar and camera sensors and assess them in terms of accuracy, latency, and reliability. The evaluations show that the framework can be used reliably for several future ITS-S applications.

* Accepted to be published as part of the 34th IEEE Intelligent Vehicles Symposium (IV), Anchorage, Alaska, USA, June 4-7, 2023

ACE-BERT: Adversarial Cross-modal Enhanced BERT for E-commerce Retrieval

Dec 14, 2021

Boxuan Zhang, Chao Wei, Yan Jin, Weiru Zhang

Abstract: On today's E-commerce platforms, products are presented to customers in multiple modalities. These modalities matter to a retrieval system that aims to surface attractive products for customers, so taking them into account simultaneously to boost retrieval performance is crucial. This problem is challenging for two reasons: (1) extracting patch features with a pre-trained image model (e.g., a CNN-based model) carries a strong inductive bias, which makes it difficult to capture the effective information in E-commerce product images; (2) the heterogeneity of multimodal data makes it challenging to construct representations of the query text and of the product, including its title and image, in a common subspace. We propose a novel Adversarial Cross-modal Enhanced BERT (ACE-BERT) for efficient E-commerce retrieval. In detail, ACE-BERT leverages patch features and pixel features as the image representation, so the Transformer architecture can be applied directly to the raw image sequences. With the pre-trained enhanced BERT as the backbone network, ACE-BERT further adopts adversarial learning by adding a domain classifier to ensure the distribution consistency of the different modality representations, narrowing the representation gap between query and product. Experimental results demonstrate that ACE-BERT outperforms state-of-the-art approaches on the retrieval task. Notably, ACE-BERT has already been deployed in our E-commerce search engine, leading to a 1.46% increase in revenue.

Learning based Predictive Error Estimation and Compensator Design for Autonomous Vehicle Path Tracking

Jul 18, 2020

Chaoyang Jiang, Hanqing Tian, Jibin Hu, Jiankun Zhai, Chao Wei, Jun Ni

Abstract: Model predictive control (MPC) is widely used for path tracking of autonomous vehicles due to its ability to handle various types of constraints. However, a considerable predictive error exists because of errors in the mathematical model or from model linearization. In this paper, we propose a framework combining MPC with a learning-based error estimator and a feedforward compensator to improve path tracking accuracy. An extreme learning machine is implemented to estimate the model-based predictive error from vehicle state feedback information. Offline training data is collected from a vehicle controlled by a model-defective regular MPC for path tracking in several working conditions. The data include the vehicle state and the spatial error between the current actual position and the corresponding predictive position. Based on the estimated predictive error, we then design a PID-based feedforward compensator. Simulation results via CarSim show the estimation accuracy of the predictive error and the effectiveness of the proposed framework for path tracking of an autonomous vehicle.

* 5 pages, 8 figures, ICIEA 2020 paper

Optimal Delivery with Budget Constraint in E-Commerce Advertising

Oct 08, 2019

Chao Wei, Weiru Zhang, Shengjie Sun, Fei Li, Xiaonan Meng, Yi Hu, Hao Wang

Abstract: Online advertising on E-commerce platforms gives sellers an opportunity to reach potential audiences with different target goals. Ad serving systems (such as display and search advertising systems) that assign ads to pages should satisfy objectives such as a large audience for branding advertisers and clicks or conversions for performance-based advertisers, while also trying to maximize the overall revenue of the platform. In this paper, we propose an approach based on linear programming subject to constraints in order to optimize revenue and improve different performance goals simultaneously. We have validated our algorithm by implementing an offline simulation system on the Alibaba E-commerce platform that runs auctions from online requests and takes system performance, ranking, and pricing schemes into account. We have also compared our algorithm with related work, and the results show that our algorithm can effectively improve campaign performance and the revenue of the platform.

* 13 pages, 5 figures

Important Attribute Identification in Knowledge Graph

Oct 12, 2018

Shengjie Sun, Dong Yang, Hongchun Zhang, Yanxu Chen, Chao Wei, Xiaonan Meng, Yi Hu

Abstract: The knowledge graph (KG), composed of entities with their descriptions and attributes and the relationships between entities, is finding more and more application scenarios in various natural language processing tasks. In a typical knowledge graph like Wikidata, entities usually have a large number of attributes, but it is difficult to know which ones are important. The importance of attributes can be a valuable piece of information in applications ranging from information retrieval to natural language generation. In this paper, we propose a general method that uses external user-generated text data to evaluate the relative importance of an entity's attributes. More specifically, we use word/sub-word embedding techniques to match the external textual data back to entities' attribute names and values, and rank the attributes by their matching cohesiveness. To the best of our knowledge, this is the first work to apply vector-based semantic matching to important attribute identification, and our method outperforms previous traditional methods. We also apply the detected important attributes to a language generation task; compared with previously generated text, the new method generates much more customized and informative messages.
