Hai Lin

LexSemBridge: Fine-Grained Dense Representation Enhancement through Token-Aware Embedding Augmentation

Aug 25, 2025

Shaoxiong Zhan, Hai Lin, Hongming Tan, Xiaodong Cai, Hai-Tao Zheng, Xin Su, Zifei Shan, Ruitong Liu, Hong-Gee Kim

Abstract:As queries in retrieval-augmented generation (RAG) pipelines powered by large language models (LLMs) become increasingly complex and diverse, dense retrieval models have demonstrated strong performance in semantic matching. Nevertheless, they often struggle with fine-grained retrieval tasks, where precise keyword alignment and span-level localization are required, even in cases with high lexical overlap that would intuitively suggest easier retrieval. To systematically evaluate this limitation, we introduce two targeted tasks, keyword retrieval and part-of-passage retrieval, designed to simulate practical fine-grained scenarios. Motivated by these observations, we propose LexSemBridge, a unified framework that enhances dense query representations through fine-grained, input-aware vector modulation. LexSemBridge constructs latent enhancement vectors from input tokens using three paradigms: Statistical (SLR), Learned (LLR), and Contextual (CLR), and integrates them with dense embeddings via element-wise interaction. Theoretically, we show that this modulation preserves the semantic direction while selectively amplifying discriminative dimensions. LexSemBridge operates as a plug-in without modifying the backbone encoder and naturally extends to both text and vision modalities. Extensive experiments across semantic and fine-grained retrieval tasks validate the effectiveness and generality of our approach. All code and models are publicly available at https://github.com/Jasaxion/LexSemBridge/
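The element-wise interaction described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `alpha` strength parameter, the `1 + alpha * e` scaling form, and the toy vectors are all assumptions made for illustration. It only shows that scaling each dimension by a positive, token-derived factor keeps the vector pointing in roughly the same direction while enlarging the selected dimensions.

```python
import math


def l2_normalize(vec):
    """Return vec scaled to unit length (unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def modulate(dense, enhancement, alpha=0.5):
    """Hypothetical element-wise modulation: scale each dense dimension
    by (1 + alpha * enhancement_i), then re-normalize.

    With non-negative enhancement weights every factor is positive, so no
    component changes sign and the cosine with the input stays positive.
    """
    if len(dense) != len(enhancement):
        raise ValueError("dense and enhancement must have the same length")
    scaled = [d * (1.0 + alpha * e) for d, e in zip(dense, enhancement)]
    return l2_normalize(scaled)


# Toy dense query embedding and an assumed enhancement vector that
# marks dimension 1 as discriminative for the input tokens.
dense = l2_normalize([0.2, -0.5, 0.1, 0.7])
enhancement = [0.0, 1.0, 0.0, 0.0]
modulated = modulate(dense, enhancement)

print("cosine(dense, modulated) =", round(cosine(dense, modulated), 4))
print("relative gain on dim 1   =", round(modulated[1] / dense[1], 4))
```

In this toy setting the highlighted dimension gains weight relative to the others while the cosine similarity to the original embedding remains high. How the enhancement vector is actually built (SLR, LLR, or CLR) is specified in the paper and the linked repository, not here.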

Traffic-Aware Pedestrian Intention Prediction

Jul 16, 2025

Fahimeh Orvati Nia, Hai Lin

Abstract:Accurate pedestrian intention estimation is crucial for the safe navigation of autonomous vehicles (AVs) and hence attracts a lot of research attention. However, current models often fail to adequately consider dynamic traffic signals and contextual scene information, which are critical for real-world applications. This paper presents a Traffic-Aware Spatio-Temporal Graph Convolutional Network (TA-STGCN) that integrates traffic signs and their states (Red, Yellow, Green) into pedestrian intention prediction. Our approach introduces the integration of dynamic traffic signal states and bounding box size as key features, allowing the model to capture both spatial and temporal dependencies in complex urban environments. The model surpasses existing methods in accuracy. Specifically, TA-STGCN achieves a 4.75% higher accuracy compared to the baseline model on the PIE dataset, demonstrating its effectiveness in improving pedestrian intention prediction.

* 6 pages, 4 figures. Accepted to the American Control Conference (ACC) 2025

Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models

May 05, 2025

Yankai Jiang, Peng Zhang, Donglin Yang, Yuan Tian, Hai Lin, Xiaosong Wang

Abstract:We explore Generalizable Tumor Segmentation, aiming to train a single model for zero-shot tumor segmentation across diverse anatomical regions. Existing methods face limitations related to segmentation quality, scalability, and the range of applicable imaging modalities. In this paper, we uncover the potential of the internal representations within frozen medical foundation diffusion models as highly efficient zero-shot learners for tumor segmentation by introducing a novel framework named DiffuGTS. DiffuGTS creates anomaly-aware open-vocabulary attention maps based on text prompts to enable generalizable anomaly segmentation without being restricted by a predefined training category list. To further improve and refine anomaly segmentation masks, DiffuGTS leverages the diffusion model, transforming pathological regions into high-quality pseudo-healthy counterparts through latent space inpainting, and applies a novel pixel-level and feature-level residual learning approach, resulting in segmentation masks with significantly enhanced quality and generalization. Comprehensive experiments on four datasets and seven tumor categories demonstrate the superior performance of our method, surpassing current state-of-the-art models across multiple zero-shot settings. Codes are available at https://github.com/Yankai96/DiffuGTS.

* This paper is accepted to CVPR 2025

A Primer on Orthogonal Delay-Doppler Division Multiplexing (ODDM)

Apr 15, 2025

Hai Lin

Abstract:As a new type of multicarrier (MC) scheme built upon the recently discovered delay-Doppler domain orthogonal pulse (DDOP), orthogonal delay-Doppler division multiplexing (ODDM) aims to address the challenges of waveform design in linear time-varying channels. In this paper, we explore the design principles of ODDM and clarify the key ideas underlying the DDOP. We then derive an alternative representation of the DDOP and highlight the fundamental differences between ODDM and conventional MC schemes. Finally, we discuss and compare two implementation methods for ODDM.

* The supplementary materials for the ODDM waveform are available at: https://oddm.io

Graph Neural Network-Based Distributed Optimal Control for Linear Networked Systems: An Online Distributed Training Approach

Apr 08, 2025

Zihao Song, Panos J. Antsaklis, Hai Lin

Abstract:In this paper, we consider the distributed optimal control problem for linear networked systems. In particular, we are interested in learning distributed optimal controllers using graph recurrent neural networks (GRNNs). Most of the existing approaches result in centralized optimal controllers with offline training processes. However, with the increasing demand for network resilience, optimal controllers are further expected to be distributed and, ideally, trained in an online distributed fashion; these two properties are the main contributions of our work. To solve this problem, we first propose a GRNN-based distributed optimal control method, and we cast the problem as a self-supervised learning problem. Then, the distributed online training is achieved via distributed gradient computation, and, inspired by the (consensus-based) distributed optimization idea, a distributed online training optimizer is designed. Furthermore, the local closed-loop stability of the linear networked system under our proposed GRNN-based controller is established under the assumption that the nonlinear activation function of the GRNN-based controller is both locally sector-bounded and slope-restricted. The effectiveness of our proposed method is illustrated by numerical simulations using a specifically developed simulator.

* 8 pages, 3 figures

An Advantage-based Optimization Method for Reinforcement Learning in Large Action Space

Dec 17, 2024

Hai Lin, Cheng Huang, Zhihong Chen

Abstract:Reinforcement learning tasks in real-world scenarios often involve large, high-dimensional action spaces, leading to challenges such as convergence difficulties, instability, and high computational complexity. It is widely acknowledged that traditional value-based reinforcement learning algorithms struggle to address these issues effectively. A prevalent approach involves generating independent sub-actions within each dimension of the action space. However, this method introduces bias, hindering the learning of optimal policies. In this paper, we propose an advantage-based optimization method and an algorithm named Advantage Branching Dueling Q-network (ABQ). ABQ incorporates a baseline mechanism to tune the action value of each dimension, leveraging the advantage relationship across different sub-actions. With this approach, the learned policy can be optimized for each dimension. Empirical results demonstrate that ABQ outperforms BDQ, achieving 3%, 171%, and 84% more cumulative rewards in HalfCheetah, Ant, and Humanoid environments, respectively. Furthermore, ABQ exhibits competitive performance when compared against two continuous action benchmark algorithms, DDPG and TD3.
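For context on the branching setup the abstract refers to, the sketch below shows the per-dimension dueling aggregation commonly associated with the BDQ baseline: a shared state value plus each branch's mean-centered advantages. It does not implement ABQ's baseline mechanism, whose details are not given in the abstract; the function names and the toy numbers are assumptions for illustration only.

```python
def branch_q_values(state_value, branch_advantages):
    """Combine a shared state value with per-branch advantages.

    For each action dimension d:
        Q_d(s, a) = V(s) + A_d(s, a) - mean over a' of A_d(s, a')
    """
    q_per_branch = []
    for advantages in branch_advantages:
        mean_advantage = sum(advantages) / len(advantages)
        q_per_branch.append(
            [state_value + a - mean_advantage for a in advantages]
        )
    return q_per_branch


def greedy_action(q_per_branch):
    """Pick the highest-valued sub-action independently in each branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in q_per_branch]


# Toy example: two action dimensions with 3 and 2 discrete sub-actions.
state_value = 1.0
branch_advantages = [[0.5, -0.5, 0.0], [2.0, 1.0]]

q_values = branch_q_values(state_value, branch_advantages)
action = greedy_action(q_values)

print("Q per branch:", q_values)
print("greedy joint action:", action)
```

Because each branch picks its sub-action independently, the joint action is assembled without enumerating the full combinatorial action space, which is the property that makes this family of methods attractive for large action spaces.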

On the Time-Frequency Localization Characteristics of the Delay-Doppler Plane Orthogonal Pulse

Dec 14, 2024

Akram Shafie, Jinhong Yuan, Nan Yang, Hai Lin

Abstract:In this work, we study the time-frequency (TF) localization characteristics of the prototype pulse of orthogonal delay-Doppler (DD) division multiplexing modulation, namely, the DD plane orthogonal pulse (DDOP). The TF localization characteristics examine how concentrated or spread out the energy of a pulse is in the joint TF domain, the time domain (TD), and the frequency domain (FD). We first derive the TF localization metrics of the DDOP, including its TF area, its time and frequency dispersions, and its direction parameter. Based on these results, we demonstrate that the DDOP exhibits a high energy spread in the TD, FD, and the joint TF domain, while adhering to the Heisenberg uncertainty principle. Thereafter, we discuss the potential advantages brought by the energy spread of the DDOP, especially with regard to harnessing both time and frequency diversities and enabling fine-resolution sensing. Subsequently, we examine the relationships between the time and frequency dispersions of the DDOP and those of the envelope functions of DDOP's TD and FD representations, paving the way for simplified determination of the TF localization metrics for more generalized variants of the DDOP and the pulses used in other DD domain modulation schemes. Finally, using numerical results, we validate our analysis and find further insights.

* This paper has been accepted for publication in an IEEE Journal

Assessing data-driven predictions of band gap and electrical conductivity for transparent conducting materials

Nov 21, 2024

Federico Ottomano, John Y. Goulermas, Vladimir Gusev, Rahul Savani, Michael W. Gaultois, Troy D. Manning, Hai Lin, Teresa P. Manzanera, Emmeline G. Poole, Matthew S. Dyer (+7 more)

Abstract:Machine Learning (ML) has offered innovative perspectives for accelerating the discovery of new functional materials, leveraging the increasing availability of material databases. Despite the promising advances, data-driven methods face constraints imposed by the quantity and quality of available data. Moreover, ML is often employed in tandem with simulated datasets originating from density functional theory (DFT), and assessed through in-sample evaluation schemes. This scenario raises questions about the practical utility of ML in uncovering new and significant material classes for industrial applications. Here, we propose a data-driven framework aimed at accelerating the discovery of new transparent conducting materials (TCMs), an important category of semiconductors with a wide range of applications. To mitigate the shortage of available data, we create and validate unique experimental databases, comprising several examples of existing TCMs. We assess state-of-the-art (SOTA) ML models for property prediction from the stoichiometry alone. We propose a bespoke evaluation scheme to provide empirical evidence on the ability of ML to uncover new, previously unseen materials of interest. We test our approach on a list of 55 compositions containing typical elements of known TCMs. Although our study indicates that ML tends to identify new TCMs compositionally similar to those in the training data, we empirically demonstrate that it can highlight material candidates that may have been previously overlooked, offering a systematic approach to identifying materials that are likely to display TCM characteristics.

Performance of orthogonal delay-Doppler division multiplexing modulation with imperfect channel estimation

Oct 23, 2024

Kehan Huang, Min Qiu, Jun Tong, Jinhong Yuan, Hai Lin

Abstract:The orthogonal delay-Doppler division multiplexing (ODDM) modulation is a recently proposed multi-carrier modulation that features a realizable pulse orthogonal with respect to the delay-Doppler (DD) plane's fine resolutions. In this paper, we investigate the performance of ODDM systems with imperfect channel estimation considering three detectors, namely the message passing algorithm (MPA) detector, iterative maximum-ratio combining (MRC) detector, and successive interference cancellation with minimum mean square error (SIC-MMSE) detector. We derive the post-equalization signal-to-interference-plus-noise ratio (SINR) for MRC and SIC-MMSE and analyze their bit error rate (BER) performance. Based on this analysis, we propose the MRC with subtractive dither (MRC-SD) and soft SIC-MMSE initialized MRC (SSMI-MRC) detector to improve the BER of iterative MRC. Our results demonstrate that soft SIC-MMSE consistently outperforms the other detectors in BER performance under perfect and imperfect CSI. While MRC exhibits a BER floor above $10^{-5}$, MRC-SD effectively lowers the BER with a negligible increase in detection complexity. SSMI-MRC achieves better BER than hard SIC-MMSE with the same detection complexity order. Additionally, we show that MPA has an error floor and is sensitive to imperfect CSI.

Perception Compressor: A training-free prompt compression method in long context scenarios

Sep 28, 2024

Jiwei Tang, Jin Xu, Tingwei Lu, Hai Lin, Yiming Zhao, Hai-Tao Zheng

Abstract:Large Language Models (LLMs) demonstrate exceptional capabilities in various scenarios. However, in long-context scenarios they are burdened by redundant information and tend to get lost in the middle, leading to inferior performance. To address these challenges, we present Perception Compressor, a training-free prompt compression method. It includes a dual-slope ratio allocator to dynamically assign compression ratios and open-book ratios, a perception retriever that leverages guiding questions and instruction to retrieve the most relevant demonstrations, and a semi-guided iterative compression that retains key information at the token level while removing tokens that distract the LLM. We conduct extensive experiments on long-context benchmarks, i.e., NaturalQuestions, LongBench, and MuSiQue. Experimental results show that Perception Compressor outperforms existing methods by a large margin, achieving state-of-the-art performance.

* 9 pages, 2 figures
