Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yu Han

Keypoint Detection Empowered Near-Field User Localization and Channel Reconstruction

Jan 21, 2025

Mengyuan Li, Yu Han, Zhizheng Lu, Shi Jin, Yongxu Zhu, Chao-Kai Wen

Abstract:In the near-field region of an extremely large-scale multiple-input multiple-output (XL MIMO) system, channel reconstruction is typically addressed through sparse parameter estimation based on compressed sensing (CS) algorithms after converting the received pilot signals into the transformed domain. However, the exhaustive search on the codebook in CS algorithms consumes significant computational resources and running time, particularly when a large number of antennas are equipped at the base station (BS). To overcome this challenge, we propose a novel scheme to replace the high-cost exhaustive search procedure. We visualize the sparse channel matrix in the transformed domain as a channel image and design the channel keypoint detection network (CKNet) to locate the user and scatterers in high speed. Subsequently, we use a small-scale newtonized orthogonal matching pursuit (NOMP) based refiner to further enhance the precision. Our method is applicable to both the Cartesian domain and the Polar domain. Additionally, to deal with scenarios with a flexible number of propagation paths, we further design FlexibleCKNet to predict both locations and confidence scores. Our experimental results validate that the CKNet and FlexibleCKNet-empowered channel reconstruction scheme can significantly reduce the computational complexity while maintaining high accuracy in both user and scatterer localization and channel reconstruction tasks.

Via

Access Paper or Ask Questions

ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction

Dec 15, 2024

Yi Feng, Yu Han, Xijing Zhang, Tanghui Li, Yanting Zhang, Rui Fan

Abstract:Inferring the 3D structure of a scene from a single image is an ill-posed and challenging problem in the field of vision-centric autonomous driving. Existing methods usually employ neural radiance fields to produce voxelized 3D occupancy, lacking instance-level semantic reasoning and temporal photometric consistency. In this paper, we propose ViPOcc, which leverages the visual priors from vision foundation models (VFMs) for fine-grained 3D occupancy prediction. Unlike previous works that solely employ volume rendering for RGB and depth image reconstruction, we introduce a metric depth estimation branch, in which an inverse depth alignment module is proposed to bridge the domain gap in depth distribution between VFM predictions and the ground truth. The recovered metric depth is then utilized in temporal photometric alignment and spatial geometric alignment to ensure accurate and consistent 3D occupancy prediction. Additionally, we also propose a semantic-guided non-overlapping Gaussian mixture sampler for efficient, instance-aware ray sampling, which addresses the redundant and imbalanced sampling issue that still exists in previous state-of-the-art methods. Extensive experiments demonstrate the superior performance of ViPOcc in both 3D occupancy prediction and depth estimation tasks on the KITTI-360 and KITTI Raw datasets. Our code is available at: \url{https://mias.group/ViPOcc}.

* accepted to AAAI25

Via

Access Paper or Ask Questions

Regulator-Manufacturer AI Agents Modeling: Mathematical Feedback-Driven Multi-Agent LLM Framework

Nov 22, 2024

Yu Han, Zekun Guo

Figure 1 for Regulator-Manufacturer AI Agents Modeling: Mathematical Feedback-Driven Multi-Agent LLM Framework

Figure 2 for Regulator-Manufacturer AI Agents Modeling: Mathematical Feedback-Driven Multi-Agent LLM Framework

Figure 3 for Regulator-Manufacturer AI Agents Modeling: Mathematical Feedback-Driven Multi-Agent LLM Framework

Figure 4 for Regulator-Manufacturer AI Agents Modeling: Mathematical Feedback-Driven Multi-Agent LLM Framework

Abstract:The increasing complexity of regulatory updates from global authorities presents significant challenges for medical device manufacturers, necessitating agile strategies to sustain compliance and maintain market access. Concurrently, regulatory bodies must effectively monitor manufacturers' responses and develop strategic surveillance plans. This study employs a multi-agent modeling approach, enhanced with Large Language Models (LLMs), to simulate regulatory dynamics and examine the adaptive behaviors of key actors, including regulatory bodies, manufacturers, and competitors. These agents operate within a simulated environment governed by regulatory flow theory, capturing the impacts of regulatory changes on compliance decisions, market adaptation, and innovation strategies. Our findings illuminate the influence of regulatory shifts on industry behaviour and identify strategic opportunities for improving regulatory practices, optimizing compliance, and fostering innovation. By leveraging the integration of multi-agent systems and LLMs, this research provides a novel perspective and offers actionable insights for stakeholders navigating the evolving regulatory landscape of the medical device industry.

Via

Access Paper or Ask Questions

Channel Customization for Low-Complexity CSI Acquisition in Multi-RIS-Assisted MIMO Systems

Nov 21, 2024

Weicong Chen, Yu Han, Chao-Kai Wen, Xiao Li, Shi Jin

Figure 1 for Channel Customization for Low-Complexity CSI Acquisition in Multi-RIS-Assisted MIMO Systems

Figure 2 for Channel Customization for Low-Complexity CSI Acquisition in Multi-RIS-Assisted MIMO Systems

Figure 3 for Channel Customization for Low-Complexity CSI Acquisition in Multi-RIS-Assisted MIMO Systems

Figure 4 for Channel Customization for Low-Complexity CSI Acquisition in Multi-RIS-Assisted MIMO Systems

Abstract:The deployment of multiple reconfigurable intelligent surfaces (RISs) enhances the propagation environment by improving channel quality, but it also complicates channel estimation. Following the conventional wireless communication system design, which involves full channel state information (CSI) acquisition followed by RIS configuration, can reduce transmission efficiency due to substantial pilot overhead and computational complexity. This study introduces an innovative approach that integrates CSI acquisition and RIS configuration, leveraging the channel-altering capabilities of the RIS to reduce both the overhead and complexity of CSI acquisition. The focus is on multi-RIS-assisted systems, featuring both direct and reflected propagation paths. By applying a fast-varying reflection sequence during RIS configuration for channel training, the complex problem of channel estimation is decomposed into simpler, independent tasks. These fast-varying reflections effectively isolate transmit signals from different paths, streamlining the CSI acquisition process for both uplink and downlink communications with reduced complexity. In uplink scenarios, a positioning-based algorithm derives partial CSI, informing the adjustment of RIS parameters to create a sparse reflection channel, enabling precise reconstruction of the uplink channel. Downlink communication benefits from this strategically tailored reflection channel, allowing effective CSI acquisition with fewer pilot signals. Simulation results highlight the proposed methodology's ability to accurately reconstruct the reflection channel with minimal impact on the normalized mean square error while simultaneously enhancing spectral efficiency.

* Accepted by IEEE JSAC special issue on Next Generation Advanced Transceiver Technologies

Via

Access Paper or Ask Questions

The Use of Readability Metrics in Legal Text: A Systematic Literature Review

Nov 14, 2024

Yu Han, Aaron Ceross, Jeroen H. M. Bergmann

Figure 1 for The Use of Readability Metrics in Legal Text: A Systematic Literature Review

Figure 2 for The Use of Readability Metrics in Legal Text: A Systematic Literature Review

Figure 3 for The Use of Readability Metrics in Legal Text: A Systematic Literature Review

Figure 4 for The Use of Readability Metrics in Legal Text: A Systematic Literature Review

Abstract:Understanding the text in legal documents can be challenging due to their complex structure and the inclusion of domain-specific jargon. Laws and regulations are often crafted in such a manner that engagement with them requires formal training, potentially leading to vastly different interpretations of the same texts. Linguistic complexity is an important contributor to the difficulties experienced by readers. Simplifying texts could enhance comprehension across a broader audience, not just among trained professionals. Various metrics have been developed to measure document readability. Therefore, we adopted a systematic review approach to examine the linguistic and readability metrics currently employed for legal and regulatory texts. A total of 3566 initial papers were screened, with 34 relevant studies found and further assessed. Our primary objective was to identify which current metrics were applied for evaluating readability within the legal field. Sixteen different metrics were identified, with the Flesch-Kincaid Grade Level being the most frequently used method. The majority of studies (73.5%) were found in the domain of "informed consent forms". From the analysis, it is clear that not all legal domains are well represented in terms of readability metrics and that there is a further need to develop more consensus on which metrics should be applied for legal documents.

Via

Access Paper or Ask Questions

Data-Driven Analysis of AI in Medical Device Software in China: Deep Learning and General AI Trends Based on Regulatory Data

Nov 11, 2024

Yu Han, Aaron Ceross, Sarim Ather, Jeroen H. M. Bergmann

Figure 1 for Data-Driven Analysis of AI in Medical Device Software in China: Deep Learning and General AI Trends Based on Regulatory Data

Figure 2 for Data-Driven Analysis of AI in Medical Device Software in China: Deep Learning and General AI Trends Based on Regulatory Data

Figure 3 for Data-Driven Analysis of AI in Medical Device Software in China: Deep Learning and General AI Trends Based on Regulatory Data

Figure 4 for Data-Driven Analysis of AI in Medical Device Software in China: Deep Learning and General AI Trends Based on Regulatory Data

Abstract:Artificial intelligence (AI) in medical device software (MDSW) represents a transformative clinical technology, attracting increasing attention within both the medical community and the regulators. In this study, we leverage a data-driven approach to automatically extract and analyze AI-enabled medical devices (AIMD) from the National Medical Products Administration (NMPA) regulatory database. The continued increase in publicly available regulatory data requires scalable methods for analysis. Automation of regulatory information screening is essential to create reproducible insights that can be quickly updated in an ever changing medical device landscape. More than 4 million entries were assessed, identifying 2,174 MDSW registrations, including 531 standalone applications and 1,643 integrated within medical devices, of which 43 were AI-enabled. It was shown that the leading medical specialties utilizing AIMD include respiratory (20.5%), ophthalmology/endocrinology (12.8%), and orthopedics (10.3%). This approach greatly improves the speed of data extracting providing a greater ability to compare and contrast. This study provides the first extensive, data-driven exploration of AIMD in China, showcasing the potential of automated regulatory data analysis in understanding and advancing the landscape of AI in medical technology.

Via

Access Paper or Ask Questions

Foundation Models in Electrocardiogram: A Review

Oct 24, 2024

Yu Han, Cheng Ding

Abstract:The electrocardiogram (ECG) is ubiquitous across various healthcare domains, such as cardiac arrhythmia detection and sleep monitoring, making ECG analysis critically essential. Traditional deep learning models for ECG are task-specific, with a narrow scope of functionality and limited generalization capabilities. Recently, foundation models (FMs), also known as large pre-training models, have fundamentally reshaped the scheme of model design and representation learning, enhancing the performance across a variety of downstream tasks. This success has drawn interest in the exploration of FMs to address ECG-based medical challenges concurrently. This survey provides a timely, comprehensive and up-to-date overview of FMs for large-scale ECG-FMs. First, we offer a brief background introduction to FMs. Then, we discuss the model architectures, pre-training methods, and adaptation approaches of ECG-FMs from a methodology perspective. Despite the promising opportunities of ECG-FMs, we also outline the challenges and potential future directions. Overall, this survey aims to provide researchers and practitioners with insights into the research of ECG-FMs on theoretical underpinnings, domain-specific applications, and avenues for future exploration.

Via

Access Paper or Ask Questions

aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Completion

Oct 17, 2024

Siyuan Jiang, Jia Li, He Zong, Huanyu Liu, Hao Zhu, Shukai Hu, Erlu Li, Jiazheng Ding, Yu Han, Wei Ning(+1 more)

Abstract:Large Language Models (LLMs) have been widely used in code completion, and researchers are focusing on scaling up LLMs to improve their accuracy. However, larger LLMs will increase the response time of code completion and decrease the developers' productivity. In this paper, we propose a lightweight and effective LLM for code completion named aiXcoder-7B. Compared to existing LLMs, aiXcoder-7B achieves higher code completion accuracy while having smaller scales (i.e., 7 billion parameters). We attribute the superiority of aiXcoder-7B to three key factors: (1) Multi-objective training. We employ three training objectives, one of which is our proposed Structured Fill-In-the-Middle (SFIM). SFIM considers the syntax structures in code and effectively improves the performance of LLMs for code. (2) Diverse data sampling strategies. They consider inter-file relationships and enhance the capability of LLMs in understanding cross-file contexts. (3) Extensive high-quality data. We establish a rigorous data collection pipeline and consume a total of 1.2 trillion unique tokens for training aiXcoder-7B. This vast volume of data enables aiXcoder-7B to learn a broad distribution of code. We evaluate aiXcoder-7B in five popular code completion benchmarks and a new benchmark collected by this paper. The results show that aiXcoder-7B outperforms the latest six LLMs with similar sizes and even surpasses four larger LLMs (e.g., StarCoder2-15B and CodeLlama-34B), positioning aiXcoder-7B as a lightweight and effective LLM for academia and industry. Finally, we summarize three valuable insights for helping practitioners train the next generations of LLMs for code. aiXcoder-7B has been open-souced and gained significant attention. As of the submission date, aiXcoder-7B has received 2,193 GitHub Stars.

* aiXcoder-7B is available at https://github.com/aixcoder-plugin/aiXcoder-7B/tree/main

Via

Access Paper or Ask Questions

Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning

Sep 12, 2024

Teng Yan, Zhendong Ruan, Yaobang Cai, Yu Han, Wenxian Li, Yang Zhang

Figure 1 for Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning

Figure 2 for Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning

Figure 3 for Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning

Figure 4 for Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning

Abstract:As a data-driven paradigm, offline reinforcement learning (Offline RL) has been formulated as sequence modeling, where the Decision Transformer (DT) has demonstrated exceptional capabilities. Unlike previous reinforcement learning methods that fit value functions or compute policy gradients, DT adjusts the autoregressive model based on the expected returns, past states, and actions, using a causally masked Transformer to output the optimal action. However, due to the inconsistency between the sampled returns within a single trajectory and the optimal returns across multiple trajectories, it is challenging to set an expected return to output the optimal action and stitch together suboptimal trajectories. Decision ConvFormer (DC) is easier to understand in the context of modeling RL trajectories within a Markov Decision Process compared to DT. We propose the Q-value Regularized Decision ConvFormer (QDC), which combines the understanding of RL trajectories by DC and incorporates a term that maximizes action values using dynamic programming methods during training. This ensures that the expected returns of the sampled actions are consistent with the optimal returns. QDC achieves excellent performance on the D4RL benchmark, outperforming or approaching the optimal level in all tested environments. It particularly demonstrates outstanding competitiveness in trajectory stitching capability.

Via

Access Paper or Ask Questions

Mid-Band Extra Large-Scale MIMO System: Channel Modeling and Performance Analysis

Aug 20, 2024

Jiachen Tian, Yu Han, Xiao Li, Shi Jin, Chao-Kai Wen

Figure 1 for Mid-Band Extra Large-Scale MIMO System: Channel Modeling and Performance Analysis

Figure 2 for Mid-Band Extra Large-Scale MIMO System: Channel Modeling and Performance Analysis

Figure 3 for Mid-Band Extra Large-Scale MIMO System: Channel Modeling and Performance Analysis

Figure 4 for Mid-Band Extra Large-Scale MIMO System: Channel Modeling and Performance Analysis

Abstract:In pursuit of enhanced quality of service and higher transmission rates, communication within the mid-band spectrum, such as bands in the 6-15 GHz range, combined with extra large-scale multiple-input multiple-output (XL-MIMO), is considered a potential enabler for future communication systems. However, the characteristics introduced by mid-band XL-MIMO systems pose challenges for channel modeling and performance analysis. In this paper, we first analyze the potential characteristics of mid-band MIMO channels. Then, an analytical channel model incorporating novel channel characteristics is proposed, based on a review of classical analytical channel models. This model is convenient for theoretical analysis and compatible with other analytical channel models. Subsequently, based on the proposed channel model, we analyze key metrics of wireless communication, including the ergodic spectral efficiency (SE) and outage probability (OP) of MIMO maximal-ratio combining systems. Specifically, we derive closed-form approximations and performance bounds for two typical scenarios, aiming to illustrate the influence of mid-band XL-MIMO systems. Finally, comparisons between systems under different practical configurations are carried out through simulations. The theoretical analysis and simulations demonstrate that mid-band XL-MIMO systems excel in SE and OP due to the increased array elements, moderate large-scale fading, and enlarged transmission bandwidth.

* 16 pages, 10 figures

Via

Access Paper or Ask Questions