Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Lun Wang

University of California, Berkeley

Comparative Analysis of Large Language Models for Context-Aware Code Completion using SAFIM Framework

Feb 21, 2025

Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao, Yixian Shen

Abstract:The advent of Large Language Models (LLMs) has revolutionized code completion, transforming it into a more intelligent and context-aware feature in modern integrated development environments. These advancements have significantly enhanced developers' ability to write efficient and error-free code. This study evaluates the performance of several chat-based LLMs, including Gemini 1.5 Flash, Gemini 1.5 Pro, GPT-4o, GPT-4o-mini, and GPT-4 Turbo, using the Syntax-Aware Fill-in-the-Middle (SAFIM) dataset. This benchmark is specifically designed to assess models' capabilities in syntax-sensitive code generation. Performance metrics, such as cosine similarity with ground-truth completions and latency, were employed to measure both accuracy and efficiency. The findings reveal substantial differences in the models' code completion abilities, offering valuable insights into their respective strengths and weaknesses. This work provides a comparative analysis that underscores the trade-offs between accuracy and speed, establishing a benchmark for future advancements in LLM-based code completion.

* 9 pages

Via

Access Paper or Ask Questions

Deep Semantic Graph Learning via LLM based Node Enhancement

Feb 11, 2025

Chuanqi Shi, Yiyi Tao, Hang Zhang, Lun Wang, Shaoshuai Du, Yixian Shen, Yanxin Shen

Abstract:Graph learning has attracted significant attention due to its widespread real-world applications. Current mainstream approaches rely on text node features and obtain initial node embeddings through shallow embedding learning using GNNs, which shows limitations in capturing deep textual semantics. Recent advances in Large Language Models (LLMs) have demonstrated superior capabilities in understanding text semantics, transforming traditional text feature processing. This paper proposes a novel framework that combines Graph Transformer architecture with LLM-enhanced node features. Specifically, we leverage LLMs to generate rich semantic representations of text nodes, which are then processed by a multi-head self-attention mechanism in the Graph Transformer to capture both local and global graph structural information. Our model utilizes the Transformer's attention mechanism to dynamically aggregate neighborhood information while preserving the semantic richness provided by LLM embeddings. Experimental results demonstrate that the LLM-enhanced node features significantly improve the performance of graph learning models on node classification tasks. This approach shows promising results across multiple graph learning tasks, offering a practical direction for combining graph networks with language models.

Via

Access Paper or Ask Questions

AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility

Dec 30, 2024

Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao

Figure 1 for AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility

Figure 2 for AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility

Figure 3 for AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility

Figure 4 for AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility

Abstract:Digital accessibility is a cornerstone of inclusive content delivery, yet many EPUB files fail to meet fundamental accessibility standards, particularly in providing descriptive alt text for images. Alt text plays a critical role in enabling visually impaired users to understand visual content through assistive technologies. However, generating high-quality alt text at scale is a resource-intensive process, creating significant challenges for organizations aiming to ensure accessibility compliance. This paper introduces AltGen, a novel AI-driven pipeline designed to automate the generation of alt text for images in EPUB files. By integrating state-of-the-art generative models, including advanced transformer-based architectures, AltGen achieves contextually relevant and linguistically coherent alt text descriptions. The pipeline encompasses multiple stages, starting with data preprocessing to extract and prepare relevant content, followed by visual analysis using computer vision models such as CLIP and ViT. The extracted visual features are enriched with contextual information from surrounding text, enabling the fine-tuned language models to generate descriptive and accurate alt text. Validation of the generated output employs both quantitative metrics, such as cosine similarity and BLEU scores, and qualitative feedback from visually impaired users. Experimental results demonstrate the efficacy of AltGen across diverse datasets, achieving a 97.5% reduction in accessibility errors and high scores in similarity and linguistic fidelity metrics. User studies highlight the practical impact of AltGen, with participants reporting significant improvements in document usability and comprehension. Furthermore, comparative analyses reveal that AltGen outperforms existing approaches in terms of accuracy, relevance, and scalability.

Via

Access Paper or Ask Questions

Comparative Analysis of Listwise Reranking with Large Language Models in Limited-Resource Language Contexts

Dec 28, 2024

Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao, Yixian Shen, Hang Zhang

Abstract:Large Language Models (LLMs) have demonstrated significant effectiveness across various NLP tasks, including text ranking. This study assesses the performance of large language models (LLMs) in listwise reranking for limited-resource African languages. We compare proprietary models RankGPT3.5, Rank4o-mini, RankGPTo1-mini and RankClaude-sonnet in cross-lingual contexts. Results indicate that these LLMs significantly outperform traditional baseline methods such as BM25-DT in most evaluation metrics, particularly in nDCG@10 and MRR@100. These findings highlight the potential of LLMs in enhancing reranking tasks for low-resource languages and offer insights into cost-effective solutions.

Via

Access Paper or Ask Questions

Robustness of Large Language Models Against Adversarial Attacks

Dec 22, 2024

Yiyi Tao, Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du

Figure 1 for Robustness of Large Language Models Against Adversarial Attacks

Figure 2 for Robustness of Large Language Models Against Adversarial Attacks

Figure 3 for Robustness of Large Language Models Against Adversarial Attacks

Figure 4 for Robustness of Large Language Models Against Adversarial Attacks

Abstract:The increasing deployment of Large Language Models (LLMs) in various applications necessitates a rigorous evaluation of their robustness against adversarial attacks. In this paper, we present a comprehensive study on the robustness of GPT LLM family. We employ two distinct evaluation methods to assess their resilience. The first method introduce character-level text attack in input prompts, testing the models on three sentiment classification datasets: StanfordNLP/IMDB, Yelp Reviews, and SST-2. The second method involves using jailbreak prompts to challenge the safety mechanisms of the LLMs. Our experiments reveal significant variations in the robustness of these models, demonstrating their varying degrees of vulnerability to both character-level and semantic-level adversarial attacks. These findings underscore the necessity for improved adversarial training and enhanced safety mechanisms to bolster the robustness of LLMs.

Via

Access Paper or Ask Questions

RareAgents: Autonomous Multi-disciplinary Team for Rare Disease Diagnosis and Treatment

Dec 17, 2024

Xuanzhong Chen, Ye Jin, Xiaohao Mao, Lun Wang, Shuyang Zhang, Ting Chen

Abstract:Rare diseases, despite their low individual incidence, collectively impact around 300 million people worldwide due to the huge number of diseases. The complexity of symptoms and the shortage of specialized doctors with relevant experience make diagnosing and treating rare diseases more challenging than common diseases. Recently, agents powered by large language models (LLMs) have demonstrated notable improvements across various domains. In the medical field, some agent methods have outperformed direct prompts in question-answering tasks from medical exams. However, current agent frameworks lack adaptation for real-world clinical scenarios, especially those involving the intricate demands of rare diseases. To address these challenges, we present RareAgents, the first multi-disciplinary team of LLM-based agents tailored to the complex clinical context of rare diseases. RareAgents integrates advanced planning capabilities, memory mechanisms, and medical tools utilization, leveraging Llama-3.1-8B/70B as the base model. Experimental results show that RareAgents surpasses state-of-the-art domain-specific models, GPT-4o, and existing agent frameworks in both differential diagnosis and medication recommendation for rare diseases. Furthermore, we contribute a novel dataset, MIMIC-IV-Ext-Rare, derived from MIMIC-IV, to support further advancements in this field.

Via

Access Paper or Ask Questions

Can DeepFake Speech be Reliably Detected?

Oct 09, 2024

Hongbin Liu, Youzheng Chen, Arun Narayanan, Athula Balachandran, Pedro J. Moreno, Lun Wang

Figure 1 for Can DeepFake Speech be Reliably Detected?

Figure 2 for Can DeepFake Speech be Reliably Detected?

Figure 3 for Can DeepFake Speech be Reliably Detected?

Figure 4 for Can DeepFake Speech be Reliably Detected?

Abstract:Recent advances in text-to-speech (TTS) systems, particularly those with voice cloning capabilities, have made voice impersonation readily accessible, raising ethical and legal concerns due to potential misuse for malicious activities like misinformation campaigns and fraud. While synthetic speech detectors (SSDs) exist to combat this, they are vulnerable to ``test domain shift", exhibiting decreased performance when audio is altered through transcoding, playback, or background noise. This vulnerability is further exacerbated by deliberate manipulation of synthetic speech aimed at deceiving detectors. This work presents the first systematic study of such active malicious attacks against state-of-the-art open-source SSDs. White-box attacks, black-box attacks, and their transferability are studied from both attack effectiveness and stealthiness, using both hardcoded metrics and human ratings. The results highlight the urgent need for more robust detection methods in the face of evolving adversarial threats.

Via

Access Paper or Ask Questions

NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training

Sep 15, 2024

Yiyi Tao, Zhuoyue Wang, Hang Zhang, Lun Wang

Figure 1 for NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training

Figure 2 for NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training

Figure 3 for NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training

Figure 4 for NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training

Abstract:The success of Vision Language Models (VLMs) on various vision-language tasks heavily relies on pre-training with large scale web-crawled datasets. However, the noisy and incomplete nature of web data makes dataset scale crucial for performance, rendering end-to-end training increasingly prohibitive. In this paper, we propose NEVLP, a noise-robust framework for efficient vision-language pre-training that requires less pre-training data. Specifically, we bridge the modality gap between a frozen image encoder and a large language model with a transformer and introduce two innovative learning strategies: noise-adaptive learning and concept-enhanced learning to mitigate the impact of noise. In noise-adaptive learning, we estimate the noise probability of each image-text pair based on the transformer's memorization effect and employ noise-adaptive regularization on image-text contrastive learning to condition cross-modal alignment. In concept-enhanced learning, we enrich incomplete text by incorporating visual concepts (objects in the image) to provide prior information about existing objects for image-text matching and image-grounded text generation, thereby mitigating text incompletion. Our framework effectively utilizes noisy web data and achieves state-of-the-art performance with less pre-training data across a wide range of vision-language tasks, including image-text retrieval, image captioning, and visual question answering.

Via

Access Paper or Ask Questions

Revisit Micro-batch Clipping: Adaptive Data Pruning via Gradient Manipulation

Aug 29, 2024

Lun Wang

Abstract:Micro-batch clipping, a gradient clipping method, has recently shown potential in enhancing auto-speech recognition (ASR) model performance. However, the underlying mechanism behind this improvement remains mysterious, particularly the observation that only certain micro-batch sizes are beneficial. In this paper, we make the first attempt to explain this phenomenon. Inspired by recent data pruning research, we assume that specific training samples may impede model convergence during certain training phases. Under this assumption, the convergence analysis shows that micro-batch clipping can improve the convergence rate asymptotically at the cost of an additional constant bias that does not diminish with more training iterations. The bias is dependent on a few factors and can be minimized at specific micro-batch size, thereby elucidating the existence of the sweet-spot micro-batch size observed previously. We also verify the effectiveness of micro-batch clipping beyond speech models on vision and language models, and show promising performance gains in these domains. An exploration of potential limitations shows that micro-batch clipping is less effective when training data originates from multiple distinct domains.

Via

Access Paper or Ask Questions

CherryRec: Enhancing News Recommendation Quality via LLM-driven Framework

Jun 18, 2024

Shaohuang Wang, Lun Wang, Yunhan Bu, Tianwei Huang

Figure 1 for CherryRec: Enhancing News Recommendation Quality via LLM-driven Framework

Figure 2 for CherryRec: Enhancing News Recommendation Quality via LLM-driven Framework

Figure 3 for CherryRec: Enhancing News Recommendation Quality via LLM-driven Framework

Figure 4 for CherryRec: Enhancing News Recommendation Quality via LLM-driven Framework

Abstract:Large Language Models (LLMs) have achieved remarkable progress in language understanding and generation. Custom LLMs leveraging textual features have been applied to recommendation systems, demonstrating improvements across various recommendation scenarios. However, most existing methods perform untrained recommendation based on pre-trained knowledge (e.g., movie recommendation), and the auto-regressive generation of LLMs leads to slow inference speeds, making them less effective in real-time recommendations.To address this, we propose a framework for news recommendation using LLMs, named \textit{CherryRec}, which ensures the quality of recommendations while accelerating the recommendation process. Specifically, we employ a Knowledge-aware News Rapid Selector to retrieve candidate options based on the user's interaction history. The history and retrieved items are then input as text into a fine-tuned LLM, the Content-aware News Llm Evaluator, designed to enhance news recommendation capabilities. Finally, the Value-aware News Scorer integrates the scores to compute the CherryRec Score, which serves as the basis for the final recommendation.We validate the effectiveness of the proposed framework by comparing it with state-of-the-art baseline methods on benchmark datasets. Our experimental results consistently show that CherryRec outperforms the baselines in both recommendation performance and efficiency.The project resource can be accessed at: \url{https://github.com/xxxxxx}

Via

Access Paper or Ask Questions