Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hao-Yuan Chen

Token Constraint Decoding Improves Robustness on Question Answering for Large Language Models

Jun 11, 2025

Jui-Ming Yao, Hao-Yuan Chen, Zi-Xian Tang, Bing-Jia Tan, Sheng-Wei Peng, Bing-Cheng Xie, Shun-Feng Su

Abstract:Large Language Models (LLMs) have demonstrated impressive performance on multiple-choice question answering (MCQA) benchmarks, yet they remain highly vulnerable to minor input perturbations. In this paper, we introduce and evaluate Token Constraint Decoding (TCD). This simple yet effective inference-time algorithm enforces alignment between token-level predictions to enhance robustness in noisy settings. Through extensive experiments on CommonsenseQA, MMLU, and MMLU-Pro, we show that TCD, especially when paired with prompt engineering (PE) fixes, significantly restores performance degraded by input noise, yielding up to +39\% absolute gains for weaker models like Gemma3 1B. Penalty sweep analyses further reveal that TCD implicitly regularizes overconfident outputs, with different models requiring distinct penalty schedules to maximize resilience. Our findings establish TCD as a practical, model-agnostic approach for improving reasoning stability under real-world imperfections and pave the way for more reliable deployment of LLMs in safety-critical or user-facing applications.

Via

Access Paper or Ask Questions

Delta - Contrastive Decoding Mitigates Text Hallucinations in Large Language Models

Feb 09, 2025

Cheng Peng Huang, Hao-Yuan Chen

Abstract:Large language models (LLMs) demonstrate strong capabilities in natural language processing but remain prone to hallucinations, generating factually incorrect or fabricated content. This issue undermines their reliability, particularly in high-stakes domains such as healthcare and legal advisory. To address this challenge, we propose Delta, an inference-time method that reduces hallucinations without requiring model retraining or additional data. Delta works by randomly masking parts of the input prompt and contrasting the output distributions for the original and masked inputs, effectively suppressing hallucinations through inference-only computations. We evaluate Delta on context-rich question-answering benchmarks, achieving absolute improvements of approximately 3 and 6 percentage points on SQuAD v1.1 and v2, respectively, and 7 and 2 percentage points on TriviaQA and Natural Questions under-sampling decoding. Delta also improves the no-answer exact match score on SQuAD v2 by over ten percentage points, demonstrating its effectiveness in mitigating hallucinations arising from contextual ambiguity. These results highlight Delta as a computationally efficient and scalable approach for improving the reliability of LLMs in real-world applications.

Via

Access Paper or Ask Questions

Quantum Embedding with Transformer for High-dimensional Data

Feb 20, 2024

Hao-Yuan Chen, Yen-Jui Chang, Shih-Wei Liao, Ching-Ray Chang

Abstract:Quantum embedding with transformers is a novel and promising architecture for quantum machine learning to deliver exceptional capability on near-term devices or simulators. The research incorporated a vision transformer (ViT) to advance quantum significantly embedding ability and results for a single qubit classifier with around 3 percent in the median F1 score on the BirdCLEF-2021, a challenging high-dimensional dataset. The study showcases and analyzes empirical evidence that our transformer-based architecture is a highly versatile and practical approach to modern quantum machine learning problems.

Via

Access Paper or Ask Questions

Hybrid Quantum Neural Network in High-dimensional Data Classification

Dec 02, 2023

Hao-Yuan Chen, Yen-Jui Chang, Shih-Wei Liao, Ching-Ray Chang

Abstract:The research explores the potential of quantum deep learning models to address challenging machine learning problems that classical deep learning models find difficult to tackle. We introduce a novel model architecture that combines classical convolutional layers with a quantum neural network, aiming to surpass state-of-the-art accuracy while maintaining a compact model size. The experiment is to classify high-dimensional audio data from the Bird-CLEF 2021 dataset. Our evaluation focuses on key metrics, including training duration, model accuracy, and total model size. This research demonstrates the promising potential of quantum machine learning in enhancing machine learning tasks and solving practical machine learning challenges available today.

Via

Access Paper or Ask Questions

Deep Reinforcement Learning Using Hybrid Quantum Neural Network

Apr 20, 2023

Hao-Yuan Chen

Abstract:Quantum computation has a strong implication for advancing the current limitation of machine learning algorithms to deal with higher data dimensions or reducing the overall training parameters for a deep neural network model. Based on a gate-based quantum computer, a parameterized quantum circuit was designed to solve a model-free reinforcement learning problem with the deep-Q learning method. This research has investigated and evaluated its potential. Therefore, a novel PQC based on the latest Qiskit and PyTorch framework was designed and trained to compare with a full-classical deep neural network with and without integrated PQC. At the end of the research, the research draws its conclusion and prospects on developing deep quantum learning in solving a maze problem or other reinforcement learning problems.

Via

Access Paper or Ask Questions