Abstract: This work investigates the impact of data augmentation on confidence calibration and uncertainty estimation in Named Entity Recognition (NER) tasks. To advance NER in safety-critical fields such as healthcare and finance, it is essential to achieve accurate predictions with calibrated confidence when deploying Deep Neural Networks (DNNs), including Pre-trained Language Models (PLMs), in real-world applications. However, DNNs are prone to miscalibration, which limits their applicability. Moreover, existing methods for calibration and uncertainty estimation are computationally expensive. Our investigation in NER found that data augmentation improves calibration and uncertainty in cross-genre and cross-lingual settings, especially in the in-domain setting. Furthermore, we showed that calibration for NER tends to be more effective when the perplexity of the sentences generated by data augmentation is lower, and that increasing the amount of augmented data further improves calibration and uncertainty.
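The abstract links calibration gains to the perplexity of augmented sentences. Below is a minimal sketch, not the authors' code, of how one might score and filter augmentations by language-model perplexity before adding them to the training set; the GPT-2 scorer and the `max_ppl` threshold are assumptions for illustration.

```python
# Sketch: keep only augmented sentences with low LM perplexity, on the
# (paper-motivated) assumption that more fluent augmentations help
# calibration more. The threshold value is hypothetical.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence: str) -> float:
    """Perplexity of a sentence under GPT-2 (lower = more fluent)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token negative log-likelihood
    return torch.exp(loss).item()

def filter_augmentations(sentences, max_ppl=100.0):
    """Keep augmented sentences below a perplexity threshold (hypothetical)."""
    return [s for s in sentences if perplexity(s) <= max_ppl]
```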
Abstract: Trustworthy prediction in Deep Neural Networks (DNNs), including Pre-trained Language Models (PLMs), is important for safety-critical applications in the real world. However, DNNs often suffer from poor uncertainty estimation, such as miscalibration. Approaches that require multiple stochastic inferences can mitigate this problem, but their expensive inference cost makes them impractical. In this study, we propose $k$-Nearest Neighbor Uncertainty Estimation ($k$NN-UE), an uncertainty estimation method that uses the distances from the neighbors and the label-existence ratio of the neighbors. Experiments on sentiment analysis, natural language inference, and named entity recognition show that our proposed method outperforms the baselines or recent density-based methods in confidence calibration, selective prediction, and out-of-distribution detection. Moreover, our analyses indicate that introducing dimension reduction or approximate nearest neighbor search, inspired by recent $k$NN-LM studies, reduces the inference overhead without significantly degrading estimation performance when they are combined appropriately.
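To make the two signals concrete, here is a minimal sketch of combining neighbor distances with a label-existence ratio into a confidence weight. The exact weighting and the exponential-of-distance form are assumptions, not the paper's formulation.

```python
# Sketch: for each test example, compute (a) the mean distance to its k
# nearest training neighbors in feature space and (b) the fraction of
# those neighbors whose gold label matches the model's prediction
# ("label-existence ratio"), then combine them heuristically.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_ue_scores(train_feats, train_labels, test_feats, test_preds,
                  k=10, temperature=1.0):
    nn = NearestNeighbors(n_neighbors=k).fit(train_feats)
    dists, idx = nn.kneighbors(test_feats)
    mean_dist = dists.mean(axis=1)                                   # (N_test,)
    label_ratio = (train_labels[idx] == test_preds[:, None]).mean(axis=1)
    # Close neighbors and a high ratio of matching labels both raise
    # confidence; this particular combination is an assumption.
    return label_ratio * np.exp(-mean_dist / temperature)
```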
Abstract: Safety is an indispensable requirement for applying reinforcement learning (RL) to real problems. Although there has been a surge of safe RL algorithms proposed in recent years, most existing work typically 1) relies on receiving numeric safety feedback; 2) does not guarantee safety during the learning process; 3) limits the problem to a priori known, deterministic transition dynamics; and/or 4) assumes the existence of a known safe policy for any state. To address these issues, we propose Long-term Binary-feedback Safe RL (LoBiSaRL), a safe RL algorithm for constrained Markov decision processes (CMDPs) with binary safety feedback and an unknown, stochastic state transition function. LoBiSaRL optimizes a policy to maximize rewards while guaranteeing long-term safety: with high probability, the agent executes only safe state-action pairs throughout each episode. Specifically, LoBiSaRL models the binary safety function via a generalized linear model (GLM) and conservatively takes only safe actions at every time step while inferring their effect on future safety under proper assumptions. Our theoretical results show that LoBiSaRL guarantees the long-term safety constraint with high probability. Finally, our empirical results demonstrate that our algorithm is safer than existing methods without significantly compromising performance in terms of reward.
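The following is a minimal sketch, not LoBiSaRL itself, of the core mechanism: fit a logistic GLM to binary safety feedback and allow only actions whose pessimistic safety estimate clears a high-probability threshold. The fixed `margin` stands in for a proper confidence bound on the GLM parameters, which the actual algorithm derives.

```python
# Sketch: a conservative action filter backed by a logistic GLM over
# state-action features. All interfaces and the margin are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

class ConservativeSafetyFilter:
    def __init__(self, delta=0.05, margin=0.1):
        self.glm = LogisticRegression()
        self.delta = delta    # tolerated unsafety probability
        self.margin = margin  # placeholder for a parameter confidence bound

    def fit(self, features, safety_labels):
        """features: (N, d) state-action features; safety_labels: 0/1."""
        self.glm.fit(features, safety_labels)

    def safe_actions(self, candidate_features):
        """Indices of candidate actions deemed safe with high probability."""
        p_safe = self.glm.predict_proba(candidate_features)[:, 1]
        return np.where(p_safe - self.margin >= 1.0 - self.delta)[0]
```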
Abstract: Safe exploration is essential for the practical use of reinforcement learning (RL) in many real-world scenarios. In this paper, we present a generalized safe exploration (GSE) problem as a unified formulation of common safe exploration problems. We then propose a solution to the GSE problem in the form of a meta-algorithm for safe exploration, MASE, which combines an unconstrained RL algorithm with an uncertainty quantifier to guarantee safety in the current episode while properly penalizing unsafe explorations before actual safety violations to discourage them in future episodes. The advantage of MASE is that we can optimize a policy while guaranteeing, with high probability, that no safety constraint will be violated under proper assumptions. Specifically, we present two variants of MASE with different constructions of the uncertainty quantifier: one based on generalized linear models with theoretical guarantees of safety and near-optimality, and another that combines a Gaussian process to ensure safety with a deep RL algorithm to maximize the reward. Finally, we demonstrate that our proposed algorithm achieves better performance than state-of-the-art algorithms on grid-world and Safety Gym benchmarks without violating any safety constraints, even during training.
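A minimal sketch of the meta-algorithm pattern described above: an unconstrained learner proposes actions, an uncertainty quantifier vetoes potentially unsafe ones, and near-violations are penalized so the learner avoids them in later episodes. Every interface here (`agent`, `quantifier`, `env`) and the penalty value are assumptions for illustration, not MASE's actual API.

```python
# Sketch: one environment step under a MASE-style safety wrapper.
# Hypothetical interfaces: agent.act/observe, quantifier.is_safe_with_high_prob
# and quantifier.fallback_action, and a classic gym-style env.step.
def mase_step(agent, quantifier, env, state, penalty=-100.0):
    action = agent.act(state)
    if not quantifier.is_safe_with_high_prob(state, action):
        # Veto the risky action, end the episode safely, and penalize the
        # learner so it discourages this exploration in future episodes.
        action = quantifier.fallback_action(state)  # e.g., an emergency stop
        agent.observe(state, action, penalty, state, done=True)
        return state, True
    next_state, reward, done, info = env.step(action)
    agent.observe(state, action, reward, next_state, done)
    return next_state, done
```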
Abstract: In this paper, we propose a control synthesis method for signal temporal logic (STL) specifications using neural networks (NNs). Most previous works consider training a controller for only a single given STL specification. These approaches, however, require retraining the NN controller whenever a new specification arises and needs to be satisfied, which results in large memory consumption and inefficient training. To tackle this problem, we propose constructing NN controllers with encoder-decoder structured NNs and an attention mechanism. The encoder takes an STL formula as input and encodes it into an appropriate vector, and the decoder outputs control signals that will satisfy the given specification. For the encoder, we consider three NN structures: sequential, tree-structured, and graph-structured NNs. All the model parameters are trained in an end-to-end manner to maximize the expected robustness, which is known as a quantitative semantics of STL formulae. We compare the control performances attained by the above NN structures through a numerical experiment on the path planning problem, showing the efficacy of the proposed approach.
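To illustrate the architecture, here is a minimal PyTorch sketch of the sequential encoder variant: an RNN encodes the tokenized STL formula, and a decoder with attention over the encoder states emits a control-signal sequence. All dimensions, the tokenization, and the fixed horizon are assumptions; training would minimize the negative expected robustness of the resulting trajectories, which this sketch omits.

```python
# Sketch: encoder-decoder NN controller conditioned on an STL formula.
# Hypothetical sizes; the formula is assumed pre-tokenized to integer ids.
import torch
import torch.nn as nn

class STLController(nn.Module):
    def __init__(self, vocab_size, emb=32, hid=64, ctrl_dim=2, horizon=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.attn = nn.MultiheadAttention(hid, num_heads=1, batch_first=True)
        self.decoder = nn.GRU(hid, hid, batch_first=True)
        self.out = nn.Linear(hid, ctrl_dim)
        self.horizon = horizon

    def forward(self, formula_tokens):
        # Encode the STL formula into per-token states and a summary vector.
        enc, h = self.encoder(self.embed(formula_tokens))      # (B, T, hid)
        # Attend over encoder states once per control time step.
        query = h.transpose(0, 1).repeat(1, self.horizon, 1)   # (B, H, hid)
        ctx, _ = self.attn(query, enc, enc)
        dec, _ = self.decoder(ctx, h)
        return self.out(dec)  # (B, horizon, ctrl_dim) control signals
```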