Sean Wang

COPU: Conformal Prediction for Uncertainty Quantification in Natural Language Generation

Feb 18, 2025

Sean Wang, Yicheng Jiang, Yuxin Tang, Lu Cheng, Hanjie Chen

Abstract: Uncertainty Quantification (UQ) for Natural Language Generation (NLG) is crucial for assessing the performance of Large Language Models (LLMs), as it reveals confidence in predictions, identifies failure modes, and gauges output reliability. Conformal Prediction (CP), a model-agnostic method that generates prediction sets with a specified error rate, has been adopted for UQ in classification tasks, where the size of the prediction set indicates the model's uncertainty. However, when adapting CP to NLG, the sampling-based method for generating candidate outputs cannot guarantee the inclusion of the ground truth, limiting its applicability across a wide range of error rates. To address this, we propose COPU, a method that explicitly adds the ground truth to the candidate outputs and uses logit scores to measure nonconformity. Our experiments with six LLMs on four NLG tasks show that COPU outperforms baseline methods in calibrating error rates and empirical cover rates, offering accurate UQ across a wide range of user-specified error rates.
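
To make the idea of a prediction set with a specified error rate concrete, here is a minimal sketch of generic split conformal prediction. It is not the COPU procedure itself: the function names, the calibration scores, and the candidate answers are all hypothetical, and the nonconformity scores are assumed to be already computed (for example, as one minus the model's probability for an answer).

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Return the split-conformal quantile of calibration nonconformity scores."""
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this error rate: keep everything.
        return math.inf
    return sorted(cal_scores)[rank - 1]

def prediction_set(candidate_scores, threshold):
    """Keep every candidate whose nonconformity score is within the threshold."""
    return [c for c, s in candidate_scores.items() if s <= threshold]

# Hypothetical nonconformity scores of the true answers on a calibration set.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical candidate answers for one test question, with their scores.
candidates = {"Paris": 0.10, "Lyon": 0.45, "Rome": 0.95}
result = prediction_set(candidates, threshold)
```

A larger prediction set signals higher uncertainty. In this sketch, a true answer that never appears among the candidates can never enter the set, which is the coverage gap the abstract describes for sampling-based candidate generation.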

Efficient first-order predictor-corrector multiple objective optimization for fair misinformation detection

Sep 15, 2022

Eric Enouen, Katja Mathesius, Sean Wang, Arielle Carr, Sihong Xie

Abstract: Multiple-objective optimization (MOO) aims to simultaneously optimize multiple conflicting objectives and has found important applications in machine learning, such as minimizing classification loss and discrepancy in treating different populations for fairness. At optimality, further optimizing one objective will necessarily harm at least one other objective, and decision-makers need to comprehensively explore multiple optima (called the Pareto front) to pinpoint one final solution. We address the efficiency of finding the Pareto front. First, finding the front from scratch using stochastic multi-gradient descent (SMGD) is expensive with large neural networks and datasets. We propose to explore the Pareto front as a manifold from a few initial optima, based on a predictor-corrector method. Second, for each exploration step, the predictor solves a large-scale linear system that scales quadratically in the number of model parameters and requires one backpropagation to evaluate a second-order Hessian-vector product per iteration of the solver. We propose a Gauss-Newton approximation that scales only linearly and requires only a first-order inner product per iteration. This also allows for a choice between the MINRES and conjugate gradient methods when approximately solving the linear system. The innovations make predictor-corrector possible for large networks. Experiments on multi-objective (fairness and accuracy) misinformation detection tasks show that 1) the predictor-corrector method can find Pareto fronts better than or similar to SMGD in less time; and 2) the proposed first-order method does not harm the quality of the Pareto front identified by the second-order method, while further reducing running time.
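
As a rough illustration of why a Gauss-Newton approximation keeps each solver iteration first-order, the sketch below runs conjugate gradient on a system of the form (JᵀJ + λI)x = b using only products with J and Jᵀ, never forming the matrix. This is not the paper's implementation: the small explicit matrix `J`, the damping term `damping`, and all function names are assumptions made for the example.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gauss_newton_matvec(J, v, damping):
    """Compute (J^T J + damping * I) v without forming J^T J."""
    Jv = [dot(row, v) for row in J]  # one inner product per row of J
    JtJv = [sum(J[i][k] * Jv[i] for i in range(len(J)))  # J^T (J v)
            for k in range(len(v))]
    return [g + damping * x for g, x in zip(JtJv, v)]

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with x = 0 initially
    p = list(r)  # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:  # squared residual norm is small enough
            break
        Ap = matvec(p)
        step = rs_old / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Hypothetical 2x3 Jacobian (two objectives, three parameters) and right-hand side.
J = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]
b = [1.0, 0.0, -1.0]
damping = 0.1  # assumed regularizer so the system is positive definite

x = conjugate_gradient(lambda v: gauss_newton_matvec(J, v, damping), b)
residual = [bi - gi for bi, gi in zip(b, gauss_newton_matvec(J, x, damping))]
```

Each matrix-vector product here costs time proportional to the size of `J`, which is linear in the number of parameters for a fixed number of objectives. Because the system matrix is symmetric, MINRES could be substituted for conjugate gradient as the solver.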

Environmental Sampling with the Boustrophedon Decomposition Algorithm

Jul 13, 2022

Hannah He, Joe Norby, Sean Wang, Natasha Sihota, Thomas P. Hoelen, Gregory V. Lowry, Aaron M. Johnson

Abstract: The automation of data collection via mobile robots holds promise for increasing the efficacy of environmental investigations, but requires the system to autonomously determine how to sample the environment while avoiding obstacles. Existing methods such as the boustrophedon decomposition algorithm enable complete coverage of the environment to a specified resolution, yet in many cases sampling at the resolution of the distribution would yield long paths with an infeasible number of measurements. Downsampling these paths can result in feasible plans at the expense of distribution estimation accuracy. This work explores this tradeoff between distribution accuracy and path length for the boustrophedon decomposition algorithm. We quantify algorithm performance by computing metrics for accuracy and path length in a Monte Carlo simulation across a distribution of environments. We highlight conditions where one objective should be prioritized over the other and propose a modification to the algorithm to improve its effectiveness by sampling more uniformly. These results demonstrate how intelligent deployment of the boustrophedon algorithm can effectively guide autonomous environmental sampling.
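
The tradeoff between coverage resolution and path length can be seen in a small sketch of the back-and-forth sweep inside a single obstacle-free rectangular cell. This is a simplification under stated assumptions: the full boustrophedon decomposition also splits the environment into cells around obstacles, which is omitted here, and the function names and the `stride` downsampling parameter are hypothetical.

```python
def boustrophedon_path(n_rows, n_cols, stride=1):
    """Back-and-forth sweep over a grid, visiting every `stride`-th row and column."""
    path = []
    for i, row in enumerate(range(0, n_rows, stride)):
        cols = list(range(0, n_cols, stride))
        if i % 2 == 1:
            cols.reverse()  # alternate sweep direction on every other pass
        path.extend((row, c) for c in cols)
    return path

def path_length(path):
    """Total Manhattan distance travelled along consecutive sample points."""
    return sum(abs(r1 - r0) + abs(c1 - c0)
               for (r0, c0), (r1, c1) in zip(path, path[1:]))

full = boustrophedon_path(4, 4)              # sample every grid cell
coarse = boustrophedon_path(4, 4, stride=2)  # downsampled plan

print(len(full), path_length(full))      # 16 samples, length 15
print(len(coarse), path_length(coarse))  # 4 samples, length 6
```

The downsampled plan is shorter and needs fewer measurements, but it leaves most grid cells unobserved, so any estimate of the underlying distribution has to interpolate across the gaps.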

Cumulative Assessment for Urban 3D Modeling

Jul 09, 2021

Shea Hagstrom, Hee Won Pak, Stephanie Ku, Sean Wang, Gregory Hager, Myron Brown

Abstract: Urban 3D modeling from satellite images requires accurate semantic segmentation to delineate urban features, multiple view stereo for 3D reconstruction of surface heights, and 3D model fitting to produce compact models with accurate surface slopes. In this work, we present a cumulative assessment metric that succinctly captures error contributions from each of these components. We demonstrate our approach by providing challenging public datasets and extending two open source projects to provide an end-to-end 3D modeling baseline solution to stimulate further research and evaluation with a public leaderboard.

* Published in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2021

Semantic Stereo for Incidental Satellite Images

Nov 21, 2018

Marc Bosch, Kevin Foster, Gordon Christie, Sean Wang, Gregory D Hager, Myron Brown

Abstract: The increasingly common use of incidental satellite images for stereo reconstruction versus rigidly tasked binocular or trinocular coincident collection is helping to enable timely global-scale 3D mapping; however, reliable stereo correspondence from multi-date image pairs remains very challenging due to seasonal appearance differences and scene change. Promising recent work suggests that semantic scene segmentation can provide a robust regularizing prior for resolving ambiguities in stereo correspondence and reconstruction problems. To enable research for pairwise semantic stereo and multi-view semantic 3D reconstruction with incidental satellite images, we have established a large-scale public dataset including multi-view, multi-band satellite images and ground truth geometric and semantic labels for two large cities. To demonstrate the complementary nature of the stereo and segmentation tasks, we present lightweight public baselines adapted from recent state-of-the-art convolutional neural network models and assess their performance.

* Accepted publication at WACV 2019
