Xin Qiu

Scalable Bayesian Monte Carlo: fast uncertainty estimation beyond deep ensembles

May 19, 2025

Xinzhu Liang, Joseph M. Lukens, Sanjaya Lohani, Brian T. Kirby, Thomas A. Searles, Xin Qiu, Kody J. H. Law

Abstract: This work introduces a new method called scalable Bayesian Monte Carlo (SBMC). The model interpolates between a point estimator and the posterior, and the algorithm is a parallel implementation of a consistent (asymptotically unbiased) Bayesian deep learning algorithm: sequential Monte Carlo (SMC) or Markov chain Monte Carlo (MCMC). The method is motivated theoretically, and its utility is demonstrated on practical examples: MNIST, CIFAR, IMDb. A systematic numerical study reveals that parallel implementations of SMC and MCMC are comparable to serial implementations in terms of performance and total cost, and that at convergence they achieve accuracy at or beyond that of state-of-the-art (SOTA) methods such as deep ensembles, along with substantially improved uncertainty quantification (UQ), in particular epistemic UQ. But even parallel implementations are expensive, with an irreducible time barrier much larger than the cost of the MAP estimator. Compressing time further leads to rapid degradation of accuracy, whereas UQ remains valuable. By anchoring to a point estimator we can recover accuracy while retaining valuable UQ, ultimately delivering strong performance across metrics for a cost comparable to the SOTA.

* 56 pages, 44 figures, 35 tables

Position: Scaling LLM Agents Requires Asymptotic Analysis with LLM Primitives

Feb 04, 2025

Elliot Meyerson, Xin Qiu

Abstract: Decomposing hard problems into subproblems often makes them easier and more efficient to solve. With large language models (LLMs) crossing critical reliability thresholds for a growing slate of capabilities, there is an increasing effort to decompose systems into sets of LLM-based agents, each of which can be delegated sub-tasks. However, this decomposition (even when automated) is often intuitive, e.g., based on how a human might assign roles to members of a human team. How close are these role decompositions to optimal? This position paper argues that asymptotic analysis with LLM primitives is needed to reason about the efficiency of such decomposed systems, and that insights from such analysis will unlock opportunities for scaling them. By treating the LLM forward pass as the atomic unit of computational cost, one can separate out the (often opaque) inner workings of a particular LLM from the inherent efficiency of how a set of LLMs is orchestrated to solve hard problems. In other words, if we want to scale the deployment of LLMs to the limit, instead of anthropomorphizing LLMs, asymptotic analysis with LLM primitives should be used to reason about and develop more powerful decompositions of large problems into LLM agents.

* 12 pages including references

Semantic Density: Uncertainty Quantification in Semantic Space for Large Language Models

May 22, 2024

Xin Qiu, Risto Miikkulainen

Abstract: With the widespread application of Large Language Models (LLMs) to various domains, concerns regarding the trustworthiness of LLMs in safety-critical scenarios have been raised, due to their unpredictable tendency to hallucinate and generate misinformation. Existing LLMs have no inherent functionality to provide users with an uncertainty metric for each response they generate, making it difficult to evaluate trustworthiness. Although a number of works aim to develop uncertainty quantification methods for LLMs, they have fundamental limitations, such as being restricted to classification tasks, requiring additional training and data, considering only lexical instead of semantic information, and being prompt-wise but not response-wise. A new framework is proposed in this paper to address these issues. Semantic density extracts uncertainty information for each response from a probability distribution perspective in semantic space. It has no restriction on task types and is "off-the-shelf" for new models and tasks. Experiments on seven state-of-the-art LLMs, including the latest Llama 3 and Mixtral-8x22B models, on four free-form question-answering benchmarks demonstrate the superior performance and robustness of semantic density compared to prior approaches.

* 16 pages, 2 figures
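
As a rough illustration of the idea of a response-wise density in semantic space, the sketch below scores a target response by a probability-weighted kernel density over sampled reference responses. Everything here is an assumption made for illustration: the 2-D tuples stand in for semantic embeddings, Euclidean distance stands in for a semantic distance, the kernel and bandwidth are arbitrary choices, and the function names are hypothetical. It is not the paper's implementation.

```python
import math

def kernel(distance, bandwidth=1.0):
    # Epanechnikov-style kernel: 1 at distance 0, falling to 0 at the bandwidth.
    u = distance / bandwidth
    return max(0.0, 1.0 - u * u)

def semantic_density(target, references, bandwidth=1.0):
    """Probability-weighted kernel density of `target` among `references`.

    `references` is a list of (embedding, probability) pairs, where the
    probability is the model's likelihood of that sampled response.
    """
    total = sum(prob for _, prob in references)
    weighted = sum(prob * kernel(math.dist(target, emb), bandwidth)
                   for emb, prob in references)
    return weighted / total

# Toy reference responses: three that agree semantically and one outlier.
references = [
    ((0.0, 0.0), 0.4),
    ((0.1, 0.0), 0.3),
    ((0.0, 0.2), 0.2),
    ((2.0, 2.0), 0.1),
]

consistent = semantic_density((0.05, 0.05), references)
outlier = semantic_density((2.0, 2.0), references)
print(f"consistent response: {consistent:.2f}, outlier response: {outlier:.2f}")
```

A response that sits in the dense region of the sampled responses receives a score close to 1, while an isolated response receives a score close to its own probability mass, which is the sense in which the score is response-wise rather than prompt-wise.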

Shortest Edit Path Crossover: A Theory-driven Solution to the Permutation Problem in Evolutionary Neural Architecture Search

Oct 27, 2022

Xin Qiu, Risto Miikkulainen

Abstract: Evolutionary algorithms (EAs) have gained attention recently due to their success in neural architecture search (NAS). However, whereas traditional EAs draw much power from crossover operations, most evolutionary NAS methods deploy only mutation operators. The main reason is the permutation problem: The mapping between genotype and phenotype in traditional graph representations is many-to-one, leading to a disruptive effect of standard crossover. This work conducts the first theoretical analysis of the behaviors of crossover and mutation in the NAS context, and proposes a new crossover operator based on the shortest edit path (SEP) in graph space. The SEP crossover is shown to overcome the permutation problem, and as a result, offspring generated by the SEP crossover are theoretically proven to have a better expected improvement in terms of graph edit distance to the global optimum, compared to mutation and standard crossover. Experiments further show that the SEP crossover significantly outperforms mutation and standard crossover on three state-of-the-art NAS benchmarks. The SEP crossover therefore allows taking full advantage of evolution in NAS, and potentially other similar design problems as well.

* 17 pages, 6 figures

BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster

Apr 03, 2022

Jason Dai, Ding Ding, Dongjie Shi, Shengsheng Huang, Jiao Wang, Xin Qiu, Kai Huang, Guoqiong Song, Yang Wang, Qiyuan Gong (+6 more)

Abstract: Most AI projects start with a Python notebook running on a single laptop; however, one usually needs to go through considerable pain to scale it to handle larger datasets (for both experimentation and production deployment). This usually entails many manual and error-prone steps for data scientists to take full advantage of the available hardware resources (e.g., SIMD instructions, multi-processing, quantization, memory allocation optimization, data partitioning, distributed computing, etc.). To address this challenge, we have open sourced BigDL 2.0 at https://github.com/intel-analytics/BigDL/ under the Apache 2.0 license (combining the original BigDL and Analytics Zoo projects); using BigDL 2.0, users can simply build conventional Python notebooks on their laptops (with possible AutoML support), which can then be transparently accelerated on a single node (with up to 9.6x speedup in our experiments), and seamlessly scaled out to a large cluster (across several hundred servers in real-world use cases). BigDL 2.0 has already been adopted by many real-world users (such as Mastercard, Burger King, Inspur, etc.) in production.

* Accepted by CVPR 2022 (Demo Track)

Simple Genetic Operators are Universal Approximators of Probability Distributions (and other Advantages of Expressive Encodings)

Feb 19, 2022

Elliot Meyerson, Xin Qiu, Risto Miikkulainen

Abstract: This paper characterizes the inherent power of evolutionary algorithms. This power depends on the computational properties of the genetic encoding. With some encodings, two parents recombined with a simple crossover operator can sample from an arbitrary distribution of child phenotypes. Such encodings are termed expressive encodings in this paper. Universal function approximators, including popular evolutionary substrates of genetic programming and neural networks, can be used to construct expressive encodings. Remarkably, this approach need not be applied only to domains where the phenotype is a function: Expressivity can be achieved even when optimizing static structures, such as binary vectors. Such simpler settings make it possible to characterize expressive encodings theoretically: Across a variety of test problems, expressive encodings are shown to achieve up to super-exponential convergence speed-ups over the standard direct encoding. The conclusion is that, across evolutionary computation areas as diverse as genetic programming, neuroevolution, genetic algorithms, and theory, expressive encodings can be a key to understanding and realizing the full power of evolution.

* 10 pages, with 4 page appendix

Detecting Misclassification Errors in Neural Networks with a Gaussian Process Model

Oct 05, 2020

Xin Qiu, Risto Miikkulainen

Abstract: As neural network classifiers are deployed in real-world applications, it is crucial that their predictions are not just accurate, but trustworthy as well. One practical solution is to assign confidence scores to each prediction, then filter out low-confidence predictions. However, existing confidence metrics are not yet sufficiently reliable for this role. This paper presents a new framework that produces more reliable confidence scores for detecting misclassification errors. This framework, RED, calibrates the classifier's inherent confidence indicators and estimates the uncertainty of the calibrated confidence scores using Gaussian Processes. Empirical comparisons with other confidence estimation methods on 125 UCI datasets demonstrate that this approach is effective. An experiment on a vision task with a large deep learning architecture further confirms that the method can scale up, and a case study involving out-of-distribution and adversarial samples shows the potential of the proposed method to improve the robustness of neural network classifiers more broadly in the future.

From Prediction to Prescription: AI-Based Optimization of Non-Pharmaceutical Interventions for the COVID-19 Pandemic

May 30, 2020

Risto Miikkulainen, Olivier Francon, Elliot Meyerson, Xin Qiu, Elisa Canzani, Babak Hodjat

Abstract: Several models have been developed to predict how the COVID-19 pandemic spreads, and how it could be contained with non-pharmaceutical interventions (NPIs) such as social distancing restrictions and school and business closures. This paper demonstrates how evolutionary AI could be used to facilitate the next step, i.e., determining the most effective intervention strategies automatically. Through evolutionary surrogate-assisted prescription (ESP), it is possible to generate a large number of candidate strategies and evaluate them with predictive models. In principle, strategies can be customized for different countries and locales, and can balance the need to contain the pandemic against the need to minimize the economic impact of the interventions. While still limited by available data, early experiments suggest that workplace and school restrictions are the most important and need to be designed carefully. They also indicate that the results of lifting restrictions can be unreliable, and suggest creative ways in which restrictions can be implemented softly, e.g., by alternating them over time. As more data becomes available, the approach can be increasingly useful in dealing with COVID-19 as well as possible future pandemics.

Effective Reinforcement Learning through Evolutionary Surrogate-Assisted Prescription

Feb 13, 2020

Olivier Francon, Santiago Gonzalez, Babak Hodjat, Elliot Meyerson, Risto Miikkulainen, Xin Qiu, Hormoz Shahrzad

Abstract: There is now significant historical data available on decision making in organizations, consisting of the decision problem, what decisions were made, and how desirable the outcomes were. Using this data, it is possible to learn a surrogate model, and with that model, evolve a decision strategy that optimizes the outcomes. This paper introduces a general approach of this kind, called Evolutionary Surrogate-Assisted Prescription, or ESP. The surrogate is, for example, a random forest or a neural network trained with gradient descent, and the strategy is a neural network that is evolved to maximize the predictions of the surrogate model. ESP is further extended in this paper to sequential decision-making tasks, which makes it possible to evaluate the framework in reinforcement learning (RL) benchmarks. Because the majority of evaluations are done on the surrogate, ESP is more sample efficient and has lower variance and lower regret than standard RL approaches. Surprisingly, its solutions are also better because both the surrogate and the strategy network regularize the decision-making behavior. ESP thus forms a promising foundation for decision optimization in real-world problems.
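
The two-stage loop described in this abstract (learn a surrogate from historical decisions, then evolve a strategy against the surrogate) can be sketched in a few lines. This is a toy under stated assumptions, not the authors' implementation: the surrogate is a k-nearest-neighbour regressor rather than a random forest or neural network, the strategy is a two-parameter linear rule rather than an evolved neural network, and the data, function names, and hyperparameters are all hypothetical.

```python
import heapq
import random

rng = random.Random(0)

def true_outcome(context, action):
    # Hidden ground truth, used only to generate toy data and to check the result.
    return -(action - 2.0 * context) ** 2

# Historical decisions: (context, action, outcome) triples.
history = []
for _ in range(200):
    c, a = rng.uniform(0, 1), rng.uniform(-1, 3)
    history.append((c, a, true_outcome(c, a)))

def surrogate(context, action, k=5):
    # k-nearest-neighbour regressor standing in for the learned surrogate model.
    nearest = heapq.nsmallest(
        k, history, key=lambda h: (h[0] - context) ** 2 + (h[1] - action) ** 2)
    return sum(h[2] for h in nearest) / k

eval_contexts = [i / 9 for i in range(10)]

def strategy_action(params, context):
    # Linear strategy: maps a context to an action.
    slope, bias = params
    return slope * context + bias

def surrogate_score(params):
    # Fitness of a strategy: mean outcome predicted by the surrogate.
    return sum(surrogate(c, strategy_action(params, c))
               for c in eval_contexts) / len(eval_contexts)

def true_score(params):
    return sum(true_outcome(c, strategy_action(params, c))
               for c in eval_contexts) / len(eval_contexts)

def evolve(generations=25, pop_size=16, sigma=0.3):
    population = [(0.0, 0.0)] + [(rng.uniform(-1, 3), rng.uniform(-1, 1))
                                 for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=surrogate_score, reverse=True)
        parents = ranked[: pop_size // 4]  # elitism: the best strategies survive unchanged
        children = [(p[0] + rng.gauss(0, sigma), p[1] + rng.gauss(0, sigma))
                    for p in (rng.choice(parents)
                              for _ in range(pop_size - len(parents)))]
        population = parents + children
    return max(population, key=surrogate_score)

best = evolve()
print("evolved strategy (slope, bias):", best)
print("true score, initial vs evolved:", true_score((0.0, 0.0)), true_score(best))
```

Note that the evolutionary loop only ever queries the surrogate; the true outcome function is touched once to build the historical data and once to check the final strategy, which is the source of the sample efficiency the abstract refers to.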

Quantifying Point-Prediction Uncertainty in Neural Networks via Residual Estimation with an I/O Kernel

Jun 03, 2019

Xin Qiu, Elliot Meyerson, Risto Miikkulainen

Abstract: Neural Networks (NNs) have been extensively used for a wide spectrum of real-world regression tasks, where the goal is to predict a numerical outcome such as revenue, effectiveness, or a quantitative result. In many such tasks, the point prediction is not enough; the uncertainty (i.e., risk or confidence) of that prediction must also be estimated. Standard NNs, which are most often used in such tasks, do not provide any such information. Existing approaches try to solve this issue by combining Bayesian models with NNs, but these models are hard to implement, more expensive to train, and usually do not perform as well as standard NNs. In this paper, a new framework called RIO is developed that makes it possible to estimate uncertainty in any pretrained standard NN. RIO models prediction residuals using a Gaussian process with a composite input/output kernel. The residual prediction and I/O kernel are theoretically motivated, and the framework is evaluated on twelve real-world datasets. It is found to provide reliable estimates of the uncertainty, reduce the error of the point predictions, and scale well to large datasets. Given that RIO can be applied to any standard NN without modifications to model architecture or training pipeline, it provides an important ingredient in building real-world applications of NNs.
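
The core mechanism named in this abstract, a Gaussian process fitted to the residuals of a fixed network with a kernel that sees both the input and the network's output, can be illustrated with a small pure-Python sketch. The details are assumptions made for illustration only: the "pretrained network" is a hand-written function, the kernel is a sum of two RBF kernels with fixed length scales, the noise variance is fixed rather than learned, and all names are hypothetical.

```python
import math

def nn_predict(x):
    # Stand-in for a pretrained network (assumption): it misses the sine component.
    return 0.8 * x + 0.1

def target(x):
    # Toy ground truth used only to generate data for this sketch.
    return x + 0.5 * math.sin(3.0 * x)

def rbf(a, b, length=0.5):
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def io_kernel(x1, x2):
    # Composite kernel: an RBF on the inputs plus an RBF on the network outputs.
    return rbf(x1, x2) + rbf(nn_predict(x1), nn_predict(x2))

def solve(matrix, rhs):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

NOISE = 1e-2  # fixed noise variance (assumption; normally a learned hyperparameter)
train_x = [0.2 * i for i in range(16)]
residuals = [target(x) - nn_predict(x) for x in train_x]
gram = [[io_kernel(a, b) + (NOISE if i == j else 0.0) for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)]
alpha = solve(gram, residuals)

def rio_predict(x):
    # Returns (corrected prediction, predictive variance of the residual).
    k_star = [io_kernel(x, xi) for xi in train_x]
    residual_mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(gram, k_star)
    variance = max(0.0, io_kernel(x, x) - sum(k * w for k, w in zip(k_star, v)))
    return nn_predict(x) + residual_mean, variance

test_x = [0.1 + 0.2 * i for i in range(15)]
raw_mse = sum((target(x) - nn_predict(x)) ** 2 for x in test_x) / len(test_x)
rio_mse = sum((target(x) - rio_predict(x)[0]) ** 2 for x in test_x) / len(test_x)
print(f"raw MSE {raw_mse:.4f}, corrected MSE {rio_mse:.4f}")
```

The network itself is never retrained: the Gaussian process adds a residual correction to the point prediction and supplies a variance alongside it, which matches the abstract's claim that the approach applies to a pretrained model without changes to its architecture or training pipeline.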
