Abstract: Solving societal problems on a global scale requires the collection and processing of ideas and methods from diverse sets of international experts. As the number and diversity of human experts increase, so does the likelihood that elements in this collective knowledge can be combined and refined to discover novel and better solutions. However, it is difficult to identify, combine, and refine complementary information in an increasingly large and diverse knowledge base. This paper argues that artificial intelligence (AI) can play a crucial role in this process. An evolutionary AI framework, termed RHEA, fills this role by distilling knowledge from diverse models created by human experts into equivalent neural networks, which are then recombined and refined in a population-based search. The framework was implemented in a formal synthetic domain, demonstrating that it is transparent and systematic. It was then applied to the results of the XPRIZE Pandemic Response Challenge, in which over 100 teams of experts across 23 countries submitted models based on diverse methodologies to predict COVID-19 cases and suggest non-pharmaceutical intervention policies for 235 nations, states, and regions across the globe. Building upon this expert knowledge by recombining and refining the 169 resulting policy suggestion models, RHEA discovered a broader and more effective set of policies than either AI or human experts alone, as evaluated based on real-world data. The results thus suggest that AI can play a crucial role in realizing the potential of human expertise in global problem-solving.
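To make the distill-recombine-refine loop concrete, below is a minimal Python sketch under loudly stated assumptions: the two "expert models" are stand-in linear functions, distillation is a least-squares fit rather than training an equivalent neural network, and the fitness is a toy objective rather than RHEA's real-world policy evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: two "expert models" mapping a context vector to an
# intervention-strength score. In RHEA these would be the submitted models.
experts = [lambda x: x @ np.array([0.8, 0.1, 0.1]),
           lambda x: x @ np.array([0.1, 0.2, 0.7])]

contexts = rng.random((200, 3))

def distill(expert):
    """Fit a linear surrogate to mimic an expert's input/output behavior
    (a stand-in for training an equivalent neural network)."""
    y = np.array([expert(x) for x in contexts])
    w, *_ = np.linalg.lstsq(contexts, y, rcond=None)
    return w

def fitness(w):
    """Toy objective: prefer policies close to an ideal weighting that no
    single expert captures on its own."""
    return -np.linalg.norm(w - np.array([0.5, 0.3, 0.2]))

population = [distill(e) for e in experts]   # distill expert knowledge
for gen in range(100):
    p1, p2 = rng.choice(len(population), 2, replace=False)
    mask = rng.random(3) < 0.5                            # recombine
    child = np.where(mask, population[p1], population[p2])
    child = child + 0.05 * rng.standard_normal(3)         # refine
    population.append(child)
    population.sort(key=fitness, reverse=True)            # keep the best
    population = population[:10]

print("best distilled+evolved policy weights:", population[0])
```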
Abstract: This paper introduces an approach to boost the efficiency and scalability of Evolutionary Rule-based machine Learning (ERL), a key technique in explainable AI. While traditional ERL systems can distribute processes across multiple CPUs, fitness evaluation of candidate rules is a bottleneck, especially with large datasets. The method proposed in this paper, AERL (Accelerated ERL), solves this problem in two ways. First, by adopting GPU-optimized rule sets through a tensorized representation within the PyTorch framework, AERL mitigates the bottleneck and accelerates fitness evaluation significantly. Second, AERL takes further advantage of the GPU by fine-tuning the rule coefficients via back-propagation, thereby improving search-space exploration. Experimental evidence confirms that AERL search is faster and more effective, thus advancing the practicality of explainable artificial intelligence.
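The two mechanisms can be sketched in a few lines of PyTorch. This is a hedged illustration, not AERL's actual rule representation: interval rules with linear consequents are assumed, and all bounds, coefficients, and data are invented. The point is that matching every rule against every sample is one broadcasted tensor operation (and thus GPU-friendly), and that the same computation graph supports tuning coefficients by back-propagation.

```python
import torch

# Hypothetical tensorized rule set: R interval rules over D features. A rule
# "fires" on a sample when every feature lies inside its interval; each rule
# also carries trainable linear coefficients for its prediction.
R, D, N = 16, 4, 1024
lower = torch.rand(R, D) * 0.5                    # rule lower bounds
upper = lower + 0.5                               # rule upper bounds
coef = torch.randn(R, D, requires_grad=True)      # per-rule coefficients
bias = torch.zeros(R, requires_grad=True)

X = torch.rand(N, D)
y = X.sum(dim=1)                                  # toy regression target

def predict(X):
    # Broadcasted (N, R, D) match test: all rules evaluated on all samples at once.
    inside = (X[:, None, :] >= lower) & (X[:, None, :] <= upper)
    match = inside.all(dim=2).float()                       # (N, R)
    rule_out = X @ coef.T + bias                            # (N, R) linear outputs
    weight = match / match.sum(dim=1, keepdim=True).clamp(min=1.0)
    return (weight * rule_out).sum(dim=1)                   # mixture of firing rules

# Fitness evaluation of the whole rule set is a single batched forward pass...
print("rule-set fitness (MSE):", torch.mean((predict(X) - y) ** 2).item())

# ...and the same graph supports fine-tuning coefficients by back-propagation.
opt = torch.optim.Adam([coef, bias], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = torch.mean((predict(X) - y) ** 2)
    loss.backward()
    opt.step()
```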
Abstract: With the widespread application of Large Language Models (LLMs) to various domains, concerns have been raised regarding the trustworthiness of LLMs in safety-critical scenarios, due to their unpredictable tendency to hallucinate and generate misinformation. Existing LLMs do not have an inherent functionality to provide users with an uncertainty metric for each response they generate, making it difficult to evaluate trustworthiness. Although a number of works aim to develop uncertainty quantification methods for LLMs, they have fundamental limitations, such as being restricted to classification tasks, requiring additional training and data, considering only lexical rather than semantic information, and being prompt-wise rather than response-wise. A new framework, semantic density, is proposed in this paper to address these issues. It extracts uncertainty information for each response from a probability-distribution perspective in semantic space, has no restriction on task types, and is "off-the-shelf" for new models and tasks. Experiments on seven state-of-the-art LLMs, including the latest Llama 3 and Mixtral-8x22B models, on four free-form question-answering benchmarks demonstrate the superior performance and robustness of semantic density compared to prior approaches.
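The core computation can be sketched as a kernel density estimate in an embedding space. This is a simplified, hypothetical rendering of the idea, not the paper's exact estimator: the embeddings below are random placeholders (in practice they would come from a sentence encoder or NLI model), and the sampled responses' probabilities would come from the LLM's sequence likelihoods.

```python
import numpy as np

def semantic_density(target_emb, sample_embs, sample_probs, bandwidth=0.5):
    """Sketch: estimate how much probability mass the model places in the
    semantic neighborhood of the response being scored. `sample_embs` and
    `sample_probs` come from other responses sampled for the same prompt;
    a Gaussian kernel in embedding space stands in for semantic similarity."""
    d = np.linalg.norm(sample_embs - target_emb, axis=1)
    kernel = np.exp(-(d / bandwidth) ** 2)   # closer in meaning => higher weight
    return float(np.sum(sample_probs * kernel) / np.sum(sample_probs))

# Toy usage with made-up embeddings and probabilities.
rng = np.random.default_rng(1)
samples = rng.normal(size=(8, 384))           # embeddings of sampled responses
probs = rng.dirichlet(np.ones(8))             # their normalized likelihoods
near = semantic_density(samples[0] + 0.01, samples, probs)   # close to a sample
far = semantic_density(3 * rng.normal(size=384), samples, probs)  # outlier
print(f"density near a sampled response: {near:.4f}  vs. outlier: {far:.4f}")
```

A response whose meaning is far from everything the model tends to say for that prompt receives low density, flagging it as untrustworthy.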
Abstract: Diabetes, a chronic condition that impairs how the body turns food into energy, i.e. blood glucose, affects 38 million people in the US alone. The standard treatment is to supplement carbohydrate intake with an artificial pancreas, i.e. a continuous insulin pump (basal delivery), as well as occasional insulin injections (bolus shots). The goal of the treatment is to keep blood glucose at the center of an acceptable range, as measured through a continuous glucose monitor. A secondary goal is to minimize injections, which are unpleasant and difficult for some patients to implement. In this study, neuroevolution was used to discover an effective treatment strategy. Based on a dataset of 30 days of treatment and measurements from a single patient, a random forest was first trained to predict future glucose levels. A neural network was then evolved to prescribe carbohydrates, basal pumping levels, and bolus injections. Evolution discovered a Pareto front that reduced both deviation from the target and the number of injections compared to the original data, thus improving patients' quality of life. To make the system easier to adopt, a language interface was developed with a large language model. These technologies thus not only improve patient care but also facilitate adoption in a broader population.
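A minimal sketch of the prescription-evaluation loop follows. Everything here is a hypothetical stand-in: the forecaster is a hand-written linear function rather than the trained random forest, the policy network is a two-layer toy, and all constants are invented. It shows the two objectives that span the Pareto front: deviation from target glucose and the number of bolus injections.

```python
import numpy as np

rng = np.random.default_rng(2)

def predict_glucose(state, carbs, basal, bolus):
    """Stand-in for the learned forecaster (in the study, a random forest
    trained on 30 days of patient data)."""
    return state + 3.0 * carbs - 2.0 * basal - 8.0 * bolus + rng.normal(0, 2)

def prescriptor(weights, state):
    """Tiny policy network: one hidden layer, outputs (carbs, basal, bolus)."""
    w1, w2 = weights
    h = np.tanh(w1 @ np.array([state / 100.0, 1.0]))
    return np.maximum(0.0, w2 @ h)           # prescriptions are non-negative

def evaluate(weights, target=110.0, horizon=48):
    """Two objectives: mean deviation from target glucose, and bolus count."""
    state, deviation, injections = 140.0, 0.0, 0
    for _ in range(horizon):
        carbs, basal, bolus = prescriptor(weights, state)
        injections += bolus > 0.5
        state = predict_glucose(state, carbs, basal, bolus)
        deviation += abs(state - target)
    return deviation / horizon, injections

weights = (rng.normal(size=(6, 2)), rng.normal(size=(3, 6)))
print("(mean deviation, injections):", evaluate(weights))
```

In the study, multi-objective neuroevolution over such weight vectors yields a Pareto front of policies, letting the patient choose a trade-off between tight glucose control and fewer injections.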
Abstract: How areas of land are allocated for different uses, such as forest, urban, and agricultural areas, has a large effect on carbon balance, and therefore on climate change. Based on available historical data on changes in land use and a simulation of carbon emissions/absorption, a surrogate model can be learned that makes it possible to efficiently evaluate the different options available to decision-makers. An evolutionary search process can then be used to discover effective land-use policies for specific locations. Such a system was built on the Project Resilience platform and evaluated with the Land-Use Harmonization dataset and the BLUE simulator. It generates Pareto fronts that trade off carbon impact and amount of change, customized to different locations, thus providing a potentially useful tool for land-use planning.
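The pipeline can be illustrated with a toy two-objective search. The surrogate below is a made-up linear stand-in for a model trained against the BLUE simulator, the three land-use types are an illustrative subset, and random sampling over the simplex stands in for the evolutionary search; only the Pareto-front construction is shown faithfully.

```python
import numpy as np

rng = np.random.default_rng(3)

LAND_TYPES = ["forest", "crop", "urban"]   # illustrative subset
current = np.array([0.5, 0.3, 0.2])        # today's land-use fractions

def surrogate_carbon(alloc):
    """Stand-in for the learned surrogate: more forest absorbs carbon,
    more crop/urban emits (coefficients are invented)."""
    return float(alloc @ np.array([-1.0, 0.4, 0.9]))

def objectives(alloc):
    carbon = surrogate_carbon(alloc)
    change = float(np.abs(alloc - current).sum()) / 2.0   # fraction reallocated
    return carbon, change

def dominated(a, b):
    """True if candidate a is dominated by b (minimizing both objectives)."""
    return all(x >= y for x, y in zip(a, b)) and a != b

# Random sampling over allocations; the real system evolves policies instead.
pool = [objectives(rng.dirichlet(np.ones(3))) for _ in range(500)]
front = sorted(p for p in pool if not any(dominated(p, q) for q in pool))
print("Pareto front (carbon impact, amount of change):", front[:5], "...")
```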
Abstract: Many evolutionary algorithms (EAs) take advantage of parallel evaluation of candidates. However, if evaluation times vary significantly, many worker nodes (i.e., compute clients) are idle much of the time, waiting for the next generation to be created. Evolutionary neural architecture search (ENAS), a class of EAs that optimizes the architecture and hyperparameters of deep neural networks, is particularly vulnerable to this issue. This paper proposes a generic asynchronous evaluation strategy (AES) that is then adapted to work with ENAS. AES increases throughput by maintaining a queue of up to $K$ individuals ready to be sent to the workers for evaluation, and by proceeding to the next generation as soon as $M \ll K$ individuals have been evaluated by the workers. A suitable value for $M$ is determined experimentally, balancing diversity and efficiency. To showcase the generality and power of AES, it was first evaluated in 11-bit multiplexer design (a single-population verifiable discovery task) and then scaled up to ENAS for image captioning (a multi-population open-ended-optimization task). In both problems, a multifold performance improvement was observed, suggesting that AES is a promising method for parallelizing the evolution of complex systems with long and variable evaluation times, such as those in ENAS.
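The queueing discipline is easy to sketch with a thread pool. This is a hedged toy, not the paper's implementation: evaluation is a random sleep plus a bit-count fitness on 11-bit genomes (a nod to the multiplexer task), and breeding is a one-bit mutation. The essential AES behavior is that $K$ evaluations are kept in flight and the algorithm proceeds as soon as any $M$ of them return.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

K, M, WORKERS = 12, 3, 4   # queue size, batch size (M << K), parallel workers

def evaluate(genome):
    time.sleep(random.uniform(0.01, 0.2))   # long, variable evaluation times
    return genome, sum(genome)              # toy fitness: count of 1-bits

def breed(parents):
    genome = random.choice(parents)[0]
    i = random.randrange(len(genome))
    return genome[:i] + (1 - genome[i],) + genome[i + 1:]   # one-bit mutation

random.seed(0)
population = []                             # (genome, fitness) pairs
with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    # Keep a queue of up to K individuals in flight at all times.
    pending = {pool.submit(evaluate, tuple(random.randint(0, 1) for _ in range(11)))
               for _ in range(K)}
    for _ in range(20):
        done = set()
        while len(done) < M:                # proceed as soon as M have returned
            finished, pending = wait(pending, return_when=FIRST_COMPLETED)
            done |= finished
        population.extend(f.result() for f in done)
        population.sort(key=lambda gf: gf[1], reverse=True)
        population = population[:K]
        # Refill the queue so workers never sit idle waiting for a generation.
        pending |= {pool.submit(evaluate, breed(population)) for _ in range(len(done))}
print("best fitness:", population[0][1], "out of 11")
```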
Abstract: While evolutionary computation is well suited for automatic discovery in engineering, it can also be used to gain insight into how humans and organizations could perform more effectively. Using a real-world problem of innovation search in organizations as the motivating example, this article first formalizes human creative problem solving as competitive multi-agent search (CMAS). CMAS differs from existing single-agent and team search problems in that the agents interact through knowledge of other agents' searches and through the dynamic changes in the search landscape that result from these searches. The main hypothesis is that evolutionary computation can be used to discover effective strategies for CMAS; this hypothesis is verified in a series of experiments on the NK model, i.e., partially correlated and tunably rugged fitness landscapes. Specialized strategies are evolved for each competitive environment, as well as general strategies that perform well across environments. These strategies are more effective and more complex than hand-designed strategies and a strategy based on traditional tree search. Using a novel spherical visualization of such landscapes, insight is gained into how successful strategies work, e.g., by tracking positive changes in the landscape. The article thus provides a possible framework for studying various human creative activities as competitive multi-agent search in the future.
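As background for the experimental substrate, here is a standard NK-landscape implementation in Python (the CMAS experiments add competing agents that dynamically modify such a landscape, which this static sketch omits). Each of the $N$ bits contributes a fitness component that depends on itself and its $K$ neighbors via a random lookup table, so increasing $K$ makes the landscape more rugged.

```python
import itertools
import random

N, K = 10, 2
random.seed(0)

# One random contribution table per locus, keyed by the (K+1)-bit neighborhood.
tables = [{bits: random.random()
           for bits in itertools.product((0, 1), repeat=K + 1)}
          for _ in range(N)]

def nk_fitness(genome):
    total = 0.0
    for i in range(N):
        # Each bit's contribution depends on itself and the next K bits (wrapping).
        neighborhood = tuple(genome[(i + j) % N] for j in range(K + 1))
        total += tables[i][neighborhood]
    return total / N

genome = tuple(random.randint(0, 1) for _ in range(N))
print(f"fitness of {genome}: {nk_fitness(genome):.3f}")
```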
Abstract: In building practical applications of evolutionary computation (EC), two optimizations are essential. First, the parameters of the search method need to be tuned to the domain in order to balance exploration and exploitation effectively. Second, the search method needs to be distributed to take advantage of parallel computing resources. This paper presents BLADE (BLAnket Distributed Evolution) as an approach to achieving both goals simultaneously. BLADE uses blankets (i.e., masks on the genetic representation) to tune the evolutionary operators during the search, and implements the search through hub-and-spoke distribution. In the paper, (1) the blanket method is formalized for the (1+1) EA case as a Markov chain process, and its effectiveness is demonstrated by analyzing the dominant and subdominant eigenvalues of the resulting stochastic matrices, suggesting a generalizable theory; (2) the fitness-level theory is used to analyze the distribution method; and (3) these insights are verified experimentally on three benchmark problems, showing that both blankets and distribution lead to accelerated evolution. Moreover, a surprising synergy emerges between them: when combined with distribution, the blanket approach achieves more than $n$-fold speedup with $n$ clients in some cases. The work thus highlights the importance and potential of optimizing evolutionary computation in practical applications.
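One way to read the blanket idea is as a per-gene mask that modulates the mutation operator online. The sketch below is a speculative toy in that spirit, not BLADE's actual mechanism or its Markov-chain formalization: a (1+1) EA on a OneMax-style problem in which each gene's mutation rate is scaled by a blanket value that is reinforced when flipping that gene helped and damped when it hurt.

```python
import random

random.seed(0)
N = 32
target = [1] * N                                  # OneMax-style benchmark

def fitness(x):
    return sum(a == b for a, b in zip(x, target))

x = [random.randint(0, 1) for _ in range(N)]
blanket = [1.0] * N                               # per-gene mutation-rate mask
fx = fitness(x)
for step in range(2000):
    # Baseline per-gene flip probability 1/N, scaled by the blanket.
    flips = [i for i in range(N) if random.random() < blanket[i] / N]
    if not flips:
        continue
    y = x[:]
    for i in flips:
        y[i] ^= 1
    fy = fitness(y)
    if fy >= fx:                                  # (1+1) EA acceptance rule
        x, fx = y, fy
        for i in flips:
            blanket[i] = min(4.0, blanket[i] * 1.5)   # reward useful genes
    else:
        for i in flips:
            blanket[i] = max(0.25, blanket[i] / 1.5)  # damp harmful ones
print("best fitness:", fx, "of", N)
```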
Abstract: Carefully designed activation functions can improve the performance of neural networks in many machine learning tasks. However, it is difficult for humans to construct optimal activation functions, and current activation function search algorithms are prohibitively expensive. This paper aims to improve the state of the art through three steps: First, the benchmark datasets Act-Bench-CNN, Act-Bench-ResNet, and Act-Bench-ViT were created by training convolutional, residual, and vision transformer architectures from scratch with 2,913 systematically generated activation functions. Second, a characterization of the benchmark space was developed, leading to a new surrogate-based method for optimization. More specifically, both the spectrum of the Fisher information matrix associated with the model's predictive distribution at initialization and the activation function's output distribution were found to be highly predictive of performance. Third, the surrogate was used to discover improved activation functions in CIFAR-100 and ImageNet tasks. Each of these steps is a contribution in its own right; together they serve as a practical and theoretical foundation for further research on activation function optimization. Code is available at https://github.com/cognizant-ai-labs/aquasurf, and the benchmark datasets are at https://github.com/cognizant-ai-labs/act-bench.
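The second of the two surrogate features is simple to illustrate: characterize a candidate activation function by the distribution of its outputs under standard-normal inputs, and compare candidates by distance in that feature space. This is a hedged simplification (the Fisher-information-spectrum feature is omitted, and the histogram summary below is an assumed encoding, not necessarily the paper's).

```python
import numpy as np

rng = np.random.default_rng(4)

def output_feature(act, n=10000, bins=32):
    """Summarize an activation function by the histogram of its outputs
    under standard-normal inputs (a proxy for its output distribution)."""
    y = act(rng.standard_normal(n))
    hist, _ = np.histogram(y, bins=bins, range=(-4, 4), density=True)
    return hist

def relu(x):
    return np.maximum(0.0, x)

def swish(x):
    return x / (1.0 + np.exp(-x))

# Candidates whose features are close to a known good function are predicted
# to perform similarly, so search can proceed cheaply in this feature space.
dist = np.linalg.norm(output_feature(relu) - output_feature(swish))
print(f"feature-space distance, ReLU vs. Swish: {dist:.3f}")
```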
Abstract: Evolutionary algorithms (EAs) have gained attention recently due to their success in neural architecture search (NAS). However, whereas traditional EAs draw much power from crossover operations, most evolutionary NAS methods deploy only mutation operators. The main reason is the permutation problem: the mapping between genotype and phenotype in traditional graph representations is many-to-one, leading to a disruptive effect of standard crossover. This work conducts the first theoretical analysis of the behaviors of crossover and mutation in the NAS context, and proposes a new crossover operator based on the shortest edit path (SEP) in graph space. The SEP crossover is shown to overcome the permutation problem, and as a result, offspring generated by the SEP crossover are theoretically proven to achieve a better expected improvement, in terms of graph edit distance to the global optimum, than those of mutation and standard crossover. Experiments further show that the SEP crossover significantly outperforms mutation and standard crossover on three state-of-the-art NAS benchmarks. The SEP crossover therefore allows taking full advantage of evolution in NAS, and potentially in other similar design problems as well.
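The geometric intuition can be sketched on adjacency matrices. In this simplified version the node correspondence is assumed fixed, so the shortest edit path reduces to the set of differing edges (the full operator first solves for an optimal node matching, which is what defeats the permutation problem); the offspring is then placed halfway along the edit path from one parent to the other.

```python
import numpy as np

rng = np.random.default_rng(5)

def sep_crossover(A, B):
    """Simplified SEP crossover on adjacency matrices with fixed node
    correspondence: apply half of the edge edits that turn A into B."""
    child = A.copy()
    diff = np.argwhere(np.triu(A != B, k=1))    # edge edits on the path A -> B
    rng.shuffle(diff)
    for i, j in diff[: len(diff) // 2]:         # walk halfway along the path
        child[i, j] = child[j, i] = B[i, j]
    return child

def random_graph(n=6, p=0.4):
    M = (rng.random((n, n)) < p).astype(int)
    return np.triu(M, 1) + np.triu(M, 1).T      # symmetric, no self-loops

A, B = random_graph(), random_graph()
C = sep_crossover(A, B)
# By construction the child sits (about) halfway between the parents in edit steps.
print("edits A->B:", int(np.triu(A != B, 1).sum()),
      " A->child:", int(np.triu(A != C, 1).sum()),
      " child->B:", int(np.triu(C != B, 1).sum()))
```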