Xiaojian Wu

Sid

The Llama 3 Herd of Models

Jul 31, 2024

Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan(+521 more)

Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.

AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

Nov 11, 2021

Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Haoran Li, Mona Diab

Abstract: Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions. Each question thread can receive a large number of answers with different perspectives. One goal of answer summarization is to produce a summary that reflects the range of answer perspectives. A major obstacle for abstractive answer summarization is the absence of a dataset to provide supervision for producing such summaries. Recent works propose heuristics to create such data, but these are often noisy and do not cover all perspectives present in the answers. This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists. Our pipeline gathers annotations for all subtasks involved in answer summarization, including the selection of answer sentences relevant to the question, grouping these sentences based on perspectives, summarizing each perspective, and producing an overall summary. We analyze and benchmark state-of-the-art models on these subtasks and introduce a novel unsupervised approach for multi-perspective data augmentation that further boosts overall summarization performance according to automatic evaluation. Finally, we propose reinforcement learning rewards to improve factual consistency and answer coverage and analyze areas for improvement.

* arXiv admin note: substantial text overlap with arXiv:2104.08536

Multi-Perspective Abstractive Answer Summarization

Apr 17, 2021

Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Mona Diab

Abstract: Community Question Answering (CQA) forums such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of questions. Each question thread can receive a large number of answers with different perspectives. The goal of multi-perspective answer summarization is to produce a summary that includes all perspectives of the answer. A major obstacle for multi-perspective, abstractive answer summarization is the absence of a dataset to provide supervision for producing such summaries. This work introduces a novel dataset creation method to automatically create multi-perspective, bullet-point abstractive summaries from an existing CQA forum. Supervision provided by this dataset trains models to inherently produce multi-perspective summaries. Additionally, to train models to output more diverse, faithful answer summaries while retaining multiple perspectives, we propose a multi-reward optimization technique coupled with a sentence-relevance prediction multi-task loss. Our methods demonstrate improved coverage of perspectives and faithfulness as measured by automatic and human evaluations compared to a strong baseline.

Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Network

Feb 08, 2018

Johan Bjorck, Yiwei Bai, Xiaojian Wu, Yexiang Xue, Mark C. Whitmore, Carla Gomes

Abstract: Cascades represent rapid changes in networks. A cascading phenomenon of ecological and economic impact is the spread of invasive species in geographic landscapes. The most promising management strategy is often biocontrol, which entails introducing a natural predator able to control the invading population, a setting that can be treated as two interacting cascades of predator and prey populations. We formulate and study a nonlinear problem of optimal biocontrol: optimally seeding the predator cascade over time to minimize the harmful prey population. Recurring budgets, which conservation organizations typically face, naturally lead to sparse constraints that make the problem amenable to approximation algorithms. Available methods based on continuous relaxations scale poorly; to remedy this, we develop a novel and scalable randomized algorithm based on a width relaxation, applicable to a broad class of combinatorial optimization problems. We evaluate our contributions in the context of biocontrol for the insect pest Hemlock Woolly Adelgid (HWA) in eastern North America. Our algorithm outperforms competing methods in terms of scalability and solution quality, and finds near-optimal strategies for the control of the HWA for fine-grained networks -- an important problem in computational sustainability.

* AAAI 2018

XOR-Sampling for Network Design with Correlated Stochastic Events

May 24, 2017

Xiaojian Wu, Yexiang Xue, Bart Selman, Carla P. Gomes

Abstract: Many network optimization problems can be formulated as stochastic network design problems in which edges are present or absent stochastically. Furthermore, protective actions can guarantee that edges will remain present. We consider the problem of finding the optimal protection strategy under a budget limit in order to maximize some connectivity measurements of the network. Previous approaches rely on the assumption that edges are independent. In this paper, we consider a more realistic setting where multiple edges are not independent due to natural disasters or regional events that make the states of multiple edges stochastically correlated. We use Markov Random Fields to model the correlation and define a new stochastic network design framework. We provide a novel algorithm based on Sample Average Approximation (SAA) coupled with a Gibbs or XOR sampler. The experimental results on real road network data show that the policies produced by SAA with the XOR sampler have higher quality and lower variance compared to SAA with the Gibbs sampler.

* In Proceedings of the Twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17). The first two authors contribute equally
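
The Sample Average Approximation idea named in the abstract above can be illustrated with a toy sketch. This is not the paper's algorithm: edges here are sampled independently as a stand-in for the Gibbs or XOR sampler over a Markov Random Field, the protection sets are enumerated by brute force, and the connectivity measure is simply whether two nodes are joined. The names `connected`, `saa_protect`, and `survive_prob` are hypothetical, chosen only for this illustration.

```python
import itertools
import random


def connected(edges, s, t):
    """Return True if s and t are joined by the given undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False


def saa_protect(edges, survive_prob, budget, s, t, n_samples=200, seed=0):
    """Pick at most `budget` edges to protect, maximizing the sampled
    probability that s and t stay connected.

    Assumption: edges survive independently with probability
    survive_prob[edge]; a correlated sampler would replace this step.
    """
    rng = random.Random(seed)
    # Draw the scenarios once, so every candidate policy is scored on
    # the same sample set (the core of Sample Average Approximation).
    scenarios = [
        frozenset(e for e in edges if rng.random() < survive_prob[e])
        for _ in range(n_samples)
    ]
    best, best_val = (), -1.0
    for k in range(budget + 1):
        for protected in itertools.combinations(edges, k):
            # Protected edges are present in every scenario.
            hits = sum(connected(sc | set(protected), s, t) for sc in scenarios)
            val = hits / n_samples
            if val > best_val:
                best, best_val = protected, val
    return best, best_val
```

Calling `saa_protect(edges, survive_prob, budget, s, t)` returns the chosen edge tuple and its sample-average connectivity estimate. Fixing the scenarios before the search turns the stochastic problem into a deterministic one over the samples, which is why the quality of the sampler matters for the resulting policy.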

Robust Optimization for Tree-Structured Stochastic Network Design

Dec 01, 2016

Xiaojian Wu, Akshat Kumar, Daniel Sheldon, Shlomo Zilberstein

Abstract: Stochastic network design is a general framework for optimizing network connectivity. It has several applications in computational sustainability including spatial conservation planning, pre-disaster network preparation, and river network optimization. A common assumption in previous work is that network parameters (e.g., probability of species colonization) are precisely known, which is unrealistic in real-world settings. We therefore address the robust river network design problem where the goal is to optimize river connectivity for fish movement by removing barriers. We assume that fish passability probabilities are known only imprecisely, but are within some interval bounds. We then develop a planning approach that computes policies with either a high robust ratio or low regret. Empirically, our approach scales well to large river networks. We also provide insights into the solutions generated by our robust approach, which has a significantly higher robust ratio than the baseline solution with mean parameter estimates.

* AAAI 2017

Solving Multistage Influence Diagrams using Branch-and-Bound Search

Mar 15, 2012

Changhe Yuan, Xiaojian Wu, Eric A. Hansen

Abstract: A branch-and-bound approach to solving influence diagrams has been previously proposed in the literature, but appears to have never been implemented and evaluated - apparently due to the difficulties of computing effective bounds for the branch-and-bound search. In this paper, we describe how to efficiently compute effective bounds, and we develop a practical implementation of depth-first branch-and-bound search for influence diagram evaluation that outperforms existing methods for solving influence diagrams with multiple stages.

* Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)
