Aaditya Singh

The Llama 3 Herd of Models

Jul 31, 2024

Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan(+521 more)

Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.

Benchmarking Low-Shot Robustness to Natural Distribution Shifts

Apr 21, 2023

Aaditya Singh, Kartik Sarangmath, Prithvijit Chattopadhyay, Judy Hoffman

Abstract: Robustness to natural distribution shifts has seen remarkable progress thanks to recent pre-training strategies combined with better fine-tuning methods. However, such fine-tuning assumes access to large amounts of labelled data, and the extent to which the observations hold when the amount of training data is not as high remains unknown. We address this gap by performing the first in-depth study of robustness to various natural distribution shifts in different low-shot regimes: spanning datasets, architectures, pre-trained initializations, and state-of-the-art robustness interventions. Most importantly, we find that there is no single model of choice that is often more robust than others, and existing interventions can fail to improve robustness on some datasets even if they do so in the full-shot regime. We hope that our work will motivate the community to focus on this problem of practical importance.

* 21 Pages, 18 Tables, 12 Figures

Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency

Jun 16, 2022

Viraj Prabhu, Sriram Yenamandra, Aaditya Singh, Judy Hoffman

Abstract: Visual domain adaptation (DA) seeks to transfer trained models to unseen, unlabeled domains across distribution shift, but approaches typically focus on adapting convolutional neural network architectures initialized with supervised ImageNet representations. In this work, we shift focus to adapting modern architectures for object recognition -- the increasingly popular Vision Transformer (ViT) -- and modern pretraining based on self-supervised learning (SSL). Inspired by the design of recent SSL approaches based on learning from partial image inputs generated via masking or cropping -- either by learning to predict the missing pixels, or learning representational invariances to such augmentations -- we propose PACMAC, a simple two-stage adaptation algorithm for self-supervised ViTs. PACMAC first performs in-domain SSL on pooled source and target data to learn task-discriminative features, and then probes the model's predictive consistency across a set of partial target inputs generated via a novel attention-conditioned masking strategy, to identify reliable candidates for self-training. Our simple approach leads to consistent performance gains over competing methods that use ViTs and self-supervised initializations on standard object recognition benchmarks. Code available at https://github.com/virajprabhu/PACMAC

SAFIN: Arbitrary Style Transfer With Self-Attentive Factorized Instance Normalization

May 20, 2021

Aaditya Singh, Shreeshail Hingane, Xinyu Gong, Zhangyang Wang

Abstract: Artistic style transfer aims to transfer the style characteristics of one image onto another image while retaining its content. Existing approaches commonly leverage various normalization techniques, although these face limitations in adequately transferring diverse textures to different spatial locations. Self-Attention-based approaches have tackled this issue with partial success but suffer from unwanted artifacts. Motivated by these observations, this paper aims to combine the best of both worlds: self-attention and normalization. That yields a new plug-and-play module that we name Self-Attentive Factorized Instance Normalization (SAFIN). SAFIN is essentially a spatially adaptive normalization module whose parameters are inferred through attention on the content and style image. We demonstrate that plugging SAFIN into the base network of another state-of-the-art method results in enhanced stylization. We also develop a novel base network composed of Wavelet Transform for multi-scale style transfer, which, when combined with SAFIN, produces visually appealing results with fewer unwanted textures.

* Accepted at ICME 2021, 5 Pages + 1 Page (references)

An End-to-End Network for Emotion-Cause Pair Extraction

Mar 03, 2021

Aaditya Singh, Shreeshail Hingane, Saim Wani, Ashutosh Modi

Abstract: The task of Emotion-Cause Pair Extraction (ECPE) aims to extract all potential clause-pairs of emotions and their corresponding causes in a document. Unlike the more well-studied task of Emotion Cause Extraction (ECE), ECPE does not require the emotion clauses to be provided as annotations. Previous works on ECPE have either followed a multi-stage approach where emotion extraction, cause extraction, and pairing are done independently or use complex architectures to resolve its limitations. In this paper, we propose an end-to-end model for the ECPE task. Due to the unavailability of an English language ECPE corpus, we adapt the NTCIR-13 ECE corpus and establish a baseline for the ECPE task on this dataset. On this dataset, the proposed method produces significant performance improvements (~6.5 increase in F1 score) over the multi-stage approach and achieves comparable performance to the state-of-the-art methods.

* Accepted at WASSA-2021, 5 Pages + 2 Pages (references) + 2 Pages (Appendix)
