Rewon Child

PaLM: Scaling Language Modeling with Pathways

Apr 19, 2022
Via arXiv

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model

Feb 04, 2022
Via arXiv

Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images

Nov 20, 2020
Via arXiv

Language Models are Few-Shot Learners

Jun 05, 2020
Via arXiv

Scaling Laws for Neural Language Models

Jan 23, 2020
Via arXiv

Generating Long Sequences with Sparse Transformers

Apr 23, 2019
Via arXiv

Exploring Neural Transducers for End-to-End Speech Recognition

Jul 24, 2017
Via arXiv

Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

Jul 04, 2017
Via arXiv

Reducing Bias in Production Speech Models

May 11, 2017
Via arXiv

Active Learning for Speech Recognition: the Power of Gradients

Dec 10, 2016
Via arXiv