Samuel Weinbach

u-μP: The Unit-Scaled Maximal Update Parametrization

Jul 24, 2024

T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings

Jun 27, 2024

Efficient Parallelization Layouts for Large-Scale Distributed Model Training

Nov 09, 2023

Tokenizer Choice For LLM Training: Negligible or Crucial?

Oct 18, 2023

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

May 24, 2023

AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation

Jan 23, 2023

M-VADER: A Model for Diffusion with Multimodal Context

Dec 07, 2022

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

Apr 14, 2022

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning

Dec 09, 2021

Domain-Level Explainability -- A Challenge for Creating Trust in Superhuman AI Strategies

Nov 12, 2020