Alexander Toshev

Apple

From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons

Dec 11, 2024
Via arXiv

DSplats: 3D Generation by Denoising Splats-Based Multiview Diffusion Models

Dec 11, 2024
Via arXiv

World-consistent Video Diffusion with Explicit 3D Modeling

Dec 02, 2024
Via arXiv

On the Modeling Capabilities of Large Language Models for Sequential Decision Making

Oct 08, 2024
Via arXiv

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024
Via arXiv

Grounding Multimodal Large Language Models in Actions

Jun 12, 2024
Via arXiv

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Mar 22, 2024
Via arXiv

Scalable Pre-training of Large Autoregressive Image Models

Jan 16, 2024
Via arXiv

Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation

Nov 27, 2023
Via arXiv

Large Language Models as Generalizable Policies for Embodied Tasks

Oct 26, 2023
Via arXiv