Yuying Ge

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation

Dec 05, 2024

Moto: Latent Motion Token as the Bridging Language for Robot Manipulation

Dec 05, 2024

EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios

Dec 05, 2024

DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models

Dec 05, 2024

SEED-Story: Multimodal Long Story Generation with Large Language Model

Jul 11, 2024

SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

May 07, 2024

SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension

Apr 25, 2024

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

Apr 22, 2024

Supervised Fine-tuning in turn Improves Visual Foundation Models

Jan 18, 2024

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

Dec 14, 2023