
Christopher Pal

Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving

Jun 12, 2025

Rendering-Aware Reinforcement Learning for Vector Graphics Generation

May 27, 2025

Distilling semantically aware orders for autoregressive image generation

Apr 23, 2025

AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery

Apr 10, 2025

Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

Mar 28, 2025

StarFlow: Generating Structured Workflow Outputs From Sketch Images

Mar 27, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

Mar 19, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Feb 03, 2025

LLMs for Literature Review: Are we there yet?

Dec 15, 2024

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Dec 05, 2024