Kai Wang

Refer to the report for detailed contributions

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

Surgical Scene Segmentation using a Spike-Driven Video Transformer with Real-Time Potential

Dec 24, 2025

LumiCtrl: Learning Illuminant Prompts for Lighting Control in Personalized Text-to-Image Models

Dec 19, 2025

StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models

Dec 18, 2025

Distill Video Datasets into Images

Dec 16, 2025

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

Dec 16, 2025

Can World Simulators Reason? Gen-ViRe: A Generative Visual Reasoning Benchmark

Nov 17, 2025

Bridging Constraints and Stochasticity: A Fully First-Order Method for Stochastic Bilevel Optimization with Linear Constraints

Nov 15, 2025

Explainable AI-Generated Image Detection RewardBench

Nov 15, 2025

Rethinking Autoregressive Models for Lossless Image Compression via Hierarchical Parallelism and Progressive Adaptation

Nov 14, 2025