Anil Kag

SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

Jan 13, 2026

Diffusion-DRF: Differentiable Reward Flow for Video Diffusion Fine-Tuning

Jan 07, 2026

Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization

Dec 11, 2025

Taming Diffusion Transformer for Real-Time Mobile Video Generation

Jul 17, 2025

H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models

Apr 14, 2025

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

Feb 05, 2025

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Dec 13, 2024

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Dec 12, 2024

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation

Nov 07, 2024

Scalable Ranked Preference Optimization for Text-to-Image Generation

Oct 23, 2024