Kai Jiang

LinkDoc Technology, Beijing, China

HiGR: Efficient Generative Slate Recommendation via Hierarchical Planning and Multi-Objective Preference Alignment

Dec 31, 2025, via arXiv

PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation

Dec 30, 2025, via arXiv

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Dec 18, 2025, via arXiv

SpikeATac: A Multimodal Tactile Finger with Taxelized Dynamic Sensing for Dexterous Manipulation

Oct 30, 2025, via arXiv

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Sep 19, 2025, via arXiv

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

May 16, 2025, via arXiv

M$^3$amba: CLIP-driven Mamba Model for Multi-modal Remote Sensing Classification

Mar 09, 2025, via arXiv

Visual Generation Without Guidance

Jan 26, 2025, via arXiv

DiffCLIP: Few-shot Language-driven Multimodal Classifier

Dec 10, 2024, via arXiv

A Survey on Vision Autoregressive Model

Nov 13, 2024, via arXiv