Kun Yan

DSV: Exploiting Dynamic Sparsity to Accelerate Large-Scale Video DiT Training

Feb 11, 2025
Via arXiv

Taming Teacher Forcing for Masked Autoregressive Video Generation

Jan 21, 2025
Via arXiv

Modelling Multi-modal Cross-interaction for ML-FSIC Based on Local Feature Selection

Dec 18, 2024
Via arXiv

Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding

Jul 11, 2024
Via arXiv

G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios

May 13, 2024
Via arXiv

KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

Jul 10, 2023
Via arXiv

GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

Jun 27, 2023
Via arXiv

Two-shot Video Object Segmentation

Mar 21, 2023
Via arXiv

An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

Nov 16, 2022
Via arXiv

HORIZON: A High-Resolution Panorama Synthesis Framework

Oct 10, 2022
Via arXiv