Kunpeng Li

Transfer between Modalities with MetaQueries

Apr 08, 2025

MoCha: Towards Movie-Grade Talking Character Synthesis

Mar 30, 2025

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant

Mar 06, 2025

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model

Dec 30, 2024

Movie Gen: A Cast of Media Foundation Models

Oct 17, 2024

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Dec 29, 2023

ControlRoom3D: Room Generation using Semantic Proxy Rooms

Dec 08, 2023

Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack

Sep 27, 2023

A Close Look at Spatial Modeling: From Attention to Convolution

Dec 23, 2022

Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP

Oct 09, 2022