Liangliang Cao

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

Oct 14, 2024

MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models

Oct 13, 2024

Apple Intelligence Foundation Language Models

Jul 29, 2024

Diffusion Model-Based Image Editing: A Survey

Feb 27, 2024

Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models

Dec 26, 2023

Ferret: Refer and Ground Anything Anywhere at Any Granularity

Oct 11, 2023

Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day

Oct 04, 2023

Instruction-Following Speech Recognition

Sep 18, 2023

RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture

May 18, 2023

Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness

May 08, 2023