Cihang Xie

University of California, Santa Cruz

Generative Image Layer Decomposition with Visual Effects

Nov 26, 2024

CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions

Nov 25, 2024

M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation

Nov 15, 2024

AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation

Oct 11, 2024

Causal Image Modeling for Efficient Visual Understanding

Oct 10, 2024

VHELM: A Holistic Evaluation of Vision Language Models

Oct 09, 2024

From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation

Sep 02, 2024

VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges

Sep 02, 2024

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Aug 06, 2024

VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models

Jun 24, 2024