Cihang Xie

University of California, Santa Cruz

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

Feb 06, 2025

ARFlow: Autogressive Flow with Hybrid Linear Attention

Jan 27, 2025

Double Visual Defense: Adversarial Pre-training and Instruction Tuning for Improving Vision-Language Model Robustness

Jan 16, 2025

Generative Image Layer Decomposition with Visual Effects

Nov 26, 2024

CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions

Nov 25, 2024

M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation

Nov 15, 2024

AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation

Oct 11, 2024

Causal Image Modeling for Efficient Visual Understanding

Oct 10, 2024

VHELM: A Holistic Evaluation of Vision Language Models

Oct 09, 2024

From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation

Sep 02, 2024