Victor Shea-Jay Huang

TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation

Mar 10, 2025

Towards Precise Scaling Laws for Video Diffusion Transformers

Nov 25, 2024

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge

Nov 25, 2024

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Oct 29, 2024

BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models

Oct 13, 2024