Minbin Huang

Efficient Multi-modal Large Language Models via Visual Token Grouping

Nov 26, 2024
Via arXiv

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation

Oct 07, 2024
Via arXiv

CAPE: Context-Adaptive Positional Encoding for Length Extrapolation

May 23, 2024
Via arXiv

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

May 14, 2024
Via arXiv

TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation

Apr 29, 2024
Via arXiv

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

Mar 13, 2024
Via arXiv

Boosting Visual-Language Models by Exploiting Hard Samples

May 09, 2023
Via arXiv

Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search

Apr 12, 2022
Via arXiv