
Winston Hu

Beyond Intermediate States: Explaining Visual Redundancy through Language

Mar 26, 2025

BREEN: Bridge Data-Efficient Encoder-Free Multimodal Learning with Learnable Queries

Mar 16, 2025

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

Feb 06, 2025

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Nov 21, 2024

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Nov 05, 2024

Parallel Speculative Decoding with Adaptive Draft Length

Aug 13, 2024