Houwen Peng

Stephen

Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning

Mar 17, 2025

HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models

Mar 14, 2025

ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws

Aug 15, 2024

Xwin-LM: Strong and Scalable Alignment Practice for LLMs

May 30, 2024

Common 7B Language Models Already Possess Strong Math Capabilities

Mar 07, 2024

FP8-LM: Training FP8 Large Language Models

Oct 27, 2023

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance

Sep 21, 2023

Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking

Aug 14, 2023

ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation

Aug 02, 2023

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

May 11, 2023