Rui Men

Qwen2.5-1M Technical Report

Jan 26, 2025
Via arXiv

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Jan 21, 2025
Via arXiv

Qwen2.5 Technical Report

Dec 19, 2024
Via arXiv

Qwen2.5-Coder Technical Report

Sep 18, 2024
Via arXiv

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Sep 18, 2024
Via arXiv

Qwen2 Technical Report

Jul 16, 2024
Via arXiv

Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

Jun 07, 2024
Via arXiv

Qwen Technical Report

Sep 28, 2023
Via arXiv

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

Dec 08, 2022
Via arXiv

Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese

Nov 03, 2022
Via arXiv