Zhen Yang

School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 2100023, China

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Nov 05, 2024

Lossless KV Cache Compression to 2%

Oct 20, 2024

DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction

Sep 30, 2024

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Sep 30, 2024

Thought-Path Contrastive Learning via Premise-Oriented Data Augmentation for Logical Reading Comprehension

Sep 24, 2024

HMoE: Heterogeneous Mixture of Experts for Language Modeling

Aug 20, 2024

Relevance Filtering for Embedding-based Retrieval

Aug 09, 2024

LICM: Effective and Efficient Long Interest Chain Modeling for News Recommendation

Aug 01, 2024

FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior

Jul 06, 2024

NARRepair: Non-Autoregressive Code Generation Model for Automatic Program Repair

Jun 24, 2024