Yangyu Tao

Refer to the report for detailed contributions

HunyuanVideo: A Systematic Framework For Large Video Generative Models

Dec 03, 2024
Via arXiv

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Nov 05, 2024
Via arXiv

Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs

Jul 16, 2024
Via arXiv

BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics

May 27, 2024
Via arXiv

Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

May 23, 2024
Via arXiv

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

May 14, 2024
Via arXiv

Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent

Mar 06, 2023
Via arXiv

BlindFL: Vertical Federated Machine Learning without Peeking into Your Data

Jun 16, 2022
Via arXiv

Graph Attention Multi-Layer Perceptron

Jun 09, 2022
Via arXiv

PaSca: a Graph Neural Architecture Search System under the Scalable Paradigm

Mar 01, 2022
Via arXiv