Tianwen Wei

Optimization Hyper-parameter Laws for Large Language Models

Sep 07, 2024
Via arXiv

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On

Jul 11, 2024
Via arXiv

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Jun 03, 2024
Via arXiv

LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models

Jun 02, 2024
Via arXiv

Skywork: A More Open Bilingual Foundation Model

Oct 30, 2023
Via arXiv

SkyMath: Technical Report

Oct 26, 2023
Via arXiv

CMATH: Can Your Language Model Pass Chinese Elementary School Math Test?

Jun 29, 2023
Via arXiv

A Flexible Multi-Task Model for BERT Serving

Jul 12, 2021
Via arXiv

Masked Conditional Random Fields for Sequence Labeling

Mar 19, 2021
Via arXiv

A convergence and asymptotic analysis of the generalized symmetric FastICA algorithm

Dec 17, 2015
Via arXiv