Furu Wei

Little Giants: Synthesizing High-Quality Embedding Data at Scale

Oct 24, 2024
Via arXiv

1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Oct 21, 2024
Via arXiv

One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks

Oct 14, 2024
Via arXiv

Self-Boosting Large Language Models with Synthetic Preference Data

Oct 09, 2024
Via arXiv

Data Selection via Optimal Control for Language Models

Oct 09, 2024
Via arXiv

Differential Transformer

Oct 07, 2024
Via arXiv

Scaling Optimal LR Across Token Horizon

Sep 30, 2024
Via arXiv

Q-Sparse: All Large Language Models can be Fully Sparsely-Activated

Jul 15, 2024
Via arXiv

Autoregressive Speech Synthesis without Vector Quantization

Jul 11, 2024
Via arXiv

Enhancing Language Model Rationality with Bi-Directional Deliberation Reasoning

Jul 08, 2024
Via arXiv