Deli Chen

Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models

Jul 02, 2024
Via arXiv

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Jun 17, 2024
Via arXiv

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Jan 11, 2024
Via arXiv

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Jan 05, 2024
Via arXiv

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

Dec 28, 2023
Via arXiv

Towards Codable Text Watermarking for Large Language Models

Jul 29, 2023
Via arXiv

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

May 23, 2023
Via arXiv

Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias

May 08, 2023
Via arXiv

Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning

Jan 26, 2023
Via arXiv

Topology-Imbalance Learning for Semi-Supervised Node Classification

Oct 08, 2021
Via arXiv