Ruofei Zhang

MERGE: Fast Private Text Generation
May 25, 2023

Healing Unsafe Dialogue Responses with Weak Supervision Signals
May 25, 2023

A Self-Paced Mixed Distillation Method for Non-Autoregressive Generation
May 23, 2022

Taming Sparsely Activated Transformer with Stochastic Experts
Oct 12, 2021

KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning
Sep 14, 2021

EL-Attention: Memory Efficient Lossless Attention for Generation
May 11, 2021

ProphetNet-X: Large-Scale Pre-training Models for English, Chinese, Multi-lingual, Dialog, and Code Generation
Apr 16, 2021

Mask Attention Networks: Rethinking and Strengthen Transformer
Mar 25, 2021

TextGNN: Improving Text Encoder via Graph Neural Network in Sponsored Search
Feb 09, 2021

BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining
Dec 31, 2020