
Hongfei Xu

Knowledge-injected Prompt Learning for Chinese Biomedical Entity Normalization

Aug 23, 2023

Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue

Aug 14, 2023

Optimizing Deep Transformers for Chinese-Thai Low-Resource Translation

Dec 24, 2022

NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering

Nov 07, 2022

Learning Hard Retrieval Cross Attention for Transformer

Sep 30, 2020

Transformer with Depth-Wise LSTM

Jul 13, 2020

Learning Source Phrase Representations for Neural Machine Translation

Jun 25, 2020

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

May 05, 2020

Analyzing Word Translation of Transformer Layers

Mar 21, 2020

Why Deep Transformers are Difficult to Converge? From Computation Order to Lipschitz Restricted Parameter Initialization

Nov 08, 2019