Zhenpeng Su

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts

Oct 21, 2024
Via arXiv

Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval

Aug 20, 2024
Via arXiv

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

Jul 13, 2024
Via arXiv

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

Apr 27, 2024
Via arXiv

InfoEntropy Loss to Mitigate Bias of Learning Difficulties for Generative Language Models

Nov 01, 2023
Via arXiv

HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus

Sep 06, 2023
Via arXiv

T5-SR: A Unified Seq-to-Seq Decoding Strategy for Semantic Parsing

Jun 14, 2023
Via arXiv

ConTextual Masked Auto-Encoder for Retrieval-based Dialogue Systems

Jun 14, 2023
Via arXiv