Yingfa Chen

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Nov 04, 2024
Via arXiv

Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling

Oct 09, 2024
Via arXiv

Configurable Foundation Models: Building LLMs from a Modular Perspective

Sep 04, 2024
Via arXiv

Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts

Sep 02, 2024
Via arXiv

Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models

Jun 22, 2024
Via arXiv

Robust and Scalable Model Editing for Large Language Models

Mar 26, 2024
Via arXiv

$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens

Feb 24, 2024
Via arXiv

READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises

Feb 14, 2023
Via arXiv

SHUOWEN-JIEZI: Linguistically Informed Tokenizers For Chinese Language Model Pretraining

Jun 01, 2021
Via arXiv