Jingze Shi

OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale

Feb 05, 2026

Towards Automated Kernel Generation in the Era of LLMs

Jan 26, 2026

TransXSSM: A Hybrid Transformer State Space Model with Unified Rotary Position Embedding

Jun 12, 2025

Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting

May 26, 2025

Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture

Dec 16, 2024

Cheems: Wonderful Matrices More Efficient and More Effective Architecture

Jul 25, 2024

OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser

Jun 25, 2024