Shwai He

SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning

Apr 14, 2025

Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts

Mar 07, 2025

Fair Diagnosis: Leveraging Causal Modeling to Mitigate Medical Bias

Dec 06, 2024

Towards Counterfactual Fairness through Auxiliary Variables

Dec 06, 2024

Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers

Oct 17, 2024

What Matters in Transformers? Not All Attention is Needed

Jun 22, 2024

Demystifying the Compression of Mixture-of-Experts Through a Unified Framework

Jun 04, 2024

Loki: Low-Rank Keys for Efficient Sparse Attention

Jun 04, 2024

RESSA: Repair Sparse Vision-Language Models via Sparse Cross-Modality Adaptation

Apr 03, 2024

Reformatted Alignment

Feb 19, 2024