Runsheng Wang

Attention Sink Forges Native MoE in Attention Layers: Sink-Aware Training to Address Head Collapse

Feb 01, 2026

ReactEMG Stroke: Healthy-to-Stroke Few-shot Adaptation for sEMG-Based Intent Detection

Jan 29, 2026

AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM

Nov 06, 2025

ReactEMG: Zero-Shot, Low-Latency Intent Detection via sEMG

Jun 24, 2025

HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference

Apr 08, 2025

LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design

Feb 21, 2025

AnalogXpert: Automating Analog Topology Synthesis by Incorporating Circuit Design Expertise into Large Language Models

Dec 17, 2024

MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers

Oct 23, 2024

PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization

Oct 12, 2024

AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference

Aug 19, 2024