Shaobo Ma

FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization

May 25, 2025
Via arXiv

FlashForge: Ultra-Efficient Prefix-Aware Attention for LLM Decoding

May 23, 2025
Via arXiv

Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores

Sep 26, 2024
Via arXiv

Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment

Jul 16, 2024
Via arXiv