
Hongxiang Fan

Progressive Mixed-Precision Decoding for Efficient LLM Inference (Oct 17, 2024)

Accelerating MRI Uncertainty Estimation with Mask-based Bayesian Neural Network (Jul 07, 2024)

Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA (Jun 24, 2024)

Hardware-Aware Neural Dropout Search for Reliable Uncertainty Prediction on FPGA (Jun 23, 2024)

Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference (May 28, 2024)

SAE: Single Architecture Ensemble Neural Networks (Feb 09, 2024)

Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads (Oct 17, 2023)

When Monte-Carlo Dropout Meets Multi-Exit: Optimizing Bayesian Neural Networks on FPGA (Aug 13, 2023)

LL-GNN: Low Latency Graph Neural Networks on FPGAs for Particle Detectors (Oct 11, 2022)

Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design (Sep 20, 2022)