
Alexandros Kouris

Progressive Mixed-Precision Decoding for Efficient LLM Inference
Oct 17, 2024

Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads
Oct 17, 2023

The Future of Consumer Edge-AI Computing
Oct 19, 2022

Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs
Sep 27, 2022

Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Sep 20, 2022

Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions
Jun 09, 2021

Multi-Exit Semantic Segmentation Networks
Jun 07, 2021

Approximate LSTMs for Time-Constrained Inference: Enabling Fast Reaction in Self-Driving Cars
May 02, 2019

CascadeCNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks
Jul 13, 2018

Deploying Deep Neural Networks in the Embedded Space
Jun 22, 2018