Stylianos I. Venieris

FedP$^2$EFT: Federated Learning to Personalize Parameter Efficient Fine-Tuning for Multilingual LLMs

Feb 05, 2025

MultiTASC++: A Continuously Adaptive Scheduler for Edge-Based Multi-Device Cascade Inference

Dec 05, 2024

Progressive Mixed-Precision Decoding for Efficient LLM Inference

Oct 17, 2024

CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads

Sep 02, 2024

Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference

May 28, 2024

LifeLearner: Hardware-Aware Meta Continual Learning System for Embedded Computing Platforms

Nov 19, 2023

Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads

Oct 17, 2023

Mitigating Memory Wall Effects in CNN Engines with On-the-Fly Weights Generation

Jul 25, 2023

TinyTrain: Deep Neural Network Training at the Extreme Edge

Jul 19, 2023

MultiTASC: A Multi-Tenancy-Aware Scheduler for Cascaded DNN Inference at the Consumer Edge

Jun 22, 2023