Liangzhen Lai

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Apr 29, 2024
Via arXiv

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Feb 22, 2024
Via arXiv

Not All Weights Are Created Equal: Enhancing Energy Efficiency in On-Device Streaming Speech Recognition

Feb 20, 2024
Via arXiv

Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition

Sep 21, 2023
Via arXiv

SDRM3: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads

Dec 07, 2022
Via arXiv

XRBench: An Extended Reality (XR) Machine Learning Benchmark Suite for the Metaverse

Nov 16, 2022
Via arXiv

Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation

Nov 23, 2021
Via arXiv

Low-Rank+Sparse Tensor Compression for Neural Networks

Nov 02, 2021
Via arXiv

Improving Efficiency in Neural Network Accelerator Using Operands Hamming Distance optimization

Feb 13, 2020
Via arXiv

Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks

Feb 10, 2020
Via arXiv