Wonyong Sung

Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization

Nov 09, 2023

Token-Scaled Logit Distillation for Ternary Weight Generative Language Models

Aug 13, 2023

Teacher Intervention: Improving Convergence of Quantization Aware Training for Ultra-Low Precision Transformers

Feb 23, 2023

Sleep Model -- A Sequence Model for Predicting the Next Sleep Stage

Feb 17, 2023

Exploring Attention Map Reuse for Efficient Transformer Neural Networks

Jan 29, 2023

Macro-block dropout for improved regularization in training end-to-end speech recognition models

Dec 29, 2022

A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition

Oct 01, 2022

Korean Tokenization for Beam Search Rescoring in Speech Recognition

Mar 28, 2022

Similarity and Content-based Phonetic Self Attention for Speech Recognition

Mar 28, 2022

Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling

Oct 07, 2021