Wonyong Sung

Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization

Nov 09, 2023

Token-Scaled Logit Distillation for Ternary Weight Generative Language Models

Aug 13, 2023

Teacher Intervention: Improving Convergence of Quantization Aware Training for Ultra-Low Precision Transformers

Feb 23, 2023

Sleep Model -- A Sequence Model for Predicting the Next Sleep Stage

Feb 17, 2023

Exploring Attention Map Reuse for Efficient Transformer Neural Networks

Jan 29, 2023

Macro-block dropout for improved regularization in training end-to-end speech recognition models

Dec 29, 2022

A Comparison of Transformer, Convolutional, and Recurrent Neural Networks on Phoneme Recognition

Oct 01, 2022

Korean Tokenization for Beam Search Rescoring in Speech Recognition

Mar 28, 2022

Similarity and Content-based Phonetic Self Attention for Speech Recognition

Mar 28, 2022

Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling

Oct 07, 2021