Wenyong Huang

YODA: Teacher-Student Progressive Learning for Language Models

Jan 28, 2024
Via arXiv

Improving End-to-End Speech Processing by Efficient Text Data Utilization with Latent Synthesis

Oct 24, 2023
Via arXiv

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

Oct 20, 2023
Via arXiv

SELF: Language-Driven Self-Evolution for Large Language Model

Oct 07, 2023
Via arXiv

Aligning Large Language Models with Human: A Survey

Jul 24, 2023
Via arXiv

SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training

Jan 29, 2022
Via arXiv

CCA-MDD: A Coupled Cross-Attention based Framework for Streaming Mispronunciation detection and diagnosis

Nov 16, 2021
Via arXiv

Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition

Aug 13, 2020
Via arXiv

NEZHA: Neural Contextualized Representation for Chinese Language Understanding

Sep 05, 2019
Via arXiv