Qinglin Zhang

OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

Oct 23, 2024
Via arXiv

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

Aug 22, 2024
Via arXiv

Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts

Aug 19, 2024
Via arXiv

Multimodal Fusion and Coherence Modeling for Video Topic Segmentation

Aug 01, 2024
Via arXiv

Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers

Jun 17, 2024
Via arXiv

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

Jan 11, 2024
Via arXiv

Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token Based ASR

Nov 08, 2023
Via arXiv

Improving Long Document Topic Segmentation Models With Enhanced Coherence Modeling

Oct 23, 2023
Via arXiv

Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation

Sep 19, 2023
Via arXiv

Improving BERT with Hybrid Pooling Network and Drop Mask

Jul 14, 2023
Via arXiv