Qinglin Zhang

Uni-Retrieval: A Multi-Style Retrieval Framework for STEM's Education

Feb 09, 2025
Via arXiv

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Jan 10, 2025
Via arXiv

OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

Oct 23, 2024
Via arXiv

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

Aug 22, 2024
Via arXiv

Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts

Aug 19, 2024
Via arXiv

Multimodal Fusion and Coherence Modeling for Video Topic Segmentation

Aug 01, 2024
Via arXiv

Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers

Jun 17, 2024
Via arXiv

Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

Jan 11, 2024
Via arXiv

Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token Based ASR

Nov 08, 2023
Via arXiv

Improving Long Document Topic Segmentation Models With Enhanced Coherence Modeling

Oct 23, 2023
Via arXiv