Qian Yang

Assessing and Learning Alignment of Unimodal Vision and Language Models

Dec 05, 2024

WavChat: A Survey of Spoken Dialogue Models

Nov 26, 2024

Thoughtful Adoption of NLP for Civic Participation: Understanding Differences Among Policymakers

Oct 30, 2024

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

Aug 29, 2024

MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis

Jul 19, 2024

Qwen2-Audio Technical Report

Jul 15, 2024

Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison

Jul 10, 2024

CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification

Apr 30, 2024

A Piece of Theatre: Investigating How Teachers Design LLM Chatbots to Assist Adolescent Cyberbullying Education

Feb 27, 2024

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Feb 12, 2024