Qin Jin

Renmin University of China

Quo Vadis, Motion Generation? From Large Language Models to Large Motion Models

Oct 04, 2024

Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues

Sep 29, 2024

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Sep 24, 2024

Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm

Sep 11, 2024

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

Sep 05, 2024

What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation

Aug 26, 2024

QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds

Jun 24, 2024

UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos

Jun 24, 2024

SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models

Jun 20, 2024

SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction

Jun 16, 2024