Kun Zhou

Do we Really Need Visual Instructions? Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models

Feb 17, 2025
Via arXiv

Conditional Latent Diffusion-Based Speech Enhancement Via Dual Context Learning

Jan 17, 2025

HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution

Jan 17, 2025

YuLan-Mini: An Open Data-efficient Language Model

Dec 24, 2024

Hierarchical Control of Emotion Rendering in Speech Synthesis

Dec 17, 2024

RETQA: A Large-Scale Open-Domain Tabular Question Answering Dataset for Real Estate Sector

Dec 13, 2024

MaterialPicker: Multi-Modal Material Generation with Diffusion Transformers

Dec 04, 2024

Enhancing Visual Reasoning with Autonomous Imagination in Multimodal Large Language Models

Nov 27, 2024

ARM: Appearance Reconstruction Model for Relightable 3D Generation

Nov 16, 2024

Self-Calibrated Listwise Reranking with Large Language Models

Nov 07, 2024