Qiquan Zhang

Time-Graph Frequency Representation with Singular Value Decomposition for Neural Speech Enhancement

Dec 24, 2024

SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model

Nov 12, 2024

Selective State Space Model for Monaural Speech Enhancement

Nov 09, 2024

Binaural Selective Attention Model for Target Speaker Extraction

Jun 18, 2024

An Exploration of Length Generalization in Transformer-Based Speech Enhancement

Jun 17, 2024

Mamba in Speech: Towards an Alternative to Self-Attention

May 22, 2024

When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection

Feb 17, 2024

Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

Feb 16, 2024

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement

Jan 18, 2024

EEG-Derived Voice Signature for Attended Speaker Detection

Aug 28, 2023