Jiajun Deng

Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR

Sep 13, 2024

RayFormer: Improving Query-Based Multi-Camera 3D Object Detection via Ray-Centric Strategies

Jul 27, 2024

Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation

Jul 08, 2024

Described Spatial-Temporal Video Detection

Jul 08, 2024

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Jul 02, 2024

One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model

Jun 14, 2024

Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask

Jun 14, 2024

Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition

Jun 14, 2024

HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation

Mar 18, 2024

Agent3D-Zero: An Agent for Zero-shot 3D Understanding

Mar 18, 2024