Picture for Jing Pan

Jing Pan

School of Art, Design and Architecture, Monash University, Melbourne, Australia

Equipping LLM with Directional Multi-Talker Speech Understanding Capabilities

Add code
Feb 06, 2026
Viaarxiv icon

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Add code
Mar 03, 2025
Figure 1 for Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Figure 2 for Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Figure 3 for Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Figure 4 for Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Viaarxiv icon

Target word activity detector: An approach to obtain ASR word boundaries without lexicon

Add code
Sep 20, 2024
Viaarxiv icon

CogniDual Framework: Self-Training Large Language Models within a Dual-System Theoretical Framework for Improving Cognitive Tasks

Add code
Sep 05, 2024
Figure 1 for CogniDual Framework: Self-Training Large Language Models within a Dual-System Theoretical Framework for Improving Cognitive Tasks
Figure 2 for CogniDual Framework: Self-Training Large Language Models within a Dual-System Theoretical Framework for Improving Cognitive Tasks
Viaarxiv icon

Promoting Equality in Large Language Models: Identifying and Mitigating the Implicit Bias based on Bayesian Theory

Add code
Aug 20, 2024
Viaarxiv icon

WavLLM: Towards Robust and Adaptive Speech Large Language Model

Add code
Mar 31, 2024
Viaarxiv icon

COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

Add code
Nov 03, 2023
Figure 1 for COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
Figure 2 for COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
Figure 3 for COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
Figure 4 for COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning
Viaarxiv icon

Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach

Add code
Oct 06, 2023
Figure 1 for Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach
Figure 2 for Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach
Figure 3 for Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach
Figure 4 for Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach
Viaarxiv icon

E-Branchformer: Branchformer with Enhanced merging for speech recognition

Add code
Sep 30, 2022
Figure 1 for E-Branchformer: Branchformer with Enhanced merging for speech recognition
Figure 2 for E-Branchformer: Branchformer with Enhanced merging for speech recognition
Figure 3 for E-Branchformer: Branchformer with Enhanced merging for speech recognition
Figure 4 for E-Branchformer: Branchformer with Enhanced merging for speech recognition
Viaarxiv icon

SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

Add code
Oct 11, 2021
Figure 1 for SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Figure 2 for SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Figure 3 for SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Figure 4 for SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition
Viaarxiv icon