Jianhua Tao

Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking

Feb 04, 2025

DReSS: Data-driven Regularized Structured Streamlining for Large Language Models

Jan 29, 2025

MTPareto: A MultiModal Targeted Pareto Framework for Fake News Detection

Jan 12, 2025

BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement

Dec 26, 2024

Region-Based Optimization in Continual Learning for Audio Deepfake Detection

Dec 16, 2024

Reject Threshold Adaptation for Open-Set Model Attribution of Deepfake Audio

Dec 02, 2024

Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

Nov 27, 2024

LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis

Nov 24, 2024

DARNet: Dual Attention Refinement Network with Spatiotemporal Construction for Auditory Attention Detection

Oct 15, 2024

WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification

Sep 18, 2024