Xinhan Di

Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search

Jan 02, 2025
Via arXiv

Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio

Dec 23, 2024
Via arXiv

Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning

Dec 23, 2024
Via arXiv

Low-Rank Adaptation with Task-Relevant Feature Enhancement for Fine-tuning Language Models

Dec 13, 2024
Via arXiv

YingSound: Video-Guided Sound Effects Generation with Multi-modal Chain-of-Thought Controls

Dec 12, 2024
Via arXiv

Multi-Stage Graph Learning for fMRI Analysis to Diagnose Neuro-Developmental Disorders

Oct 07, 2024
Via arXiv

OCC-MLLM: Empowering Multimodal Large Language Model For the Understanding of Occluded Objects

Oct 02, 2024
Via arXiv

OCC-MLLM-Alpha: Empowering Multi-modal Large Language Model for the Understanding of Occluded Objects with Self-Supervised Test-Time Learning

Oct 02, 2024
Via arXiv

Towards Full-parameter and Parameter-efficient Self-learning For Endoscopic Camera Depth Estimation

Oct 01, 2024
Via arXiv

Self-Supervised Learning of Deviation in Latent Representation for Co-speech Gesture Video Generation

Sep 26, 2024
Via arXiv