Xinhan Di

Attentional Triple-Encoder Network in Spatiospectral Domains for Medical Image Segmentation

Mar 20, 2025
Via arXiv

Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search

Jan 02, 2025
Via arXiv

Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning

Dec 23, 2024
Via arXiv

Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio

Dec 23, 2024
Via arXiv

Low-Rank Adaptation with Task-Relevant Feature Enhancement for Fine-tuning Language Models

Dec 13, 2024
Via arXiv

YingSound: Video-Guided Sound Effects Generation with Multi-modal Chain-of-Thought Controls

Dec 12, 2024
Via arXiv

Multi-Stage Graph Learning for fMRI Analysis to Diagnose Neuro-Developmental Disorders

Oct 07, 2024
Via arXiv

OCC-MLLM: Empowering Multimodal Large Language Model For the Understanding of Occluded Objects

Oct 02, 2024
Via arXiv

OCC-MLLM-Alpha: Empowering Multi-modal Large Language Model for the Understanding of Occluded Objects with Self-Supervised Test-Time Learning

Oct 02, 2024
Via arXiv

Towards Full-parameter and Parameter-efficient Self-learning For Endoscopic Camera Depth Estimation

Oct 01, 2024
Via arXiv