Yanpeng Sun

Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs

Jan 11, 2025

Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception

Dec 18, 2024

Continual SFT Matches Multimodal RLHF with Negative Supervision

Nov 22, 2024

Improving Multi-modal Large Language Model through Boosting Vision Capabilities

Oct 17, 2024

CSGO: Content-Style Composition in Text-to-Image Generation

Sep 04, 2024

VRP-SAM: SAM with Visual Reference Prompt

Feb 27, 2024

Exploring Effective Factors for Improving Visual In-Context Learning

Apr 10, 2023

Self-Supervised Guided Segmentation Framework for Unsupervised Anomaly Detection

Sep 26, 2022

Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning

Jun 13, 2022

SSA: Semantic Structure Aware Inference for Weakly Pixel-Wise Dense Predictions without Cost

Nov 05, 2021