Yongliang Shen

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Jan 03, 2025

MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation

Dec 28, 2024

GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation

Oct 15, 2024

Entering Real Social World! Benchmarking the Theory of Mind and Socialization Capabilities of LLMs from a First-person Perspective

Oct 08, 2024

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

Jul 10, 2024

TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind

Jul 01, 2024

Advancing Process Verification for Large Language Models via Tree-Based Preference Learning

Jun 29, 2024

Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization

Feb 27, 2024

EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction

Jan 11, 2024

Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives

Jan 04, 2024