Picture for Lidong Bing

Lidong Bing

ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark

Add code
Jan 09, 2025
Figure 1 for ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
Figure 2 for ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
Figure 3 for ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
Figure 4 for ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
Viaarxiv icon

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Add code
Jan 08, 2025
Viaarxiv icon

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Add code
Jan 03, 2025
Figure 1 for 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Figure 2 for 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Figure 3 for 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Figure 4 for 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Viaarxiv icon

M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework

Add code
Nov 09, 2024
Viaarxiv icon

Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents

Add code
Oct 25, 2024
Figure 1 for Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
Figure 2 for Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
Figure 3 for Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
Figure 4 for Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
Viaarxiv icon

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Add code
Oct 22, 2024
Figure 1 for Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
Figure 2 for Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
Figure 3 for Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
Figure 4 for Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
Viaarxiv icon

Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents

Add code
Oct 17, 2024
Figure 1 for Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents
Figure 2 for Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents
Figure 3 for Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents
Figure 4 for Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents
Viaarxiv icon

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Add code
Oct 16, 2024
Figure 1 for The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Figure 2 for The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Figure 3 for The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Figure 4 for The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Viaarxiv icon

Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective

Add code
Oct 16, 2024
Figure 1 for Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Figure 2 for Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Figure 3 for Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Figure 4 for Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Viaarxiv icon

Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths

Add code
Oct 07, 2024
Viaarxiv icon