Sijie Cheng

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

Oct 15, 2024

Instruction-Guided Visual Masking

May 30, 2024

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

May 24, 2024

StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models

Mar 13, 2024

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

Feb 28, 2024

DEEM: Dynamic Experienced Expert Modeling for Stance Detection

Feb 23, 2024

Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language Conversion for Language Models

Jan 22, 2024

Can Vision-Language Models Think from a First-Person Perspective?

Nov 27, 2023

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

Sep 20, 2023

Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks

May 28, 2023