Jianing Yang

LIFT-GS: Cross-Scene Render-Supervised Distillation for 3D Language Grounding

Feb 27, 2025

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Jan 23, 2025

Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use

Oct 31, 2024

Multi-Object Hallucination in Vision-Language Models

Jul 08, 2024

Improving Autoregressive Training with Dynamic Oracles

Jun 13, 2024

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Jun 12, 2024

3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs

Jun 07, 2024

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Sep 21, 2023

DANLI: Deliberative Agent for Following Natural Language Instructions

Oct 22, 2022

MTGAT: Multimodal Temporal Graph Attention Networks for Unaligned Human Multimodal Language Sequences

Oct 22, 2020