Pengzhen Ren

InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction

Dec 08, 2024

PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation

Oct 14, 2024

OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

Jul 10, 2024

MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation

Aug 09, 2023

RM-PRT: Realistic Robotic Manipulation Simulator and Benchmark with Progressive Reasoning Tasks

Jun 21, 2023

CapDet: Unifying Dense Captioning and Open-World Detection Pretraining

Mar 15, 2023

Beyond Fixation: Dynamic Window Visual Transformer

Apr 08, 2022

Person Search Challenges and Solutions: A Survey

May 01, 2021

Scene Graphs: A Survey of Generations and Applications

Mar 17, 2021

NAS-TC: Neural Architecture Search on Temporal Convolutions for Complex Action Recognition

Mar 17, 2021