Ya Jing

GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation

Oct 08, 2024

Knowledge Boundary and Persona Dynamic Shape A Better Social Media Agent

Apr 02, 2024

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

Dec 21, 2023

Vision-Language Foundation Models as Effective Robot Imitators

Nov 06, 2023

MOMA-Force: Visual-Force Imitation for Real-World Mobile Manipulation

Aug 07, 2023

Exploring Visual Pre-training for Robot Manipulation: Datasets, Models and Methods

Aug 07, 2023

Learning to Explore Informative Trajectories and Samples for Embodied Perception

Mar 20, 2023

Towards Unifying Reference Expression Generation and Comprehension

Oct 24, 2022

Locate then Segment: A Strong Pipeline for Referring Image Segmentation

Mar 30, 2021

Cascade Attention Network for Person Search: Both Image and Text-Image Similarity Selection

Sep 22, 2018