
Yida Zhao

WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

Sep 16, 2025
Via arXiv

ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization

Sep 16, 2025
Via arXiv

EvolveSearch: An Iterative Self-Evolving Search Agent

May 28, 2025
Via arXiv

Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models

Jul 24, 2024
Via arXiv

Progressive Learning for Image Retrieval with Hybrid-Modality Queries

Apr 24, 2022
Via arXiv

WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training

Mar 19, 2021
Via arXiv

The End-of-End-to-End: A Video Understanding Pentathlon Challenge

Aug 03, 2020
Via arXiv

Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning

Jun 14, 2020
Via arXiv

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning

Mar 01, 2020
Via arXiv

Integrating Temporal and Spatial Attentions for VATEX Video Captioning Challenge 2019

Oct 15, 2019
Via arXiv