Yitian Yuan

VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models

Oct 15, 2024
Via arXiv

3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance

Jul 13, 2024
Via arXiv

Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models

Jun 12, 2024
Via arXiv

Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment

Dec 15, 2023
Via arXiv

A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach

Mar 10, 2022
Via arXiv

Controllable Video Captioning with an Exemplar Sentence

Dec 02, 2021
Via arXiv

Syntax Customized Video Captioning by Imitating Exemplar Sentences

Dec 02, 2021
Via arXiv

A Survey on Temporal Sentence Grounding in Videos

Sep 17, 2021
Via arXiv

A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics

Jan 27, 2021
Via arXiv

Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos

Oct 31, 2019
Via arXiv