Feng Cheng

Helen

TimeRefine: Temporal Grounding with Time Refining Video LLM

Dec 12, 2024
Via arXiv

Towards Automated Model Design on Recommender Systems

Nov 12, 2024
Via arXiv

A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models

Oct 08, 2024
Via arXiv

Latent Diffusion Model-Enabled Real-Time Semantic Communication Considering Semantic Ambiguities and Channel Noises

Jun 09, 2024
Via arXiv

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos

May 29, 2024
Via arXiv

DAM: Dynamic Adapter Merging for Continual Video QA Learning

Mar 13, 2024
Via arXiv

Large Language Models in Cybersecurity: State-of-the-Art

Jan 30, 2024
Via arXiv

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023
Via arXiv

Unified Coarse-to-Fine Alignment for Video-Text Retrieval

Sep 18, 2023
Via arXiv

LoCoNet: Long-Short Context Network for Active Speaker Detection

Jan 19, 2023
Via arXiv