Action Detection


$π_{0.5}$: a Vision-Language-Action Model with Open-World Generalization

Apr 22, 2025
Via arXiv

Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection

Apr 20, 2025
Via arXiv

Talk is Not Always Cheap: Promoting Wireless Sensing Models with Text Prompts

Apr 22, 2025
Via arXiv

UFO2: The Desktop AgentOS

Apr 20, 2025
Via arXiv

Real-Time Sentiment Insights from X Using VADER, DistilBERT, and Web-Scraped Data

Apr 21, 2025
Via arXiv

Neglected Risks: The Disturbing Reality of Children's Images in Datasets and the Urgent Call for Accountability

Apr 20, 2025
Via arXiv

Balancing Privacy and Action Performance: A Penalty-Driven Approach to Image Anonymization

Apr 19, 2025
Via arXiv

Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization

Apr 18, 2025
Via arXiv

Tackling Social Bias against the Poor: A Dataset and Taxonomy on Aporophobia

Apr 17, 2025
Via arXiv

AskQE: Question Answering as Automatic Evaluation for Machine Translation

Apr 15, 2025
Via arXiv