Video Semantic Segmentation


Video semantic segmentation is the task of assigning a semantic class label to every pixel in each frame of a video.

Memory Storyboard: Leveraging Temporal Segmentation for Streaming Self-Supervised Learning from Egocentric Videos

Jan 21, 2025
Via arXiv

Scaling up self-supervised learning for improved surgical foundation models

Jan 16, 2025
Via arXiv

Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks

Jan 17, 2025
Via arXiv

Aligning First, Then Fusing: A Novel Weakly Supervised Multimodal Violence Detection Method

Jan 13, 2025
Via arXiv

AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation

Jan 14, 2025
Via arXiv

OCORD: Open-Campus Object Removal Dataset

Jan 13, 2025
Via arXiv

Multilevel Semantic-Aware Model for AI-Generated Video Quality Assessment

Jan 06, 2025
Via arXiv

LiDAR-Camera Fusion for Video Panoptic Segmentation without Video Training

Dec 30, 2024
Via arXiv

PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM

Dec 31, 2024
Via arXiv

PG-SAG: Parallel Gaussian Splatting for Fine-Grained Large-Scale Urban Buildings Reconstruction via Semantic-Aware Grouping

Jan 03, 2025
Via arXiv