Diandian Guo

Multi-View Incongruity Learning for Multimodal Sarcasm Detection

Dec 01, 2024
Via arXiv

Can Multimodal Large Language Model Think Analogically?

Nov 02, 2024
Via arXiv

Surgical Workflow Recognition and Blocking Effectiveness Detection in Laparoscopic Liver Resections with Pringle Maneuver

Aug 21, 2024
Via arXiv

Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms

Apr 14, 2024
Via arXiv

S²Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR

Feb 22, 2024
Via arXiv

Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

Jan 27, 2024
Via arXiv

A Semi-Paired Approach For Label-to-Image Translation

Jun 26, 2023
Via arXiv

Towards Pragmatic Semantic Image Synthesis for Urban Scenes

May 16, 2023
Via arXiv