Sixun Dong

Michael

MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

Jan 24, 2024
Via arXiv

RoomDesigner: Encoding Anchor-latents for Style-consistent and Shape-compatible Indoor Scene Generation

Oct 16, 2023
Via arXiv

Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos

Mar 28, 2023
Via arXiv

TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting

Apr 03, 2022
Via arXiv