Minh Tran

Discrete Facial Encoding: A Framework for Data-driven Facial Display Discovery

Oct 02, 2025

Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice

Aug 24, 2025

A Robust BERT-Based Deep Learning Model for Automated Cancer Type Extraction from Unstructured Pathology Reports

Aug 21, 2025

DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion

Apr 05, 2025

Negative to Positive Co-learning with Aggressive Modality Dropout

Jan 01, 2025

A2VIS: Amodal-Aware Approach to Video Instance Segmentation

Dec 02, 2024

Amodal Instance Segmentation with Diffusion Shape Prior Estimation

Sep 26, 2024

HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model

Jun 01, 2024

S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

May 07, 2024

CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect

Apr 17, 2024