Mahmoud Ahmed

Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding

May 29, 2024

The Right Losses for the Right Gains: Improving the Semantic Consistency of Deep Text-to-Image Generation with Distribution-Sensitive Losses

Dec 18, 2023

3DCoMPaT$^{++}$: An improved Large-scale 3D Vision Dataset for Compositional Recognition

Oct 27, 2023

CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding

Oct 10, 2023