Abstract:Zero-shot learning offers an efficient solution for a machine learning model to treat unseen categories, avoiding exhaustive data collection. Zero-shot Sketch-based Image Retrieval (ZS-SBIR) simulates real-world scenarios where it is hard and costly to collect paired sketch-photo samples. We propose a novel framework that indirectly aligns sketches and photos by contrasting them through texts, removing the necessity of access to sketch-photo pairs. With an explicit modality encoding learned from data, our approach disentangles modality-agnostic semantics from modality-specific information, bridging the modality gap and enabling effective cross-modal content retrieval within a joint latent space. From comprehensive experiments, we verify the efficacy of the proposed model on ZS-SBIR, and it can be also applied to generalized and fine-grained settings.
Abstract:The field of dentistry is in the era of digital transformation. Particularly, artificial intelligence is anticipated to play a significant role in digital dentistry. AI holds the potential to significantly assist dental practitioners and elevate diagnostic accuracy. In alignment with this vision, the 2023 MICCAI DENTEX challenge aims to enhance the performance of dental panoramic X-ray diagnosis and enumeration through technological advancement. In response, we introduce DETDet, a Dual Ensemble Teeth Detection network. DETDet encompasses two distinct modules dedicated to enumeration and diagnosis. Leveraging the advantages of teeth mask data, we employ Mask-RCNN for the enumeration module. For the diagnosis module, we adopt an ensemble model comprising DiffusionDet and DINO. To further enhance precision scores, we integrate a complementary module to harness the potential of unlabeled data. The code for our approach will be made accessible at https://github.com/Bestever-choi/Evident