Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shenqi Jing

MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining

Jan 27, 2025

Ruiqi Wu, Na Su, Chenran Zhang, Tengfei Ma, Tao Zhou, Zhiting Cui, Nianfeng Tang, Tianyu Mao, Yi Zhou, Wen Fan(+3 more)

Figure 1 for MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining

Figure 2 for MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining

Figure 3 for MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining

Figure 4 for MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining

Abstract:Vision-language pretraining (VLP) has been investigated to generalize across diverse downstream tasks for fundus image analysis. Although recent methods showcase promising achievements, they significantly rely on large-scale private image-text data but pay less attention to the pretraining manner, which limits their further advancements. In this work, we introduce MM-Retinal V2, a high-quality image-text paired dataset comprising CFP, FFA, and OCT image modalities. Then, we propose a novel fundus vision-language pretraining model, namely KeepFIT V2, which is pretrained by integrating knowledge from the elite data spark into categorical public datasets. Specifically, a preliminary textual pretraining is adopted to equip the text encoder with primarily ophthalmic textual knowledge. Moreover, a hybrid image-text knowledge injection module is designed for knowledge transfer, which is essentially based on a combination of global semantic concepts from contrastive learning and local appearance details from generative learning. Extensive experiments across zero-shot, few-shot, and linear probing settings highlight the generalization and transferability of KeepFIT V2, delivering performance competitive to state-of-the-art fundus VLP models trained on large-scale private image-text datasets. Our dataset and model are publicly available via https://github.com/lxirich/MM-Retinal.

Via

Access Paper or Ask Questions

AsdKB: A Chinese Knowledge Base for the Early Screening and Diagnosis of Autism Spectrum Disorder

Aug 02, 2023

Tianxing Wu, Xudong Cao, Yipeng Zhu, Feiyue Wu, Tianling Gong, Yuxiang Wang, Shenqi Jing

Abstract:To easily obtain the knowledge about autism spectrum disorder and help its early screening and diagnosis, we create AsdKB, a Chinese knowledge base on autism spectrum disorder. The knowledge base is built on top of various sources, including 1) the disease knowledge from SNOMED CT and ICD-10 clinical descriptions on mental and behavioural disorders, 2) the diagnostic knowledge from DSM-5 and different screening tools recommended by social organizations and medical institutes, and 3) the expert knowledge on professional physicians and hospitals from the Web. AsdKB contains both ontological and factual knowledge, and is accessible as Linked Data at https://w3id.org/asdkb/. The potential applications of AsdKB are question answering, auxiliary diagnosis, and expert recommendation, and we illustrate them with a prototype which can be accessed at http://asdkb.org.cn/.

* 17 pages, Accepted by the Resource Track of ISWC 2023

Via

Access Paper or Ask Questions