Abstract:Audio-language models (ALMs) process sounds to provide a linguistic description of sound-producing events and scenes. Recent advances in computing power and dataset creation have led to significant progress in this domain. This paper surveys existing datasets used for training audio-language models, emphasizing the recent trend towards using large, diverse datasets to enhance model performance. Key sources of these datasets include the Freesound platform and AudioSet that have contributed to the field's rapid growth. Although prior surveys primarily address techniques and training details, this survey categorizes and evaluates a wide array of datasets, addressing their origins, characteristics, and use cases. It also performs a data leak analysis to ensure dataset integrity and mitigate bias between datasets. This survey was conducted by analyzing research papers up to and including December 2023, and does not contain any papers after that period.
Abstract:Coronary artery disease (CAD) is one of the primary causes leading to death worldwide. Accurate extraction of individual arterial branches on invasive coronary angiograms (ICA) is important for stenosis detection and CAD diagnosis. However, deep learning-based models face challenges in generating semantic segmentation for coronary arteries due to the morphological similarity among different types of coronary arteries. To address this challenge, we propose an innovative approach using the hyper association graph-matching neural network with uncertainty quantification (HAGMN-UQ) for coronary artery semantic labeling on ICAs. The graph-matching procedure maps the arterial branches between two individual graphs, so that the unlabeled arterial segments are classified by the labeled segments, and the coronary artery semantic labeling is achieved. By incorporating the anatomical structural loss and uncertainty, our model achieved an accuracy of 0.9345 for coronary artery semantic labeling with a fast inference speed, leading to an effective and efficient prediction in real-time clinical decision-making scenarios.
Abstract:Semantic labeling of coronary arterial segments in invasive coronary angiography (ICA) is important for automated assessment and report generation of coronary artery stenosis in the computer-aided diagnosis of coronary artery disease (CAD). Inspired by the training procedure of interventional cardiologists for interpreting the structure of coronary arteries, we propose an association graph-based graph matching network (AGMN) for coronary arterial semantic labeling. We first extract the vascular tree from invasive coronary angiography (ICA) and convert it into multiple individual graphs. Then, an association graph is constructed from two individual graphs where each vertex represents the relationship between two arterial segments. Using the association graph, the AGMN extracts the vertex features by the embedding module, aggregates the features from adjacent vertices and edges by graph convolution network, and decodes the features to generate the semantic mappings between arteries. By learning the mapping of arterial branches between two individual graphs, the unlabeled arterial segments are classified by the labeled segments to achieve semantic labeling. A dataset containing 263 ICAs was employed to train and validate the proposed model, and a five-fold cross-validation scheme was performed. Our AGMN model achieved an average accuracy of 0.8264, an average precision of 0.8276, an average recall of 0.8264, and an average F1-score of 0.8262, which significantly outperformed existing coronary artery semantic labeling methods. In conclusion, we have developed and validated a new algorithm with high accuracy, interpretability, and robustness for coronary artery semantic labeling on ICAs.