Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tsung-Ying Ho

A Continual Learning-driven Model for Accurate and Generalizable Segmentation of Clinically Comprehensive and Fine-grained Whole-body Anatomies in CT

Mar 16, 2025

Dazhou Guo, Zhanghexuan Ji, Yanzhou Su, Dandan Zheng, Heng Guo, Puyang Wang, Ke Yan, Yirui Wang, Qinji Yu, Zi Li(+24 more)

Abstract:Precision medicine in the quantitative management of chronic diseases and oncology would be greatly improved if the Computed Tomography (CT) scan of any patient could be segmented, parsed and analyzed in a precise and detailed way. However, there is no such fully annotated CT dataset with all anatomies delineated for training because of the exceptionally high manual cost, the need for specialized clinical expertise, and the time required to finish the task. To this end, we proposed a novel continual learning-driven CT model that can segment complete anatomies presented using dozens of previously partially labeled datasets, dynamically expanding its capacity to segment new ones without compromising previously learned organ knowledge. Existing multi-dataset approaches are not able to dynamically segment new anatomies without catastrophic forgetting and would encounter optimization difficulty or infeasibility when segmenting hundreds of anatomies across the whole range of body regions. Our single unified CT segmentation model, CL-Net, can highly accurately segment a clinically comprehensive set of 235 fine-grained whole-body anatomies. Composed of a universal encoder, multiple optimized and pruned decoders, CL-Net is developed using 13,952 CT scans from 20 public and 16 private high-quality partially labeled CT datasets of various vendors, different contrast phases, and pathologies. Extensive evaluation demonstrates that CL-Net consistently outperforms the upper limit of an ensemble of 36 specialist nnUNets trained per dataset with the complexity of 5% model size and significantly surpasses the segmentation accuracy of recent leading Segment Anything-style medical image foundation models by large margins. Our continual learning-driven CL-Net model would lay a solid foundation to facilitate many downstream tasks of oncology and chronic diseases using the most widely adopted CT imaging.

Via

Access Paper or Ask Questions

Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation

Sep 04, 2022

Yu-Jen Chen, Wei-Hsiang Shen, Hao-Wei Chung, Jing-Hao Chiu, Da-Cheng Juan, Tsung-Ying Ho, Chi-Tung Cheng, Meng-Lin Li, Tsung-Yi Ho

Figure 1 for Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation

Figure 2 for Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation

Figure 3 for Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation

Figure 4 for Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation

Abstract:Medical report generation is a challenging task since it is time-consuming and requires expertise from experienced radiologists. The goal of medical report generation is to accurately capture and describe the image findings. Previous works pretrain their visual encoding neural networks with large datasets in different domains, which cannot learn general visual representation in the specific medical domain. In this work, we propose a medical report generation framework that uses a contrastive learning approach to pretrain the visual encoder and requires no additional meta information. In addition, we adopt lung segmentation as an augmentation method in the contrastive learning framework. This segmentation guides the network to focus on encoding the visual feature within the lung region. Experimental results show that the proposed framework improves the performance and the quality of the generated medical reports both quantitatively and qualitatively.

Via

Access Paper or Ask Questions

Comprehensive and Clinically Accurate Head and Neck Organs at Risk Delineation via Stratified Deep Learning: A Large-scale Multi-Institutional Study

Nov 01, 2021

Dazhou Guo, Jia Ge, Xianghua Ye, Senxiang Yan, Yi Xin, Yuchen Song, Bing-shen Huang, Tsung-Min Hung, Zhuotun Zhu, Ling Peng(+15 more)

Figure 1 for Comprehensive and Clinically Accurate Head and Neck Organs at Risk Delineation via Stratified Deep Learning: A Large-scale Multi-Institutional Study

Figure 2 for Comprehensive and Clinically Accurate Head and Neck Organs at Risk Delineation via Stratified Deep Learning: A Large-scale Multi-Institutional Study

Figure 3 for Comprehensive and Clinically Accurate Head and Neck Organs at Risk Delineation via Stratified Deep Learning: A Large-scale Multi-Institutional Study

Figure 4 for Comprehensive and Clinically Accurate Head and Neck Organs at Risk Delineation via Stratified Deep Learning: A Large-scale Multi-Institutional Study

Abstract:Accurate organ at risk (OAR) segmentation is critical to reduce the radiotherapy post-treatment complications. Consensus guidelines recommend a set of more than 40 OARs in the head and neck (H&N) region, however, due to the predictable prohibitive labor-cost of this task, most institutions choose a substantially simplified protocol by delineating a smaller subset of OARs and neglecting the dose distributions associated with other OARs. In this work we propose a novel, automated and highly effective stratified OAR segmentation (SOARS) system using deep learning to precisely delineate a comprehensive set of 42 H&N OARs. SOARS stratifies 42 OARs into anchor, mid-level, and small & hard subcategories, with specifically derived neural network architectures for each category by neural architecture search (NAS) principles. We built SOARS models using 176 training patients in an internal institution and independently evaluated on 1327 external patients across six different institutions. It consistently outperformed other state-of-the-art methods by at least 3-5% in Dice score for each institutional evaluation (up to 36% relative error reduction in other metrics). More importantly, extensive multi-user studies evidently demonstrated that 98% of the SOARS predictions need only very minor or no revisions for direct clinical acceptance (saving 90% radiation oncologists workload), and their segmentation and dosimetric accuracy are within or smaller than the inter-user variation. These findings confirmed the strong clinical applicability of SOARS for the OAR delineation process in H&N cancer radiotherapy workflows, with improved efficiency, comprehensiveness, and quality.

Via

Access Paper or Ask Questions

Multi-institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume using planning-CT and FDG-PETCT

Oct 11, 2021

Xianghua Ye, Dazhou Guo, Chen-kan Tseng, Jia Ge, Tsung-Min Hung, Ping-Ching Pai, Yanping Ren, Lu Zheng, Xinli Zhu, Ling Peng(+15 more)

Figure 1 for Multi-institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume using planning-CT and FDG-PETCT

Figure 2 for Multi-institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume using planning-CT and FDG-PETCT

Figure 3 for Multi-institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume using planning-CT and FDG-PETCT

Figure 4 for Multi-institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume using planning-CT and FDG-PETCT

Abstract:Background: The current clinical workflow for esophageal gross tumor volume (GTV) contouring relies on manual delineation of high labor-costs and interuser variability. Purpose: To validate the clinical applicability of a deep learning (DL) multi-modality esophageal GTV contouring model, developed at 1 institution whereas tested at multiple ones. Methods and Materials: We collected 606 esophageal cancer patients from four institutions. 252 institution-1 patients had a treatment planning-CT (pCT) and a pair of diagnostic FDG-PETCT; 354 patients from other 3 institutions had only pCT. A two-streamed DL model for GTV segmentation was developed using pCT and PETCT scans of a 148 patient institution-1 subset. This built model had the flexibility of segmenting GTVs via only pCT or pCT+PETCT combined. For independent evaluation, the rest 104 institution-1 patients behaved as unseen internal testing, and 354 institutions 2-4 patients were used for external testing. We evaluated manual revision degrees by human experts to assess the contour-editing effort. The performance of the deep model was compared against 4 radiation oncologists in a multiuser study with 20 random external patients. Contouring accuracy and time were recorded for the pre-and post-DL assisted delineation process. Results: Our model achieved high segmentation accuracy in internal testing (mean Dice score: 0.81 using pCT and 0.83 using pCT+PET) and generalized well to external evaluation (mean DSC: 0.80). Expert assessment showed that the predicted contours of 88% patients need only minor or no revision. In multi-user evaluation, with the assistance of a deep model, inter-observer variation and required contouring time were reduced by 37.6% and 48.0%, respectively. Conclusions: Deep learning predicted GTV contours were in close agreement with the ground truth and could be adopted clinically with mostly minor or no changes.

* 36 pages, 10 figures

Via

Access Paper or Ask Questions

Interactive Radiotherapy Target Delineation with 3D-Fused Context Propagation

Dec 12, 2020

Chun-Hung Chao, Hsien-Tzu Cheng, Tsung-Ying Ho, Le Lu, Min Sun

Figure 1 for Interactive Radiotherapy Target Delineation with 3D-Fused Context Propagation

Figure 2 for Interactive Radiotherapy Target Delineation with 3D-Fused Context Propagation

Figure 3 for Interactive Radiotherapy Target Delineation with 3D-Fused Context Propagation

Figure 4 for Interactive Radiotherapy Target Delineation with 3D-Fused Context Propagation

Abstract:Gross tumor volume (GTV) delineation on tomography medical imaging is crucial for radiotherapy planning and cancer diagnosis. Convolutional neural networks (CNNs) has been predominated on automatic 3D medical segmentation tasks, including contouring the radiotherapy target given 3D CT volume. While CNNs may provide feasible outcome, in clinical scenario, double-check and prediction refinement by experts is still necessary because of CNNs' inconsistent performance on unexpected patient cases. To provide experts an efficient way to modify the CNN predictions without retrain the model, we propose 3D-fused context propagation, which propagates any edited slice to the whole 3D volume. By considering the high-level feature maps, the radiation oncologists would only required to edit few slices to guide the correction and refine the whole prediction volume. Specifically, we leverage the backpropagation for activation technique to convey the user editing information backwardly to the latent space and generate new prediction based on the updated and original feature. During the interaction, our proposed approach reuses the extant extracted features and does not alter the existing 3D CNN model architectures, avoiding the perturbation on other predictions. The proposed method is evaluated on two published radiotherapy target contouring datasets of nasopharyngeal and esophageal cancer. The experimental results demonstrate that our proposed method is able to further effectively improve the existing segmentation prediction from different model architectures given oncologists' interactive inputs.

Via

Access Paper or Ask Questions

Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

Aug 29, 2020

Chun-Hung Chao, Zhuotun Zhu, Dazhou Guo, Ke Yan, Tsung-Ying Ho, Jinzheng Cai, Adam P. Harrison, Xianghua Ye, Jing Xiao, Alan Yuille(+3 more)

Figure 1 for Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

Figure 2 for Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

Figure 3 for Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

Figure 4 for Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

Abstract:Determining the spread of GTV$_{LN}$ is essential in defining the respective resection or irradiating regions for the downstream workflows of surgical resection and radiotherapy for many cancers. Different from the more common enlarged lymph node (LN), GTV$_{LN}$ also includes smaller ones if associated with high positron emission tomography signals and/or any metastasis signs in CT. This is a daunting task. In this work, we propose a unified LN appearance and inter-LN relationship learning framework to detect the true GTV$_{LN}$. This is motivated by the prior clinical knowledge that LNs form a connected lymphatic system, and the spread of cancer cells among LNs often follows certain pathways. Specifically, we first utilize a 3D convolutional neural network with ROI-pooling to extract the GTV$_{LN}$'s instance-wise appearance features. Next, we introduce a graph neural network to further model the inter-LN relationships where the global LN-tumor spatial priors are included in the learning process. This leads to an end-to-end trainable network to detect by classifying GTV$_{LN}$. We operate our model on a set of GTV$_{LN}$ candidates generated by a preliminary 1st-stage method, which has a sensitivity of $>85\%$ at the cost of high false positive (FP) ($>15$ FPs per patient). We validate our approach on a radiotherapy dataset with 142 paired PET/RTCT scans containing the chest and upper abdominal body parts. The proposed method significantly improves over the state-of-the-art (SOTA) LN classification method by $5.5\%$ and $13.1\%$ in F1 score and the averaged sensitivity value at $2, 3, 4, 6$ FPs per patient, respectively.

* Accepted to MICCAI 2020

Via

Access Paper or Ask Questions

Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

Aug 27, 2020

Zhuotun Zhu, Dakai Jin, Ke Yan, Tsung-Ying Ho, Xianghua Ye, Dazhou Guo, Chun-Hung Chao, Jing Xiao, Alan Yuille, Le Lu

Figure 1 for Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

Figure 2 for Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

Figure 3 for Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

Figure 4 for Lymph Node Gross Tumor Volume Detection and Segmentation via Distance-based Gating using 3D CT/PET Imaging in Radiotherapy

Abstract:Finding, identifying and segmenting suspicious cancer metastasized lymph nodes from 3D multi-modality imaging is a clinical task of paramount importance. In radiotherapy, they are referred to as Lymph Node Gross Tumor Volume (GTVLN). Determining and delineating the spread of GTVLN is essential in defining the corresponding resection and irradiating regions for the downstream workflows of surgical resection and radiotherapy of various cancers. In this work, we propose an effective distance-based gating approach to simulate and simplify the high-level reasoning protocols conducted by radiation oncologists, in a divide-and-conquer manner. GTVLN is divided into two subgroups of tumor-proximal and tumor-distal, respectively, by means of binary or soft distance gating. This is motivated by the observation that each category can have distinct though overlapping distributions of appearance, size and other LN characteristics. A novel multi-branch detection-by-segmentation network is trained with each branch specializing on learning one GTVLN category features, and outputs from multi-branch are fused in inference. The proposed method is evaluated on an in-house dataset of $141$ esophageal cancer patients with both PET and CT imaging modalities. Our results validate significant improvements on the mean recall from $72.5\%$ to $78.2\%$, as compared to previous state-of-the-art work. The highest achieved GTVLN recall of $82.5\%$ at $20\%$ precision is clinically relevant and valuable since human observers tend to have low sensitivity (around $80\%$ for the most experienced radiation oncologists, as reported by literature).

* MICCAI2020

Via

Access Paper or Ask Questions

Detecting Scatteredly-Distributed, Small, andCritically Important Objects in 3D OncologyImaging via Decision Stratification

May 27, 2020

Zhuotun Zhu, Ke Yan, Dakai Jin, Jinzheng Cai, Tsung-Ying Ho, Adam P Harrison, Dazhou Guo, Chun-Hung Chao, Xianghua Ye, Jing Xiao(+2 more)

Figure 1 for Detecting Scatteredly-Distributed, Small, andCritically Important Objects in 3D OncologyImaging via Decision Stratification

Figure 2 for Detecting Scatteredly-Distributed, Small, andCritically Important Objects in 3D OncologyImaging via Decision Stratification

Figure 3 for Detecting Scatteredly-Distributed, Small, andCritically Important Objects in 3D OncologyImaging via Decision Stratification

Figure 4 for Detecting Scatteredly-Distributed, Small, andCritically Important Objects in 3D OncologyImaging via Decision Stratification

Abstract:Finding and identifying scatteredly-distributed, small, and critically important objects in 3D oncology images is very challenging. We focus on the detection and segmentation of oncology-significant (or suspicious cancer metastasized) lymph nodes (OSLNs), which has not been studied before as a computational task. Determining and delineating the spread of OSLNs is essential in defining the corresponding resection/irradiating regions for the downstream workflows of surgical resection and radiotherapy of various cancers. For patients who are treated with radiotherapy, this task is performed by experienced radiation oncologists that involves high-level reasoning on whether LNs are metastasized, which is subject to high inter-observer variations. In this work, we propose a divide-and-conquer decision stratification approach that divides OSLNs into tumor-proximal and tumor-distal categories. This is motivated by the observation that each category has its own different underlying distributions in appearance, size and other characteristics. Two separate detection-by-segmentation networks are trained per category and fused. To further reduce false positives (FP), we present a novel global-local network (GLNet) that combines high-level lesion characteristics with features learned from localized 3D image patches. Our method is evaluated on a dataset of 141 esophageal cancer patients with PET and CT modalities (the largest to-date). Our results significantly improve the recall from $45\%$ to $67\%$ at $3$ FPs per patient as compared to previous state-of-the-art methods. The highest achieved OSLN recall of $0.828$ is clinically relevant and valuable.

* 14 pages, 4 Figures

Via

Access Paper or Ask Questions

Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

Apr 17, 2020

Dazhou Guo, Dakai Jin, Zhuotun Zhu, Tsung-Ying Ho, Adam P. Harrison, Chun-Hung Chao, Jing Xiao, Alan Yuille, Chien-Yu Lin, Le Lu

Figure 1 for Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

Figure 2 for Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

Figure 3 for Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

Figure 4 for Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search

Abstract:OAR segmentation is a critical step in radiotherapy of head and neck (H&N) cancer, where inconsistencies across radiation oncologists and prohibitive labor costs motivate automated approaches. However, leading methods using standard fully convolutional network workflows that are challenged when the number of OARs becomes large, e.g. > 40. For such scenarios, insights can be gained from the stratification approaches seen in manual clinical OAR delineation. This is the goal of our work, where we introduce stratified organ at risk segmentation (SOARS), an approach that stratifies OARs into anchor, mid-level, and small & hard (S&H) categories. SOARS stratifies across two dimensions. The first dimension is that distinct processing pipelines are used for each OAR category. In particular, inspired by clinical practices, anchor OARs are used to guide the mid-level and S&H categories. The second dimension is that distinct network architectures are used to manage the significant contrast, size, and anatomy variations between different OARs. We use differentiable neural architecture search (NAS), allowing the network to choose among 2D, 3D or Pseudo-3D convolutions. Extensive 4-fold cross-validation on 142 H&N cancer patients with 42 manually labeled OARs, the most comprehensive OAR dataset to date, demonstrates that both pipeline- and NAS-stratification significantly improves quantitative performance over the state-of-the-art (from 69.52% to 73.68% in absolute Dice scores). Thus, SOARS provides a powerful and principled means to manage the highly complex segmentation space of OARs.

* IEEE CVPR 2020

Via

Access Paper or Ask Questions

Deep Esophageal Clinical Target Volume Delineation using Encoded 3D Spatial Context of Tumors, Lymph Nodes, and Organs At Risk

Sep 06, 2019

Dakai Jin, Dazhou Guo, Tsung-Ying Ho, Adam P. Harrison, Jing Xiao, Chen-kan Tseng, Le Lu

Figure 1 for Deep Esophageal Clinical Target Volume Delineation using Encoded 3D Spatial Context of Tumors, Lymph Nodes, and Organs At Risk

Figure 2 for Deep Esophageal Clinical Target Volume Delineation using Encoded 3D Spatial Context of Tumors, Lymph Nodes, and Organs At Risk

Figure 3 for Deep Esophageal Clinical Target Volume Delineation using Encoded 3D Spatial Context of Tumors, Lymph Nodes, and Organs At Risk

Figure 4 for Deep Esophageal Clinical Target Volume Delineation using Encoded 3D Spatial Context of Tumors, Lymph Nodes, and Organs At Risk

Abstract:Clinical target volume (CTV) delineation from radiotherapy computed tomography (RTCT) images is used to define the treatment areas containing the gross tumor volume (GTV) and/or sub-clinical malignant disease for radiotherapy (RT). High intra- and inter-user variability makes this a particularly difficult task for esophageal cancer. This motivates automated solutions, which is the aim of our work. Because CTV delineation is highly context-dependent--it must encompass the GTV and regional lymph nodes (LNs) while also avoiding excessive exposure to the organs at risk (OARs)--we formulate it as a deep contextual appearance-based problem using encoded spatial contexts of these anatomical structures. This allows the deep network to better learn from and emulate the margin- and appearance-based delineation performed by human physicians. Additionally, we develop domain-specific data augmentation to inject robustness to our system. Finally, we show that a simple 3D progressive holistically nested network (PHNN), which avoids computationally heavy decoding paths while still aggregating features at different levels of context, can outperform more complicated networks. Cross-validated experiments on a dataset of 135 esophageal cancer patients demonstrate that our encoded spatial context approach can produce concrete performance improvements, with an average Dice score of 83.9% and an average surface distance of 4.2 mm, representing improvements of 3.8% and 2.4 mm, respectively, over the state-of-the-art approach.

* MICCAI 2019 (early accept)

Via

Access Paper or Ask Questions