Byung-Hak Kim

Agent-as-Judge for Factual Summarization of Long Narratives

Jan 17, 2025

Yeonseok Jeong, Minsoo Kim, Seung-won Hwang, Byung-Hak Kim

Abstract: Large Language Models (LLMs) have demonstrated near-human performance in summarization tasks based on traditional metrics such as ROUGE and BERTScore. However, these metrics do not adequately capture critical aspects of summarization quality, such as factual accuracy, particularly for long narratives (>100K tokens). Recent advances, such as LLM-as-a-Judge, address the limitations of metrics based on lexical similarity but still exhibit factual inconsistencies, especially in understanding character relationships and states. In this work, we introduce NarrativeFactScore, a novel "Agent-as-a-Judge" framework for evaluating and refining summaries. By leveraging a Character Knowledge Graph (CKG) extracted from input and generated summaries, NarrativeFactScore assesses factual consistency and provides actionable guidance for refinement, such as identifying missing or erroneous facts. We demonstrate the effectiveness of NarrativeFactScore through a detailed workflow illustration and extensive validation on widely adopted benchmarks, achieving superior performance compared to competitive methods. Our results highlight the potential of agent-driven evaluation systems to improve the factual reliability of LLM-generated summaries.

Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging

Apr 16, 2023

Jielin Qiu, Peide Huang, Makiya Nakashima, Jaehyun Lee, Jiacheng Zhu, Wilson Tang, Pohao Chen, Christopher Nguyen, Byung-Hak Kim, Debbie Kwon (+3 more)

Abstract: Self-supervised learning is crucial for clinical imaging applications, given the lack of explicit labels in healthcare. However, conventional approaches that rely on precise vision-language alignment are not always feasible in complex clinical imaging modalities, such as cardiac magnetic resonance (CMR). CMR provides a comprehensive visualization of cardiac anatomy, physiology, and microstructure, making it challenging to interpret. Additionally, CMR reports require synthesizing information from sequences of images and different views, resulting in potentially weak alignment between the study and diagnosis report pair. To overcome these challenges, we propose CMRformer, a multimodal learning framework to jointly learn sequences of CMR images and associated cardiologist's reports. Moreover, one of the major obstacles to improving CMR study is the lack of large, publicly available datasets. To bridge this gap, we collected a large CMR dataset, which consists of 13,787 studies from clinical cases. By utilizing our proposed CMRformer and our collected dataset, we achieved remarkable performance in real-world clinical tasks, such as CMR image retrieval and diagnosis report retrieval. Furthermore, the learned representations are evaluated to be practically helpful for downstream applications, such as disease classification. Our work could potentially expedite progress in the CMR study and lead to more accurate and effective diagnosis and treatment.

* 24 pages

RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the Wild

Nov 02, 2022

Weiyao Wang, Byung-Hak Kim, Varun Ganapathi

Abstract: Recent advances in self-supervised learning (SSL) using large models to learn visual representations from natural images are rapidly closing the gap between the results produced by fully supervised learning and those produced by SSL on downstream vision tasks. Inspired by this advancement and primarily motivated by the emergence of tabular and structured document image applications, we investigate which self-supervised pretraining objectives, architectures, and fine-tuning strategies are most effective. To address these questions, we introduce RegCLR, a new self-supervised framework that combines contrastive and regularized methods and is compatible with the standard Vision Transformer architecture. Then, RegCLR is instantiated by integrating masked autoencoders as a representative example of a contrastive method and enhanced Barlow Twins as a representative example of a regularized method with configurable input image augmentations in both branches. Several real-world table recognition scenarios (e.g., extracting tables from document images), ranging from standard Word and LaTeX documents to even more challenging electronic health records (EHR) computer screen images, have been shown to benefit greatly from the representations learned from this new framework, with detection average-precision (AP) improving relatively by 4.8% for Table, 11.8% for Column, and 11.1% for GUI objects over a previous fully supervised baseline on real-world EHR screen images.

* To be presented at the 36th Conference on Neural Information Processing Systems, New Orleans, USA, on December 2, 2022, at the First Table Representation Learning (TRL) Workshop

Medical Codes Prediction from Clinical Notes: From Human Coders to Machines

Oct 30, 2022

Byung-Hak Kim

Abstract: Prediction of medical codes from clinical notes is a practical and essential need for every healthcare delivery organization within current medical systems. Automating annotation will save significant time and excessive effort that human coders spend today. However, the biggest challenge is directly identifying appropriate medical codes from several thousands of high-dimensional codes from unstructured free-text clinical notes. This complex medical codes prediction problem from clinical notes has received substantial interest in the NLP community, and several recent studies have shown state-of-the-art code prediction results with full-fledged deep learning-based methods. This progress raises the fundamental question of how far automated machine learning systems are from human coders' working performance, as well as the important question of how well current explainability methods apply to advanced neural network models such as transformers. The goal is to predict correct codes and to present references in clinical notes that support each code prediction, since this level of explainability and accuracy of the prediction outcomes is critical to gaining trust from professional medical coders.

* The 11th Bay Area Machine Learning Symposium (BayLearn 2022), San Francisco, CA, October 20, 2022. arXiv admin note: substantial text overlap with arXiv:2210.15882 and arXiv:2107.10650

Can Current Explainability Help Provide References in Clinical Notes to Support Humans Annotate Medical Codes?

Oct 28, 2022

Byung-Hak Kim, Zhongfen Deng, Philip S. Yu, Varun Ganapathi

Abstract: The medical codes prediction problem from clinical notes has received substantial interest in the NLP community, and several recent studies have shown state-of-the-art (SOTA) code prediction results with full-fledged deep learning-based methods. However, most previous SOTA works based on deep learning are still in early stages in terms of providing textual references and explanations of the predicted codes, despite the fact that this level of explainability of the prediction outcomes is critical to gaining trust from professional medical coders. This raises the important question of how well current explainability methods apply to advanced neural network models such as transformers to predict correct codes and present references in clinical notes that support code prediction. First, we present an explainable Read, Attend, and Code (xRAC) framework and assess two approaches, attention score-based xRAC-ATTN and model-agnostic knowledge-distillation-based xRAC-KD, through simplified but thorough human-grounded evaluations with the SOTA transformer-based model, RAC. We find that the supporting evidence text highlighted by xRAC-ATTN is of higher quality than that of xRAC-KD, whereas xRAC-KD has potential advantages in production deployment scenarios. More importantly, we show for the first time that, given the current state of explainability methodologies, using the SOTA medical codes prediction system still requires the expertise and competencies of professional coders, even though its prediction accuracy is superior to that of human coders. This, we believe, is a very meaningful step toward developing explainable and accurate machine learning systems for fully autonomous medical code prediction from clinical notes.

* To appear in Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis (Louhi 2022), Virtual, December 7, 2022

Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction from Clinical Notes by Machines

Jul 10, 2021

Byung-Hak Kim, Varun Ganapathi

Abstract: Prediction of medical codes from clinical notes is both a practical and essential need for every healthcare delivery organization within current medical systems. Automating annotation will save significant time and excessive effort spent by human coders today. However, the biggest challenge is directly identifying appropriate medical codes out of several thousands of high-dimensional codes from unstructured free-text clinical notes. In the past three years, with Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, there have been vast improvements in tackling the most challenging benchmark of the MIMIC-III-full-label inpatient clinical notes dataset. This progress raises the fundamental question of how far automated machine learning (ML) systems are from human coders' working performance. We assessed the baseline of human coders' performance on the same subsampled testing set. We also present our Read, Attend, and Code (RAC) model for learning the medical code assignment mappings. By connecting convolved embeddings with self-attention and code-title guided attention modules, combined with sentence permutation-based data augmentations and stochastic weight averaging training, RAC establishes a new state of the art (SOTA), considerably outperforming the current best Macro-F1 by 18.7%, and reaches past the human-level coding baseline. This new milestone marks a meaningful step toward fully autonomous medical coding (AMC) in machines reaching parity with human coders' performance in medical code prediction.

* To appear in Proceedings of Machine Learning Research, Volume 149: Machine Learning for Healthcare Conference (MLHC), Virtual, August 6-7, 2021
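
The Macro-F1 metric quoted in the abstract above weights every code equally, so rare codes count as much as frequent ones. The sketch below illustrates the difference from Micro-F1 on toy multi-label data. It assumes one common definition of Macro-F1 (the mean of per-label F1 scores); the benchmark used in the paper may compute it differently, and the function names, labels, and data here are hypothetical.

```python
from typing import List, Sequence, Set


def per_label_f1(gold: List[Set[str]], pred: List[Set[str]], label: str) -> float:
    """F1 for a single label across all documents (0.0 when there are no true positives)."""
    tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
    fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
    fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: Sequence[str]) -> float:
    """Unweighted mean of per-label F1 (assumed definition, see lead-in)."""
    return sum(per_label_f1(gold, pred, c) for c in labels) / len(labels)


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """F1 over pooled (document, label) decisions; frequent labels dominate."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Toy data: three notes with hypothetical codes "A" (frequent), "B" and "C" (rare).
gold_codes = [{"A", "B"}, {"A"}, {"C"}]
pred_codes = [{"A"}, {"A", "B"}, {"A"}]

print(macro_f1(gold_codes, pred_codes, ["A", "B", "C"]))
print(micro_f1(gold_codes, pred_codes))
```

On this toy data the model only gets the frequent code right, so Macro-F1 comes out well below Micro-F1, which is why Macro-F1 is a demanding metric on label sets with many rare codes.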

Deep Claim: Payer Response Prediction from Claims Data with Deep Learning

Jul 13, 2020

Byung-Hak Kim, Seshadri Sridharan, Andy Atwal, Varun Ganapathi

Abstract: Each year, almost 10% of claims are denied by payers (i.e., health insurance plans). With the cost to recover these denials and underpayments, predicting payer response (likelihood of payment) from claims data with a high degree of accuracy and precision is anticipated to improve healthcare staff's performance and productivity and drive better patient financial experience and satisfaction in the revenue cycle (Barkholz, 2017). However, constructing advanced predictive analytics models has been considered challenging in the last twenty years. That said, we propose a (low-level) context-dependent compact representation of patients' historical claim records by effectively learning complicated dependencies in the (high-level) claim inputs. Built on this new latent representation, we demonstrate that a deep learning-based framework, Deep Claim, can accurately predict various responses from multiple payers using 2,905,026 de-identified claims data from two US health systems. Deep Claim's improvements over carefully chosen baselines in predicting claim denials are most pronounced on Health System A, with a 22.21% relative recall gain (at 95% precision), which implies that Deep Claim can find 22.21% more denials than the best baseline system.

* To be presented at the Healthcare Systems, Population Health, and the Role of Health-Tech (HSYS) Workshop at the 37th International Conference on Machine Learning, Vienna, Austria, July 13-18, 2020
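
The "relative recall gain at 95% precision" reported in the abstract above can be read as follows: for each system, pick the score threshold that maximizes recall while keeping precision at or above 95%, then compare the two recalls as a ratio. The sketch below shows that arithmetic on toy data. The function names, scores, and labels are hypothetical and are not taken from the paper, and the thresholding procedure is an assumption about how such a figure is typically computed.

```python
from typing import List


def recall_at_precision(scores: List[float], labels: List[int], min_precision: float) -> float:
    """Best recall over all score thresholds whose precision is >= min_precision.

    labels use 1 for a denied claim and 0 for a paid claim (assumed encoding).
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best = 0.0
    tp = 0
    for k, (score, label) in enumerate(ranked, start=1):
        tp += label
        # Tied scores cannot be separated by a threshold, so only evaluate
        # after the last item of each tie group.
        if k < len(ranked) and ranked[k][0] == score:
            continue
        if tp / k >= min_precision:
            best = max(best, tp / total_pos)
    return best


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, e.g. 0.2221 means 22.21%."""
    return (new - baseline) / baseline


# Toy data only: 4 denied claims and 4 paid claims.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.1, 0.05]
baseline_scores = [0.9, 0.8, 0.4, 0.3, 0.85, 0.2, 0.1, 0.05]

r_model = recall_at_precision(model_scores, labels, 0.95)
r_base = recall_at_precision(baseline_scores, labels, 0.95)
print(r_model, r_base, relative_gain(r_model, r_base))
```

Because both recalls are measured over the same set of true denials, a relative recall gain translates directly into proportionally more denials found at the same precision, which is the interpretation the abstract gives.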

LumièreNet: Lecture Video Synthesis from Audio

Jul 04, 2019

Byung-Hak Kim, Varun Ganapathi

Abstract: We present LumièreNet, a simple, modular, and completely deep-learning based architecture that synthesizes high-quality, full-pose headshot lecture videos from an instructor's new audio narration of any length. Unlike prior works, LumièreNet is entirely composed of trainable neural network modules to learn mapping functions from the audio to video through (intermediate) estimated pose-based compact and abstract latent codes. Our video demos are available at [22] and [23].

Deep Learning to Predict Student Outcomes

Apr 27, 2019

Byung-Hak Kim

Abstract: The increasingly fast development cycle for online course contents, along with the diverse student demographics in each online classroom, makes real-time student outcomes prediction an interesting topic for both industrial research and practical needs. In this paper, we tackle the problem of real-time student performance prediction in an on-going course using a domain adaptation framework. This framework is a system trained on labeled student outcome data from previous coursework but is meant to be deployed on another course. In particular, we introduce a GritNet architecture, and develop an unsupervised domain adaptation method to transfer a GritNet trained on a past course to a new course without any student outcome label. Our results for real Udacity student graduation predictions show that the GritNet not only generalizes well from one course to another across different Nanodegree programs, but also enhances real-time predictions explicitly in the first few weeks when accurate predictions are most challenging.

* Accepted as oral presentation to ICLR 2019, AI for Social Good Workshop. arXiv admin note: substantial text overlap with arXiv:1809.06686, arXiv:1804.07405

GritNet 2: Real-Time Student Performance Prediction with Domain Adaptation

Sep 07, 2018

Byung-Hak Kim, Ethan Vizitei, Varun Ganapathi

Abstract: The increasingly fast development and update cycle of online course contents and the diverse demographics of students in each online classroom make real-time student performance prediction (before the course finishes) an interesting topic for both industrial research and practical needs. We tackle the problem of real-time student performance prediction for on-going courses within a domain adaptation framework, that is, a system trained on students' labeled outcomes from one previous course but meant to be deployed on another. In particular, we first review the recently developed GritNet architecture, which is the current state of the art for the student performance prediction problem, and introduce a new unsupervised domain adaptation method to transfer a GritNet trained on a past course to a new course without any (students' outcome) label. Our results for real Udacity students' graduation predictions show that the GritNet not only generalizes well from one course to another across different Nanodegree programs, but also enhances real-time predictions explicitly in the first few weeks when accurate predictions are most challenging.

* Section 2 is in part a reprint of the material in arXiv:1804.07405
