Sohyun Park

KFinEval-Pilot: A Comprehensive Benchmark Suite for Korean Financial Language Understanding

Apr 17, 2025

Bokwang Hwang, Seonkyu Lim, Taewoong Kim, Yongjae Geun, Sunghyun Bang, Sohyun Park, Jihyun Park, Myeonggyu Lee, Jinwoo Lee, Yerin Kim (+17 more)

Abstract:We introduce KFinEval-Pilot, a benchmark suite specifically designed to evaluate large language models (LLMs) in the Korean financial domain. Addressing the limitations of existing English-centric benchmarks, KFinEval-Pilot comprises over 1,000 curated questions across three critical areas: financial knowledge, legal reasoning, and financial toxicity. The benchmark is constructed through a semi-automated pipeline that combines GPT-4-generated prompts with expert validation to ensure domain relevance and factual accuracy. We evaluate a range of representative LLMs and observe notable performance differences across models, with trade-offs between task accuracy and output safety across different model families. These results highlight persistent challenges in applying LLMs to high-stakes financial applications, particularly in reasoning and safety. Grounded in real-world financial use cases and aligned with the Korean regulatory and linguistic context, KFinEval-Pilot serves as an early diagnostic tool for developing safer and more reliable financial AI systems.

Label-aware Hard Negative Sampling Strategies with Momentum Contrastive Learning for Implicit Hate Speech Detection

Jun 12, 2024

Jaehoon Kim, Seungwan Jin, Sohyun Park, Someen Park, Kyungsik Han

Abstract:Detecting implicit hate speech, which is not overtly hateful, remains a challenge. Recent research has attempted to detect implicit hate speech by applying contrastive learning to pre-trained language models such as BERT and RoBERTa, but the proposed models still show no significant advantage over learning with cross-entropy loss. We found that contrastive learning based on randomly sampled batch data does not encourage the model to learn from hard negative samples. In this work, we propose Label-aware Hard Negative sampling strategies (LAHN), which use momentum-integrated contrastive learning to encourage the model to learn detailed features from hard negative samples instead of naive negative samples in a random batch. LAHN outperforms existing models for implicit hate speech detection in both in-dataset and cross-dataset evaluations. The code is available at https://github.com/Hanyang-HCC-Lab/LAHN

* Accepted to ACL 2024 Findings
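
The abstract above names three ingredients: label-aware hard negatives, a momentum-updated encoder, and a contrastive loss. The following is a minimal illustrative sketch of those ideas in plain Python, not the LAHN implementation (the linked repository is the authoritative source). All function names, the queue layout, and the default values for `m` and `temperature` are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hard_negatives(anchor, anchor_label, queue, k):
    """Indices of the k queue entries most similar to the anchor
    among entries whose label differs from the anchor's label.

    `queue` is a hypothetical list of (embedding, label) pairs.
    """
    candidates = [
        (cosine(anchor, emb), i)
        for i, (emb, label) in enumerate(queue)
        if label != anchor_label
    ]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in candidates[:k]]


def momentum_update(key_params, query_params, m=0.999):
    """Exponential moving average of encoder parameters,
    as used for momentum (key) encoders in contrastive learning."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor, one positive, and a list of negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy usage: pick the two hardest negatives for an anchor with label 1.
queue = [([1.0, 0.0], 1), ([0.9, 0.1], 0), ([0.0, 1.0], 0), ([-1.0, 0.0], 0)]
picked = hard_negatives([1.0, 0.0], 1, queue, k=2)
loss = info_nce([1.0, 0.0], [1.0, 0.0], [queue[i][0] for i in picked])
```

In this toy setup, a negative that lies close to the anchor yields a larger loss than a distant one, which is the intuition behind preferring hard negatives over randomly sampled ones.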

HearHere: Mitigating Echo Chambers in News Consumption through an AI-based Web System

Feb 29, 2024

Youngseung Jeon, Jaehoon Kim, Sohyun Park, Yunyong Ko, Seongeun Ryu, Sang-Wook Kim, Kyungsik Han

Abstract:Considerable efforts are currently underway to mitigate the negative impacts of echo chambers, such as increased susceptibility to fake news and resistance towards accepting scientific evidence. Prior research has presented the development of computer systems that support the consumption of news information from diverse political perspectives to mitigate the echo chamber effect. However, existing studies still lack the ability to effectively support the key processes of news information consumption and quantitatively identify a political stance towards the information. In this paper, we present HearHere, an AI-based web system designed to help users accommodate information and opinions from diverse perspectives. HearHere facilitates the key processes of news information consumption through two visualizations. Visualization 1 provides political news with quantitative political stance information, derived from our graph-based political classification model, and users can experience diverse perspectives (Hear). Visualization 2 allows users to express their opinions on specific political issues in a comment form and observe the position of their own opinions relative to pro-liberal and pro-conservative comments presented on a map interface (Here). Through a user study with 94 participants, we demonstrate the feasibility of HearHere in supporting the consumption of information from various perspectives. Our findings highlight the importance of providing political stance information and quantifying users' political status as a means to mitigate political polarization. In addition, we propose design implications for system development, including the consideration of demographics such as political interest and providing users with initiatives.

* 34 pages, 6 figures, 6 tables, CSCW 2024

The Application of Affective Measures in Text-based Emotion Aware Recommender Systems

May 04, 2023

John Kalung Leung, Igor Griva, William G. Kennedy, Jason M. Kinser, Sohyun Park, Seo Young Lee

Abstract:This paper presents an innovative approach to two problems researchers face in Emotion Aware Recommender Systems (EARS): the difficulty of collecting large volumes of good-quality emotion-tagged data, and the lack of an effective way to protect users' emotional data privacy. Without enough good-quality emotion-tagged datasets, researchers cannot conduct repeatable affective computing research in EARS that generates personalized recommendations based on users' emotional preferences. Similarly, if users' emotional data privacy is not fully protected, users could resist engaging with EARS services. This paper introduces a method that detects affective features in subjective passages using Generative Pre-trained Transformer technology, forming the basis of the Affective Index and Affective Index Indicator (AII). This eliminates the need for users to build an affective feature detection mechanism. The paper advocates a separation-of-responsibility approach in which users protect their emotional profile data while EARS service providers refrain from retaining or storing it. Service providers can update users' Affective Indices in memory without saving their private data, providing Affective Aware recommendations without compromising user privacy. This paper offers a solution to the subjectivity and variability of emotions, data privacy concerns, and evaluation metrics and benchmarks, paving the way for future EARS research.

KHAN: Knowledge-Aware Hierarchical Attention Networks for Accurate Political Stance Prediction

Mar 01, 2023

Yunyong Ko, Seongeun Ryu, Soeun Han, Yeongseung Jeon, Jaehoon Kim, Sohyun Park, Kyungsik Han, Hanghang Tong, Sang-Wook Kim

Abstract:Political stance prediction for news articles has been widely studied as a way to mitigate the echo chamber effect, in which people retreat into their own views and reinforce their pre-existing beliefs. Previous work on political stance prediction focuses on (1) identifying political factors that could reflect the political stance of a news article and (2) capturing those factors effectively. Despite their empirical success, these approaches do not sufficiently justify how effective the identified factors are for political stance prediction. Motivated by this, we conduct a user study to investigate important factors in political stance prediction, and observe that the context and tone of a news article (implicit) and external knowledge for real-world entities appearing in the article (explicit) are important in determining its political stance. Based on this observation, we propose a novel knowledge-aware approach to political stance prediction (KHAN), employing (1) hierarchical attention networks (HAN) to learn the relationships among words and sentences at three different levels and (2) knowledge encoding (KE) to incorporate external knowledge for real-world entities into the process of political stance prediction. Also, to take into account the subtle but important differences between opposing political stances, we build two independent political knowledge graphs (KG), i.e., KG-lib and KG-con, and learn to fuse the different political knowledge. Through extensive evaluations on three real-world datasets, we demonstrate the superiority of KHAN in terms of (1) accuracy, (2) efficiency, and (3) effectiveness.

* 12 pages, 5 figures, 10 tables, the Web Conference 2023 (WWW)
