Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Changbong Kim

SLM as Guardian: Pioneering AI Safety with Small Language Models

May 30, 2024

Ohjoon Kwon, Donghyeon Jeon, Nayoung Choi, Gyu-Hwung Cho, Changbong Kim, Hyunwoo Lee, Inho Kang, Sun Kim, Taiwoo Park

Figure 1 for SLM as Guardian: Pioneering AI Safety with Small Language Models

Figure 2 for SLM as Guardian: Pioneering AI Safety with Small Language Models

Figure 3 for SLM as Guardian: Pioneering AI Safety with Small Language Models

Figure 4 for SLM as Guardian: Pioneering AI Safety with Small Language Models

Abstract:Most prior safety research of large language models (LLMs) has focused on enhancing the alignment of LLMs to better suit the safety requirements of humans. However, internalizing such safeguard features into larger models brought challenges of higher training cost and unintended degradation of helpfulness. To overcome such challenges, a modular approach employing a smaller LLM to detect harmful user queries is regarded as a convenient solution in designing LLM-based system with safety requirements. In this paper, we leverage a smaller LLM for both harmful query detection and safeguard response generation. We introduce our safety requirements and the taxonomy of harmfulness categories, and then propose a multi-task learning mechanism fusing the two tasks into a single model. We demonstrate the effectiveness of our approach, providing on par or surpassing harmful query detection and safeguard response performance compared to the publicly available LLMs.

Via

Access Paper or Ask Questions

Taxonomy and Analysis of Sensitive User Queries in Generative AI Search

Apr 05, 2024

Hwiyeol Jo, Taiwoo Park, Nayoung Choi, Changbong Kim, Ohjoon Kwon, Donghyeon Jeon, Hyunwoo Lee, Eui-Hyeon Lee, Kyoungho Shin, Sun Suk Lim(+3 more)

Figure 1 for Taxonomy and Analysis of Sensitive User Queries in Generative AI Search

Figure 2 for Taxonomy and Analysis of Sensitive User Queries in Generative AI Search

Figure 3 for Taxonomy and Analysis of Sensitive User Queries in Generative AI Search

Figure 4 for Taxonomy and Analysis of Sensitive User Queries in Generative AI Search

Abstract:Although there has been a growing interest among industries to integrate generative LLMs into their services, limited experiences and scarcity of resources acts as a barrier in launching and servicing large-scale LLM-based conversational services. In this paper, we share our experiences in developing and operating generative AI models within a national-scale search engine, with a specific focus on the sensitiveness of user queries. We propose a taxonomy for sensitive search queries, outline our approaches, and present a comprehensive analysis report on sensitive queries from actual users.

Via

Access Paper or Ask Questions