Sumanth Balaji

Beyond IVR: Benchmarking Customer Support LLM Agents for Business-Adherence

Jan 02, 2026

Sumanth Balaji, Piyush Mishra, Aashraya Sachdeva, Suraj Agrawal

Abstract: Traditional customer support systems, such as Interactive Voice Response (IVR), rely on rigid scripts and lack the flexibility required for handling complex, policy-driven tasks. While large language model (LLM) agents offer a promising alternative, evaluating their ability to act in accordance with business rules and real-world support workflows remains an open challenge. Existing benchmarks primarily focus on tool usage or task completion, overlooking an agent's capacity to adhere to multi-step policies, navigate task dependencies, and remain robust to unpredictable user or environment behavior. In this work, we introduce JourneyBench, a benchmark designed to assess policy-aware agents in customer support. JourneyBench leverages graph representations to generate diverse, realistic support scenarios and proposes the User Journey Coverage Score, a novel metric to measure policy adherence. We evaluate multiple state-of-the-art LLMs using two agent designs: a Static-Prompt Agent (SPA) and a Dynamic-Prompt Agent (DPA) that explicitly models policy control. Across 703 conversations in three domains, we show that DPA significantly boosts policy adherence, even allowing smaller models like GPT-4o-mini to outperform more capable ones like GPT-4o. Our findings demonstrate the importance of structured orchestration and establish JourneyBench as a critical resource to advance AI-driven customer support beyond IVR-era limitations.

* 17 pages, 3 figures, preprint

Investigating Strategies for Clause Recommendation

Jan 21, 2023

Sagar Joshi, Sumanth Balaji, Jerrin Thomas, Aparna Garimella, Vasudeva Varma

Abstract: Clause recommendation is the problem of recommending a clause to a legal contract, given the context of the contract in question and the clause type to which the clause should belong. With little prior work on the generation of legal contracts, this problem was proposed as a first step toward the bigger problem of contract generation. As an open-ended text generation problem, its distinguishing characteristics lie in the nature of legal language as a sublanguage and the considerable similarity of textual content within the clauses of a specific type. This similarity in legal clauses drives us to investigate the importance of similar contracts' representation for recommending clauses. In our work, we experiment with generating clauses for 15 commonly occurring clause types in contracts, expanding upon previous work on this problem and analyzing clause recommendations in varying settings using information derived from similar contracts.

* Volume 362: Legal Knowledge and Information Systems (2022), Frontiers in Artificial Intelligence and Applications
* Published in Legal Knowledge and Information Systems (JURIX) 2022. (10 pages, 4 figures)

Graph-based Keyword Planning for Legal Clause Generation from Topics

Jan 07, 2023

Sagar Joshi, Sumanth Balaji, Aparna Garimella, Vasudeva Varma

Abstract: Generating domain-specific content such as legal clauses based on minimal user-provided information can be of significant benefit in automating legal contract generation. In this paper, we propose a controllable graph-based mechanism that can generate legal clauses using only the topic or type of the legal clauses. Our pipeline consists of two stages: a graph-based planner followed by a clause generator. The planner outlines the content of a legal clause as a sequence of keywords, ordered from generic to more specific clause information, based on the input topic using a controllable graph-based mechanism. The generation stage takes in a given plan and generates a clause. We illustrate the effectiveness of our proposed two-stage approach on a broad set of clause topics in contracts.

* To be published in the Natural Legal Language Processing Workshop, EMNLP 2022 (11 pages, 7 figures)
