Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiaoqi Qiu

Provable Model Provenance Set for Large Language Models

Jan 31, 2026

Xiaoqi Qiu, Hao Zeng, Zhiyu Hou, Hongxin Wei

Abstract:The growing prevalence of unauthorized model usage and misattribution has increased the need for reliable model provenance analysis. However, existing methods largely rely on heuristic fingerprint-matching rules that lack provable error control and often overlook the existence of multiple sources, leaving the reliability of their provenance claims unverified. In this work, we first formalize the model provenance problem with provable guarantees, requiring rigorous coverage of all true provenances at a prescribed confidence level. Then, we propose the Model Provenance Set (MPS), which employs a sequential test-and-exclusion procedure to adaptively construct a small set satisfying the guarantee. The key idea of MPS is to test the significance of provenance existence within a candidate pool, thereby establishing a provable asymptotic guarantee at a user-specific confidence level. Extensive experiments demonstrate that MPS effectively achieves target provenance coverage while strictly limiting the inclusion of unrelated models, and further reveal its potential for practical provenance analysis in attribution and auditing tasks.

Via

Access Paper or Ask Questions

A Survey on Natural Language Counterfactual Generation

Jul 04, 2024

Yongjie Wang, Xiaoqi Qiu, Yu Yue, Xu Guo, Zhiwei Zeng, Yuhong Feng, Zhiqi Shen

Figure 1 for A Survey on Natural Language Counterfactual Generation

Figure 2 for A Survey on Natural Language Counterfactual Generation

Figure 3 for A Survey on Natural Language Counterfactual Generation

Figure 4 for A Survey on Natural Language Counterfactual Generation

Abstract:Natural Language Counterfactual generation aims to minimally modify a given text such that the modified text will be classified into a different class. The generated counterfactuals provide insight into the reasoning behind a model's predictions by highlighting which words significantly influence the outcomes. Additionally, they can be used to detect model fairness issues or augment the training data to enhance the model's robustness. A substantial amount of research has been conducted to generate counterfactuals for various NLP tasks, employing different models and methodologies. With the rapid growth of studies in this field, a systematic review is crucial to guide future researchers and developers. To bridge this gap, this survey comprehensively overview textual counterfactual generation methods, particularly including those based on Large Language Models. We propose a new taxonomy that categorizes the generation methods into four groups and systematically summarize the metrics for evaluating the generation quality. Finally, we discuss ongoing research challenges and outline promising directions for future work.

* A survey paper

Via

Access Paper or Ask Questions

PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning

Jun 09, 2024

Xiaoqi Qiu, Yongjie Wang, Xu Guo, Zhiwei Zeng, Yue Yu, Yuhong Feng, Chunyan Miao

Figure 1 for PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning

Figure 2 for PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning

Figure 3 for PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning

Figure 4 for PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning

Abstract:Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes. Training with CAD enhances model robustness against spurious features that happen to correlate with labels by spreading the casual relationships across different classes. Yet, recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information, inadvertently introducing biases that may impair performance on out-ofdistribution (OOD) datasets. To mitigate this issue, we employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues. We theoretically prove that contrastive loss can encourage models to leverage a broader range of features beyond those modified ones. Comprehensive experiments on two human-edited CAD datasets demonstrate that our proposed method outperforms the state-of-the-art on OOD datasets.

* Accepted by ACL 2024 main conference

Via

Access Paper or Ask Questions