Ximing Lu

AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text

Oct 05, 2024

HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions

Sep 26, 2024

StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

Aug 28, 2024

Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness

Jul 02, 2024

How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models

Jun 29, 2024

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

Jun 26, 2024

Information-Theoretic Distillation for Reference-less Summarization

Mar 20, 2024

JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models

Feb 13, 2024

A Roadmap to Pluralistic Alignment

Feb 07, 2024

Localized Symbolic Knowledge Distillation for Visual Commonsense Models

Dec 12, 2023