Jun Sun

The Fusion of Large Language Models and Formal Methods for Trustworthy AI Agents: A Roadmap

Dec 09, 2024
Via arXiv

CROW: Eliminating Backdoors from Large Language Models via Internal Consistency Regularization

Nov 18, 2024
Via arXiv

LLMScan: Causal Scan for LLM Misbehavior Detection

Oct 23, 2024
Via arXiv

UniAdapt: A Universal Adapter for Knowledge Calibration

Oct 01, 2024
Via arXiv

Adversarial Suffixes May Be Features Too!

Oct 01, 2024
Via arXiv

Do Influence Functions Work on Large Language Models?

Sep 30, 2024
Via arXiv

Are Existing Road Design Guidelines Suitable for Autonomous Vehicles?

Sep 13, 2024
Via arXiv

BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models

Aug 23, 2024
Via arXiv

On-the-fly Synthesis for LTL over Finite Traces: An Efficient Approach that Counts

Aug 14, 2024
Via arXiv

Certified Continual Learning for Neural Network Regression

Jul 09, 2024
Via arXiv