Stephen Fitz

Do GPT Language Models Suffer From Split Personality Disorder? The Advent Of Substrate-Free Psychometrics

Aug 15, 2024
Via arXiv

Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?

Jul 31, 2024
Via arXiv

Hidden Holes: topological aspects of language models

Jun 09, 2024
Via arXiv

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

Mar 06, 2024
Via arXiv

Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Sentence Embeddings

Sep 17, 2023
Via arXiv

Personality Traits in Large Language Models

Jul 01, 2023
Via arXiv

Parameter-Efficient Neural Question Answering Models via Graph-Enriched Document Representations

Jun 01, 2021
Via arXiv