Abel Salinas

Risk and Response in Large Language Models: Evaluating Key Threat Categories

Mar 22, 2024
Via arXiv

The Butterfly Effect of Altering Prompts: How Small Changes and Jailbreaks Affect Large Language Model Performance

Jan 09, 2024
Via arXiv

"Im not Racist but": Discovering Bias in the Internal Knowledge of Large Language Models

Oct 13, 2023
Via arXiv

The Unequal Opportunities of Large Language Models: Revealing Demographic Bias through Job Recommendations

Aug 03, 2023
Via arXiv

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning

Jul 24, 2023
Via arXiv