Jose Such

Towards Safer Chatbots: A Framework for Policy Compliance Evaluation of Custom GPTs

Feb 03, 2025
Via arXiv

CASE-Bench: Context-Aware Safety Evaluation Benchmark for Large Language Models

Jan 24, 2025
Via arXiv

A Holistic Indicator of Polarization to Measure Online Sexism

Apr 02, 2024
Via arXiv

Moral Uncertainty and the Problem of Fanaticism

Dec 18, 2023
Via arXiv

AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics

Aug 28, 2023
Via arXiv

MalProtect: Stateful Defense Against Adversarial Query Attacks in ML-based Malware Detection

Feb 21, 2023
Via arXiv

Effectiveness of Moving Target Defenses for Adversarial Attacks in ML-based Malware Detection

Feb 01, 2023
Via arXiv

StratDef: a strategic defense against adversarial attacks in malware detection

Feb 15, 2022
Via arXiv

Automating the GDPR Compliance Assessment for Cross-border Personal Data Transfers in Android Applications

Mar 12, 2021
Via arXiv