Gelei Deng

Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment

Oct 18, 2024
Via arXiv

GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models

Aug 22, 2024
Via arXiv

Efficient Detection of Toxic Prompts in Large Language Models

Aug 21, 2024
Via arXiv

Image-Based Geolocation Using Large Vision-Language Models

Aug 18, 2024
Via arXiv

Continuous Embedding Attacks via Clipped Inputs in Jailbreaking Large Language Models

Jul 16, 2024
Via arXiv

Source Code Summarization in the Era of Large Language Models

Jul 09, 2024
Via arXiv

Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation

May 20, 2024
Via arXiv

Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection

Apr 19, 2024
Via arXiv

LLM Jailbreak Attack versus Defense Techniques -- A Comprehensive Study

Feb 21, 2024
Via arXiv

Groot: Adversarial Testing for Generative Text-to-Image Models with Tree-based Semantic Transformation

Feb 19, 2024
Via arXiv