Shihan Dou

Multi-Programming Language Sandbox for LLMs

Oct 30, 2024

RMB: Comprehensively Benchmarking Reward Models in LLM Alignment

Oct 13, 2024

TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities

Jul 31, 2024

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

Jul 08, 2024

SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance

Jun 26, 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

Jun 17, 2024

MetaRM: Shifted Distributions Alignment via Meta-Learning

May 01, 2024

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

Mar 18, 2024

Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution

Feb 27, 2024

CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models

Feb 26, 2024