Kaixuan Ji

Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization

Oct 11, 2024
Via arXiv

Self-Play Preference Optimization for Language Model Alignment

May 01, 2024
Via arXiv

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation

Feb 15, 2024
Via arXiv

Reinforcement Learning from Human Feedback with Active Queries

Feb 14, 2024
Via arXiv

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Jan 02, 2024
Via arXiv

Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment

Oct 16, 2023
Via arXiv

BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation

Oct 16, 2023
Via arXiv

Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs

May 15, 2023
Via arXiv

Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers

Jul 14, 2022
Via arXiv

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

Oct 18, 2021
Via arXiv