Shi Dong

Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification

Nov 04, 2024
Via arXiv

RLHF and IIA: Perverse Incentives

Dec 02, 2023
Via arXiv

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Jun 06, 2023
Via arXiv

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

May 19, 2023
Via arXiv

Inclusive Artificial Intelligence

Dec 24, 2022
Via arXiv

Posterior Sampling for Continuing Environments

Nov 29, 2022
Via arXiv

A unified interpretable intelligent learning diagnosis framework for smart education

Jul 07, 2022
Via arXiv

Simple Agent, Complex Environment: Efficient Reinforcement Learning with Agent State

Mar 08, 2021
Via arXiv

Provably Efficient Reinforcement Learning with Aggregated States

Dec 13, 2019
Via arXiv

Comments on the Du-Kakade-Wang-Yang Lower Bounds

Nov 18, 2019
Via arXiv