Dan Qiao

Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes

Jun 10, 2024
Via arXiv

OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

May 09, 2024
Via arXiv

Differentially Private Reinforcement Learning with Self-Play

Apr 11, 2024
Via arXiv

Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints

Feb 02, 2024
Via arXiv

OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch

Oct 01, 2023
Via arXiv

GameEval: Evaluating LLMs on Conversational Games

Aug 19, 2023
Via arXiv

Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning

May 18, 2023
Via arXiv

Logarithmic Switching Cost in Reinforcement Learning beyond Linear MDPs

Feb 24, 2023
Via arXiv

Near-Optimal Differentially Private Reinforcement Learning

Dec 09, 2022
Via arXiv

SelfMix: Robust Learning Against Textual Label Noise with Self-Mixup Training

Oct 11, 2022
Via arXiv