David W. Zhang

RL-finetuning LLMs from on- and off-policy data with a single algorithm

Mar 25, 2025

Soft Policy Optimization: Online Off-Policy RL for Sequence Models

Mar 07, 2025

Ada-HGNN: Adaptive Sampling for Scalable Hypergraph Neural Networks

May 22, 2024

Graph Neural Networks for Learning Equivariant Representations of Neural Networks

Mar 20, 2024

Improved Generalization of Weight Space Networks via Augmentations

Feb 06, 2024

Diffusing More Objects for Semi-Supervised Domain Adaptation with Less Labeling

Dec 19, 2023

Data Augmentations in Deep Weight Spaces

Nov 15, 2023

Unlocking Slot Attention by Changing Optimal Transport Costs

Jan 30, 2023

Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation

Nov 23, 2021

Recurrently Predicting Hypergraphs

Jun 26, 2021