Jingzhao Zhang

Scalable Model Merging with Progressive Layer-wise Distillation

Feb 18, 2025
Via arXiv

Task Generalization With AutoRegressive Compositional Structure: Can Learning From $D$ Tasks Generalize to $D^{T}$ Tasks?

Feb 13, 2025
Via arXiv

Second-Order Min-Max Optimization with Lazy Hessians

Oct 12, 2024
Via arXiv

From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency

Oct 07, 2024
Via arXiv

Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems

Sep 10, 2024
Via arXiv

Towards Black-Box Membership Inference Attack for Diffusion Models

May 25, 2024
Via arXiv

Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning

May 04, 2024
Via arXiv

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Feb 15, 2024
Via arXiv

Iteratively Learn Diverse Strategies with State Distance Information

Oct 23, 2023
Via arXiv

A Quadratic Synchronization Rule for Distributed Deep Learning

Oct 22, 2023
Via arXiv