Huaqing Zhang

Task Generalization With AutoRegressive Compositional Structure: Can Learning From $D$ Tasks Generalize to $D^{T}$ Tasks?

Feb 13, 2025

From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency

Oct 07, 2024

Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems

Sep 10, 2024

Compiler-Level Matrix Multiplication Optimization for Deep Learning

Sep 23, 2019

Gradient-Coherent Strong Regularization for Deep Neural Networks

Nov 20, 2018