Junze Yin

Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method

Aug 21, 2024

Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers

May 08, 2024

How to Inverting the Leverage Score Distribution?

Apr 21, 2024

Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression

Nov 26, 2023

Revisiting Quantum Algorithms for Linear Regressions: Quadratic Speedups without Data-Dependent Parameters

Nov 24, 2023

The Expressibility of Polynomial based Attention Scheme

Oct 30, 2023

A Unified Scheme of ResNet and Softmax

Sep 23, 2023

A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time

Sep 14, 2023

Solving Attention Kernel Regression Problem via Pre-conditioner

Aug 28, 2023

GradientCoin: A Peer-to-Peer Decentralized Large Language Models

Aug 21, 2023