Zhi Zhang

Confidence Interval Construction and Conditional Variance Estimation with Dense ReLU Networks

Dec 29, 2024
Via arXiv

Cross-modal Information Flow in Multimodal Large Language Models

Nov 27, 2024
Via arXiv

Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective

Nov 27, 2024
Via arXiv

Distributed Sign Momentum with Local Steps for Training Transformers

Nov 26, 2024
Via arXiv

Unlocking the Potential of Text-to-Image Diffusion with PAC-Bayesian Theory

Nov 25, 2024
Via arXiv

Dense ReLU Neural Networks for Temporal-spatial Model

Nov 15, 2024
Via arXiv

From References to Insights: Collaborative Knowledge Minigraph Agents for Automating Scholarly Literature Review

Nov 09, 2024
Via arXiv

Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayesian Theory

Nov 01, 2024
Via arXiv

SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training

Oct 20, 2024
Via arXiv

DODT: Enhanced Online Decision Transformer Learning through Dreamer's Actor-Critic Trajectory Forecasting

Oct 15, 2024
Via arXiv