Mingze Wang

The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training

Feb 26, 2025

CCExpert: Advancing MLLM Capability in Remote Sensing Change Captioning with Difference-Aware Integration and a Foundational Dataset

Nov 18, 2024

How Transformers Implement Induction Heads: Approximation and Optimization Analysis

Oct 15, 2024

Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training

Oct 14, 2024

Incorporate LLMs with Influential Recommender System

Sep 07, 2024

Are AI-Generated Text Detectors Robust to Adversarial Perturbations?

Jun 03, 2024

Improving Generalization and Convergence by Enhancing Implicit Regularization

May 31, 2024

RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model

Mar 12, 2024

The Implicit Bias of Gradient Noise: A Symmetry Perspective

Feb 11, 2024

Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling

Feb 06, 2024