Yan Sun

MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models

Jun 15, 2025

Investigating the Effects of Cognitive Biases in Prompts on Large Language Model Outputs

Jun 14, 2025

Foundations of Top-$k$ Decoding For Language Models

May 25, 2025

Time Tracker: Mixture-of-Experts-Enhanced Foundation Time Series Forecasting Model with Decoupled Training Pipelines

May 21, 2025

An Analytical Characterization of Sloppiness in Neural Networks: Insights from Linear Models

May 13, 2025

Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning

Apr 22, 2025

A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models

Feb 22, 2025

TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs

Jan 31, 2025

Trustworthy Evaluation of Generative AI Models

Jan 31, 2025

MolGraph-xLSTM: A graph-based dual-level xLSTM framework with multi-head mixture-of-experts for enhanced molecular representation and interpretability

Jan 30, 2025