Yuxin Wu

Muon is Scalable for LLM Training

Feb 24, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Feb 18, 2025

A Plug-and-Play Bregman ADMM Module for Inferring Event Branches in Temporal Point Processes

Jan 08, 2025

Layer Importance and Hallucination Analysis in Large Language Models via Enhanced Activation Variance-Sparsity

Nov 15, 2024

AVSS: Layer Importance Evaluation in Large Language Models via Activation Variance-Sparsity Analysis

Nov 04, 2024

FlamePINN-1D: Physics-informed neural networks to solve forward and inverse problems of 1D laminar flames

Jun 07, 2024

CARNA: Characterizing Advanced heart failure Risk and hemodyNAmic phenotypes using learned multi-valued decision diagrams

Jun 11, 2023

Dissimilar Nodes Improve Graph Active Learning

Dec 05, 2022

Rethinking "Batch" in BatchNorm

May 17, 2021

PointRend: Image Segmentation as Rendering

Dec 17, 2019