Weijian Li

Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism

Dec 30, 2024
Via arXiv

CRTRE: Causal Rule Generation with Target Trial Emulation Framework

Nov 10, 2024
Via arXiv

Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods

Jun 20, 2024
Via arXiv

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

Apr 04, 2024
Via arXiv

BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model

Apr 04, 2024
Via arXiv

STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction

Dec 28, 2023
Via arXiv

DocTr: Document Transformer for Structured Information Extraction in Documents

Jul 16, 2023
Via arXiv

DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome

Jun 26, 2023
Via arXiv

Feature Programming for Multivariate Time Series Prediction

Jun 09, 2023
Via arXiv

Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding

Mar 22, 2023
Via arXiv