
Jianyong Wang

Universal YOCO for Efficient Depth Scaling

Apr 01, 2026

Geometric Autoencoder for Diffusion Models

Mar 12, 2026

Rectified Sparse Attention

Jun 05, 2025

Pushing the Limits of Low-Bit Optimizers: A Focus on EMA Dynamics

May 01, 2025

COMM: Concentrated Margin Maximization for Robust Document-Level Relation Extraction

Mar 18, 2025

FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling

Feb 20, 2025

Multimodal Latent Language Modeling with Next-Token Diffusion

Dec 11, 2024

Are LLM-based Recommenders Already the Best? Simple Scaled Cross-entropy Unleashes the Potential of Traditional Sequential Recommenders

Aug 26, 2024

FocusLLM: Scaling LLM's Context by Parallel Decoding

Aug 21, 2024

You Only Cache Once: Decoder-Decoder Architectures for Language Models

May 08, 2024