Tong Yang

Michael Pokorny

TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation

Mar 06, 2025
Via arXiv

ReFocus: Reinforcing Mid-Frequency and Key-Frequency Modeling for Multivariate Time Series Forecasting

Feb 24, 2025
Via arXiv

FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference

Feb 19, 2025
Via arXiv

LLM-Sketch: Enhancing Network Sketches with LLM

Feb 11, 2025
Via arXiv

Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning

Feb 06, 2025
Via arXiv

Predicting 3D representations for Dynamic Scenes

Jan 28, 2025
Via arXiv

CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter

Jan 25, 2025
Via arXiv

Humanity's Last Exam

Jan 24, 2025
Via arXiv

Inference-to-complete: A High-performance and Programmable Data-plane Co-processor for Neural-network-driven Traffic Analysis

Nov 01, 2024
Via arXiv

Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment

Oct 28, 2024
Via arXiv