Jongsoo Park

Tony

OpenAI GPT-5 System Card

Dec 19, 2025

Meta Lattice: Model Space Redesign for Cost-Effective Industry-Scale Ads Recommendations

Dec 15, 2025

Context Parallelism for Scalable Million-Token Inference

Nov 04, 2024

The Llama 3 Herd of Models

Jul 31, 2024

Wukong: Towards a Scaling Law for Large-Scale Recommendation

Mar 08, 2024

Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large-Scale Recommendation

Mar 07, 2024

MTrainS: Improving DLRM training efficiency using heterogeneous memories

Apr 19, 2023

Shared Microexponents: A Little Shifting Goes a Long Way

Feb 16, 2023

RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure

Nov 14, 2022

DHEN: A Deep and Hierarchical Ensemble Network for Large-Scale Click-Through Rate Prediction

Mar 11, 2022