Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Amir Gershman

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

Aaron Adcock, Aayushi Srivastava, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pande, Abhinav Pandey, Abhinav Sharma, Abhishek Kadian, Abhishek Kumawat, Adam Kelsey(+1295 more)

Abstract:This document consolidates publicly reported technical details about Metas Llama 4 model family. It summarizes (i) released variants (Scout and Maverick) and the broader herd context including the previewed Behemoth teacher model, (ii) architectural characteristics beyond a high-level MoE description covering routed/shared-expert structure, early-fusion multimodality, and long-context design elements reported for Scout (iRoPE and length generalization strategies), (iii) training disclosures spanning pre-training, mid-training for long-context extension, and post-training methodology (lightweight SFT, online RL, and lightweight DPO) as described in release materials, (iv) developer-reported benchmark results for both base and instruction-tuned checkpoints, and (v) practical deployment constraints observed across major serving environments, including provider-specific context limits and quantization packaging. The manuscript also summarizes licensing obligations relevant to redistribution and derivative naming, and reviews publicly described safeguards and evaluation practices. The goal is to provide a compact technical reference for researchers and practitioners who need precise, source-backed facts about Llama 4.

* 15 pages

Via

Access Paper or Ask Questions

Asynchronous Forward Bounding for Distributed COPs

Jan 15, 2014

Amir Gershman, Amnon Meisels, Roie Zivan

Figure 1 for Asynchronous Forward Bounding for Distributed COPs

Figure 2 for Asynchronous Forward Bounding for Distributed COPs

Figure 3 for Asynchronous Forward Bounding for Distributed COPs

Figure 4 for Asynchronous Forward Bounding for Distributed COPs

Abstract:A new search algorithm for solving distributed constraint optimization problems (DisCOPs) is presented. Agents assign variables sequentially and compute bounds on partial assignments asynchronously. The asynchronous bounds computation is based on the propagation of partial assignments. The asynchronous forward-bounding algorithm (AFB) is a distributed optimization search algorithm that keeps one consistent partial assignment at all times. The algorithm is described in detail and its correctness proven. Experimental evaluation shows that AFB outperforms synchronous branch and bound by many orders of magnitude, and produces a phase transition as the tightness of the problem increases. This is an analogous effect to the phase transition that has been observed when local consistency maintenance is applied to MaxCSPs. The AFB algorithm is further enhanced by the addition of a backjumping mechanism, resulting in the AFB-BJ algorithm. Distributed backjumping is based on accumulated information on bounds of all values and on processing concurrently a queue of candidate goals for the next move back. The AFB-BJ algorithm is compared experimentally to other DisCOP algorithms (ADOPT, DPOP, OptAPO) and is shown to be a very efficient algorithm for DisCOPs.

* Journal Of Artificial Intelligence Research, Volume 34, pages 61-88, 2009

Via

Access Paper or Ask Questions