Jonathan Cohen

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025
Via arXiv

Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers

May 19, 2025
Via arXiv

Understanding Task Representations in Neural Networks via Bayesian Ablation

May 19, 2025
Via arXiv

Llama-Nemotron: Efficient Reasoning Models

May 02, 2025
Via arXiv

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025
Via arXiv

Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models

Feb 27, 2025
Via arXiv

Nemotron-4 340B Technical Report

Jun 17, 2024
Via arXiv

Nemotron-4 15B Technical Report

Feb 27, 2024
Via arXiv

NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails

Oct 16, 2023
Via arXiv

Beyond Transformers for Function Learning

Apr 19, 2023
Via arXiv