Roman Novak

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
Aug 14, 2024

Scaling Exponents Across Parameterizations and Optimizers
Jul 08, 2024

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Dec 22, 2023

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"
Nov 15, 2023

Small-scale proxies for large-scale Transformer training instabilities
Sep 25, 2023

Fast Neural Kernel Embeddings for General Activations
Sep 09, 2022

Fast Finite Width Neural Tangent Kernel
Jun 17, 2022
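
The quantity this paper accelerates is the empirical (finite-width) neural tangent kernel. As a rough illustrative sketch only (not the paper's own, faster algorithms), the kernel of a scalar-output network can be formed by naive Jacobian contraction, Theta(x1, x2) = J(x1) J(x2)^T, in JAX; all names below are illustrative:

import jax
import jax.numpy as jnp

def empirical_ntk(apply_fn, params, x1, x2):
    # Jacobians of the scalar network outputs w.r.t. every parameter leaf.
    j1 = jax.jacobian(lambda p: apply_fn(p, x1))(params)
    j2 = jax.jacobian(lambda p: apply_fn(p, x2))(params)

    def contract(a, b):
        # Flatten parameter axes and contract: (n1, P) @ (P, n2) -> (n1, n2).
        return a.reshape(a.shape[0], -1) @ b.reshape(b.shape[0], -1).T

    # Sum per-leaf contributions to obtain the full kernel.
    return sum(jax.tree_util.tree_leaves(jax.tree_util.tree_map(contract, j1, j2)))

# Toy usage with a hypothetical one-hidden-layer MLP.
def apply_fn(params, x):
    w1, w2 = params
    return jnp.tanh(x @ w1) @ w2  # one scalar output per example

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = (jax.random.normal(k1, (3, 16)), jax.random.normal(k2, (16,)))
x1 = jax.random.normal(jax.random.PRNGKey(1), (5, 3))
x2 = jax.random.normal(jax.random.PRNGKey(2), (4, 3))
print(empirical_ntk(apply_fn, params, x1, x2).shape)  # (5, 4)

This naive contraction materializes full Jacobians, which is expensive in both compute and memory; per its title, the paper targets computing the same kernel more efficiently.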

Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
Jun 15, 2022

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Jun 10, 2022

Dataset Distillation with Infinitely Wide Convolutional Networks
Jul 27, 2021