Ananth Grama

Inference-Time Code Selection via Symbolic Equivalence Partitioning

Apr 07, 2026

SARL: Label-Free Reinforcement Learning by Rewarding Reasoning Topology

Mar 30, 2026

More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment

Apr 03, 2025

No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models

Oct 24, 2024

Generalized Learning of Coefficients in Spectral Graph Convolutional Networks

Sep 07, 2024

Deconvolving Complex Neuronal Networks into Interpretable Task-Specific Connectomes

Jun 28, 2024

Cascade Reward Sampling for Efficient Decoding-Time Alignment

Jun 24, 2024

Robust Online Classification: From Estimation to Denoising

Sep 04, 2023

Online Learning in Dynamically Changing Environments

Jan 31, 2023

Expected Worst Case Regret via Stochastic Sequential Covering

Sep 17, 2022