Jean-Francois Ton

Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs

Feb 04, 2025
Via arXiv

Understanding Chain-of-Thought in LLMs through Information Theory

Nov 18, 2024
Via arXiv

Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives

Nov 07, 2024
Via arXiv

ACC-Debate: An Actor-Critic Approach to Multi-Agent Debate

Nov 04, 2024
Via arXiv

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

Mar 08, 2024
Via arXiv

Dataset Fairness: Achievable Fairness on Your Data With Utility Guarantees

Feb 27, 2024
Via arXiv

Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting

Feb 16, 2024
Via arXiv

Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits

Dec 03, 2023
Via arXiv

Deep Concept Removal

Oct 09, 2023
Via arXiv

Invariant Learning via Probability of Sufficient and Necessary Causes

Sep 22, 2023
Via arXiv