
Michael I. Jordan

The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective

Jan 27, 2025
Via arXiv

Conformal Prediction Sets with Improved Conditional Coverage using Trust Scores

Jan 17, 2025
Via arXiv

Gradient Equilibrium in Online Learning: Theory and Applications

Jan 14, 2025
Via arXiv

An Optimistic Algorithm for Online Convex Optimization with Adversarial Constraints

Dec 11, 2024
Via arXiv

Dimension-free Private Mean Estimation for Anisotropic Distributions

Nov 01, 2024
Via arXiv

Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy

Oct 24, 2024
Via arXiv

Optimal Design for Reward Modeling in RLHF

Oct 23, 2024
Via arXiv

Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs

Oct 17, 2024
Via arXiv

Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry

Sep 05, 2024
Via arXiv

Defection-Free Collaboration between Competitors in a Learning System

Jun 22, 2024
Via arXiv