Jiantao Jiao

How to Evaluate Reward Models for RLHF

Oct 18, 2024
Via arXiv

Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs

Oct 17, 2024
Via arXiv

Thinking LLMs: General Instruction Following with Thought Generation

Oct 14, 2024
Via arXiv

EmbedLLM: Learning Compact Representations of Large Language Models

Oct 03, 2024
Via arXiv

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge

Jul 28, 2024
Via arXiv

Universal evaluation and design of imaging systems using information estimation

May 31, 2024
Via arXiv

Toxicity Detection for Free

May 29, 2024
Via arXiv

Toward a Theory of Tokenization in LLMs

Apr 12, 2024
Via arXiv

Generative AI Security: Challenges and Countermeasures

Feb 20, 2024
Via arXiv

Efficient Prompt Caching via Embedding Similarity

Feb 02, 2024
Via arXiv