Fedor Moiseev

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

SamToNe: Improving Contrastive Loss for Dual Encoder Retrieval Models with Same Tower Negatives

Jun 05, 2023

SKILL: Structured Knowledge Infusion for Large Language Models

May 17, 2022

Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned

Jun 07, 2019