James Glass

MIT Computer Science and Artificial Intelligence Laboratory, MA, USA

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Feb 13, 2025
Via arXiv

mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition

Feb 03, 2025
Via arXiv

State-Space Large Audio Language Models

Nov 24, 2024
Via arXiv

Teaching VLMs to Localize Specific Objects from In-context Examples

Nov 20, 2024
Via arXiv

DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models

Oct 31, 2024
Via arXiv

A Closer Look at Neural Codec Resynthesis: Bridging the Gap between Codec and Waveform Generation

Oct 29, 2024
Via arXiv

Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback

Oct 28, 2024
Via arXiv

Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains

Oct 24, 2024
Via arXiv

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

Oct 08, 2024
Via arXiv

Quantifying Generalization Complexity for Large Language Models

Oct 02, 2024
Via arXiv