Bhiksha Raj

Language Technologies Institute, Carnegie Mellon University, Mohamed bin Zayed University of Artificial Intelligence

Tessellated Linear Model for Age Prediction from Voice

Jan 16, 2025

SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer

Dec 14, 2024

XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation

Dec 02, 2024

Perturbation Ontology based Graph Attention Networks

Nov 27, 2024

MACE: Leveraging Audio for Evaluating Audio Captioning Systems

Nov 05, 2024

FLAASH: Flow-Attention Adaptive Semantic Hierarchical Fusion for Multi-Modal Tobacco Content Analysis

Oct 25, 2024

On the Diversity of Synthetic Data and its Impact on Training Large Language Models

Oct 19, 2024

What Do Speech Foundation Models Not Learn About Speech?

Oct 16, 2024

Improving Speaker Representations Using Contrastive Losses on Multi-scale Features

Oct 07, 2024

RelUNet: Relative Channel Fusion U-Net for Multichannel Speech Enhancement

Oct 07, 2024