Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rahul Bapusaheb Kodag

Weakly Supervised Tabla Stroke Transcription via TI-SDRM: A Rhythm-Aware Lattice Rescoring Framework

Jan 13, 2026

Rahul Bapusaheb Kodag, Vipul Arora

Abstract:Tabla Stroke Transcription (TST) is central to the analysis of rhythmic structure in Hindustani classical music, yet remains challenging due to complex rhythmic organization and the scarcity of strongly annotated data. Existing approaches largely rely on fully supervised learning with onset-level annotations, which are costly and impractical at scale. This work addresses TST in a weakly supervised setting, using only symbolic stroke sequences without temporal alignment. We propose a framework that combines a CTC-based acoustic model with sequence-level rhythmic rescoring. The acoustic model produces a decoding lattice, which is refined using a \textbf{$T\bar{a}la$}-Independent Static--Dynamic Rhythmic Model (TI-SDRM) that integrates long-term rhythmic structure with short-term adaptive dynamics through an adaptive interpolation mechanism. We curate a new real-world tabla solo dataset and a complementary synthetic dataset, establishing the first benchmark for weakly supervised TST in Hindustani classical music. Experiments demonstrate consistent and substantial reductions in stroke error rate over acoustic-only decoding, confirming the importance of explicit rhythmic structure for accurate transcription.

Via

Access Paper or Ask Questions

Meta-learning-based percussion transcription and $t\bar{a}la$ identification from low-resource audio

Jan 08, 2025

Rahul Bapusaheb Kodag, Vipul Arora

$Figure 1 for Meta-learning-based percussion transcription and $t\bar{a}la$ identification from low-resource audio$

$Figure 2 for Meta-learning-based percussion transcription and $t\bar{a}la$ identification from low-resource audio$

$Figure 3 for Meta-learning-based percussion transcription and $t\bar{a}la$ identification from low-resource audio$

$Figure 4 for Meta-learning-based percussion transcription and $t\bar{a}la$ identification from low-resource audio$

Abstract:This study introduces a meta-learning-based approach for low-resource Tabla Stroke Transcription (TST) and $t\bar{a}la$ identification in Hindustani classical music. Using Model-Agnostic Meta-Learning (MAML), we address the challenge of limited annotated datasets, enabling rapid adaptation to new tasks with minimal data. The method is validated across various datasets, including tabla solo and concert recordings, demonstrating robustness in polyphonic audio scenarios. We propose two novel $t\bar{a}la$ identification techniques based on stroke sequences and rhythmic patterns. Additionally, the approach proves effective for Automatic Drum Transcription (ADT), showcasing its flexibility for Indian and Western percussion music. Experimental results show that the proposed method outperforms existing techniques in low-resource settings, significantly contributing to music transcription and studying musical traditions through computational tools.

* arXiv admin note: substantial text overlap with arXiv:2407.20935

Via

Access Paper or Ask Questions

$T\bar{a}laGen:$ A System for Automatic $T\bar{a}la$ Identification and Generation

Jul 30, 2024

Rahul Bapusaheb Kodag, Himanshu Jindal, Vipul Arora

Abstract:In Hindustani classical music, the tabla plays an important role as a rhythmic backbone and accompaniment. In applications like computer-based music analysis, learning singing, and learning musical instruments, tabla stroke transcription, $t\bar{a}la$ identification, and generation are crucial. This paper proposes a comprehensive system aimed at addressing these challenges. For tabla stroke transcription, we propose a novel approach based on model-agnostic meta-learning (MAML) that facilitates the accurate identification of tabla strokes using minimal data. Leveraging these transcriptions, the system introduces two novel $t\bar{a}la$ identification methods based on the sequence analysis of tabla strokes. \par Furthermore, the paper proposes a framework for $t\bar{a}la$ generation to bridge traditional and modern learning methods. This framework utilizes finite state transducers (FST) and linear time-invariant (LTI) filters to generate $t\bar{a}las$ with real-time tempo control through user interaction, enhancing practice sessions and musical education. Experimental evaluations on tabla solo and concert datasets demonstrate the system's exceptional performance on real-world data and its ability to outperform existing methods. Additionally, the proposed $t\bar{a}la$ identification methods surpass state-of-the-art techniques. The contributions of this paper include a combined approach to tabla stroke transcription, innovative $t\bar{a}la$ identification techniques, and a robust framework for $t\bar{a}la$ generation that handles the rhythmic complexities of Hindustani music.

Via

Access Paper or Ask Questions