Abstract: Non-Māori-speaking New Zealanders (NMS) are able to segment Māori words in a highly similar way to fluent speakers (Panther et al., 2024). This ability is assumed to derive from the identification and extraction of statistically recurrent forms. We examine this assumption by asking how NMS segmentations compare to those produced by Morfessor, an unsupervised machine learning model that operates on statistical recurrence, across words formed by a variety of morphological processes. Both NMS and Morfessor succeed in segmenting words formed by concatenative processes (compounding and affixation without allomorphy), but NMS also succeed for words that invoke templates (reduplication and allomorphy) and other cues to morphological structure, implying that their learning process is sensitive to more than just statistical recurrence.
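A statistical-recurrence baseline of the kind described here can be sketched with the Morfessor 2.0 Python package; the corpus path, training settings, and example word below are illustrative assumptions, not the paper's exact setup.

```python
import morfessor

# Read a newline-delimited word list (illustrative file name).
io = morfessor.MorfessorIO()
train_data = list(io.read_corpus_file("maori_wordlist.txt"))

# Train the unsupervised baseline model, which proposes segments
# purely from statistically recurrent substrings in the corpus.
model = morfessor.BaselineModel()
model.load_data(train_data)
model.train_batch()

# Segment a word; viterbi_segment returns (segments, cost).
segments, cost = model.viterbi_segment("whakakaha")
print(segments)  # e.g. ['whaka', 'kaha'] if the causative prefix recurs often enough
```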
Abstract: Self-attention mechanisms have enabled transformers to achieve superhuman-level performance on many speech-to-text (STT) tasks, yet the challenge of automatic prosodic segmentation has remained unsolved. In this paper we finetune Whisper, a pretrained STT model, to annotate intonation unit (IU) boundaries by repurposing low-frequency tokens. Our approach achieves an accuracy of 95.8%, outperforming previous methods without the need for large-scale labeled data or enterprise-grade compute resources. We also degrade the input signal by applying a series of filters, finding that low-pass filtering at 3.2 kHz improves segmentation performance in out-of-sample and out-of-distribution contexts. We release our model as both a transcription tool and a baseline for further improvements in prosodic segmentation.
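The filtering step can be reproduced in outline with SciPy; the 3.2 kHz cutoff comes from the abstract, while the filter type, order, file names, and the zero-phase design are assumptions, since the abstract does not specify them.

```python
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

def lowpass(audio: np.ndarray, sr: int, cutoff_hz: float = 3200.0, order: int = 8) -> np.ndarray:
    """Zero-phase Butterworth low-pass filter; cutoff per the abstract, order is an assumption."""
    sos = butter(order, cutoff_hz, btype="low", fs=sr, output="sos")
    return sosfiltfilt(sos, audio)

# Illustrative usage: filter an utterance before passing it to the
# finetuned Whisper model, whose repurposed low-frequency token marks IU boundaries.
audio, sr = sf.read("utterance.wav")          # hypothetical input file
sf.write("utterance_lp3200.wav", lowpass(audio, sr), sr)
```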