Cortical circuits rely on the temporal regularities of speech to optimize signal parsing for sound-to-meaning mapping. Bottom-up speech analysis is accelerated by top-down predictions about upcoming words. In everyday communication, however, listeners are regularly presented with challenging input: fluctuations of speech rate or semantic content. In this study, we asked how reducing the temporal regularity of speech affects its processing: parsing, phonological analysis, and the ability to generate context-based predictions. To ensure that spoken sentences were natural and approximated the semantic constraints of spontaneous speech, we built a neural network to select stimuli from large corpora. We analyzed brain activity recorded with magnetoencephalography during sentence listening using evoked responses, speech-to-brain synchronization, and representational similarity analysis. For normal speech, theta-band (6.5-8 Hz) speech-to-brain synchronization was increased and the left fronto-temporal areas generated stronger contextual predictions. The reverse was true for temporally irregular speech: weaker theta synchronization and reduced top-down effects. Interestingly, delta-band (0.5 Hz) speech tracking was greater when contextual/semantic predictions were lower or if speech was temporally jittered. We conclude that speech temporal regularity is relevant for (theta) syllabic tracking and robust semantic predictions, while the joint support of temporal and contextual predictability reduces word- and phrase-level cortical tracking (delta).
Keywords: MEG; coherence; neural network; phonological processing; representational similarity analysis; semantic predictions.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.