Three large corpora of spontaneous spoken Hungarian monologue and dialogue, totalling more than 500 hours, will be segmented and annotated, partly manually and partly automatically, and made publicly available for research. Segmentation and annotation will cover the phoneme, word, and prosodic phrase levels.
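To make the three annotation levels concrete, the minimal sketch below models phoneme, word, and prosodic phrase tiers over a shared timeline. It is illustrative only: the tier names, time units, and labels (including the "IP" phrase label) are our assumptions, not the project's published annotation scheme or file format.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative data structure only: the project does not specify its annotation
# format, so tier names, time units, and label sets here are assumptions.

@dataclass
class Interval:
    start: float   # seconds
    end: float     # seconds
    label: str     # e.g. a phoneme symbol, word form, or phrase type

@dataclass
class Tier:
    name: str
    intervals: List[Interval] = field(default_factory=list)

@dataclass
class Annotation:
    recording_id: str
    tiers: List[Tier] = field(default_factory=list)

    def tier(self, name: str) -> Tier:
        return next(t for t in self.tiers if t.name == name)

# A toy annotation of one Hungarian word ("ház") aligned on all three levels.
example = Annotation(
    recording_id="demo_001",
    tiers=[
        Tier("phoneme", [Interval(0.00, 0.08, "h"), Interval(0.08, 0.21, "aː"),
                         Interval(0.21, 0.29, "z")]),
        Tier("word", [Interval(0.00, 0.29, "ház")]),
        Tier("prosodic_phrase", [Interval(0.00, 0.29, "IP")]),  # "IP" is a placeholder label
    ],
)

if __name__ == "__main__":
    for iv in example.tier("phoneme").intervals:
        print(f"{iv.start:.2f}-{iv.end:.2f}\t{iv.label}")
```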
The state-of-the-art machine learning models developed for this purpose, namely a deep-learning-based ASR system and a text-speech alignment system for Hungarian, as well as models for automatic prosodic annotation, will be made publicly available for research. For details, please visit https://phon.nytud.hu/bea/.
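The summary does not say how the released ASR models will be packaged, so the sketch below shows only one plausible usage pattern, assuming a wav2vec2-style CTC checkpoint published on the Hugging Face Hub; the model identifier, audio file name, and 16 kHz sampling-rate assumption are ours, not the project's.

```python
# Hedged sketch: applying a hypothetical wav2vec2-style CTC model for Hungarian ASR.
import soundfile as sf
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

MODEL_ID = "org/hungarian-wav2vec2-ctc"  # hypothetical identifier, not an official project release

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
model.eval()

# Load a 16 kHz mono recording (resample beforehand if needed).
speech, sample_rate = sf.read("interview_excerpt.wav", dtype="float32")
assert sample_rate == 16_000, "expected 16 kHz audio"

inputs = processor(speech, sampling_rate=sample_rate, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: pick the most likely token per frame, then collapse repeats/blanks.
predicted_ids = torch.argmax(logits, dim=-1)
transcript = processor.batch_decode(predicted_ids)[0]
print(transcript)
```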
Backed by these large spontaneous speech data, models of the grammar-discourse interface will be tested, refined, and revised. We will provide a phonetically based prosodic phrasing system for Hungarian, together with an annotation scheme for it. Finally, light will be shed on how to disentangle the influences of the language system and of spontaneous speech on the acoustics of prosodic phrasing.
In sum, this project will, first, provide linguistic and phonetic insights based on analyses of large spontaneous speech databases; second, deliver insights into, and technology for, automatically segmenting and annotating large speech data; and third, thereby enable other researchers to work with the corpora introduced here or to process their own data in the future.