Published November 27, 2014 | Version v1
Publication

Integrating prosodic information into a speech recogniser

Description

In the last decade there has been an increasing tendency to incorporate language engineering strategies into speech technology. This approach combines linguistic and mathematical information in applications such as machine translation, natural language processing, speech synthesis and automatic speech recognition (ASR). In the field of speech synthesis, this hybrid approach (linguistic and mathematical/statistical) has led to the design of efficient models for reproducing the acoustic features of natural language. However, the incorporation of language engineering strategies into ASR is only beginning. In this paper, we present a theoretical framework for the integration of linguistic information into an ASR system. The objective is to design a model that can detect the suprasegmental features of the speech input, mainly those related to the fundamental frequency (F0), which can clarify the function of pauses, intonation contours, and interruptions. This specification model has been designed in the framework of a dialogue system.

Additional details

Created:
March 27, 2023
Modified:
December 1, 2023