The production of artificial speech for output by a digital-to-analog (D/A) converter. The input is typically a file of text. The synthesis may be performed by either hardware or software, the Klatt synthesizer being an example. The speech signal may be produced by concatenation of prerecorded words or of elements of real speech (such as phonemes), or by pure synthesis driven by data derived from a complex analysis of the input text.
The quality of the speech produced depends greatly on the techniques employed in both the lexical analysis phase and the synthesis phase. The particular problem with phoneme concatenation is how to perform the join, since fluent speech requires smooth transitions between adjacent units. Other commonly used elements of speech are demisyllables, syllables, words, and word systems. As the units become longer the quality increases, but so do the storage requirement and the processing overhead.
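The join problem can be illustrated with a minimal sketch. It assumes each prerecorded unit is simply a list of sample values, and it smooths each join with a short linear crossfade. This is one simple way to soften a join, not the method of any particular synthesizer. The function name crossfade_concat and the overlap parameter are hypothetical, chosen only for this illustration.

```python
def crossfade_concat(units, overlap):
    """Concatenate sample lists, blending up to `overlap` samples at each join.

    `units` is a list of sample sequences (hypothetical prerecorded units).
    The tail of the output so far is mixed with the head of the next unit
    using linearly changing weights, so the signal does not jump abruptly.
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            # Weight of the incoming unit rises steadily across the overlap.
            w = (i + 1) / (n + 1)
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        # Append the part of the unit that lies beyond the overlap region.
        out.extend(unit[n:])
    return out


if __name__ == "__main__":
    a = [1.0, 1.0, 1.0, 1.0]
    b = [0.0, 0.0, 0.0, 0.0]
    print(crossfade_concat([a, b], overlap=2))
```

With an overlap of zero the units are simply butted together, which is where audible discontinuities tend to arise. A longer overlap gives a smoother transition but blurs more of each unit, and the result is shorter than the plain concatenation by the overlap length at every join.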