Online tutorials

Downloads

Multivox 4

Hungarian text-to-speech converter. Downloadable and free for any use.

 
 

Speech analyser for displaying the formants

(a robust formant tracker)

   

Speech and speech technology
a Hungarian tutorial CD-ROM
by Kovács, M. & Olaszy, G. & Nikléczy, P. & Gósy, M.

Magyar nyelvi beszédtechnológiai alapismeretek (Fundamentals of Hungarian speech technology)

Run the program

The aim of this tutorial is to build a bridge between the humanities and the exact sciences, concentrating on one new trend in speech communication: communication between man and machine. A linguist can learn about the acoustic structure of speech and speech sounds, and about the problems that arise in developing different speech technology applications. An engineer can study the physiological and phonetic structure of speech and see how linguistic knowledge is built into speech technology solutions. The CD-ROM (550 MB, 4600 files) presents the material on a total of 620 screen pages in the form of pictures, texts, speech, animations and examples.

Speech technology is a new field of science that began to develop rapidly in the last decades of the 20th century. It takes both the earlier and the newest results of traditional speech research (linguistics, phonetics) and applies them, within the framework of artificial intelligence, to information technology research and applications. The field of these applications is rather wide. The most important areas are speech recognition, speaker verification by voice, automatic speech generation from text, speech therapy, and new types of hearing measurements.

The material on the CD shows how broad an interdisciplinary knowledge (linguistics, phonetics, physiology, acoustics, mathematics, informatics, electrical engineering) is needed if we want to replace one side of the human verbal communication process with a machine. Good results in speech technology research and applications can be achieved only if we combine the knowledge of all these fields and find a common language among them.

The CD uses up-to-date multimedia technology combined with some digital speech processing techniques, such as slowing down or speeding up a speech item, showing its melody contour, zooming into the waveform, making timing measurements between sounds, and displaying the sound boundaries within the speech item.

The content of the CD is organised in a three-level hierarchy. Nine main chapters cover the whole topic, from speech production and hearing to digital speech processing. The main chapters contain subchapters, as shown below.

 

Contents

The subchapters are divided into detailed topics. For example, subchapter 7.1 contains five topics: 7.1.1 Kempelen Farkas; 7.1.2 Reconstruction of Kempelen’s speaking machine; 7.1.3 Speech sounds, syllables and words generated in 2001 with the technology from 1791; 7.1.4 The world’s first patent for TTS conversion; 7.1.5 The first Hungarian TTS system.

The material on the CD is written to a scientific standard but in an everyday style. A wide range of people can therefore use it successfully, from students to researchers, teachers and engineers. A glossary of expressions also helps users learn the meaning of the scientific terms.

The framework of this CD makes it possible to develop similar material for other languages.
For details, please send an e-mail to olaszy@nytud.hu


 

In memoriam Kempelen Farkas

Run the program

The program pays tribute to Farkas Kempelen and his work in phonetics.

The polymath Farkas Kempelen (Wolfgang von Kempelen) is considered to have been the world’s first phonetician, since he created the first mechanical instrument able to emit signals very similar to human speech. He spent 22 years perfecting his speaking machine and described the final version in his book, published in Vienna in 1791. The machine spoke in a child’s voice, in several languages, and could produce words and short sentences. Without knowing it, Kempelen started a new era with his speaking machine: the age of speech-mediated man-machine interaction, which is only now beginning to be realised, in the twenty-first century.

The multimedia application "In memoriam Kempelen Farkas" presents the reconstructed working replica of Kempelen’s speaking machine. The replica was designed and built by Péter Nikléczy and Dr Gábor Olaszy in 2001. Thus we can go back 210 years and study the working of one of the most precious instruments of the Baroque period. The acoustic patterns of the machine’s speech can be studied with today’s sophisticated signal processing methods, and Kempelen’s claims can be proved or disproved with measurement data.