INTRODUCTION |
In speech synthesis, the computer generates a human-sounding voice. A text-to-speech (TTS) system converts written text into spoken output. Generating human language automatically is difficult, but the field has made a lot of progress in recent years. Compared to playing back pre-recorded voice clips, TTS has the advantage of being very flexible: it can speak any text. Speech synthesis is a part of computational linguistics, so developing a TTS system requires close collaboration between linguists and computer scientists.

The speech synthesis software used in TigerJython is called MaryTTS and was developed at the Department of Computational Linguistics and Phonetics of Saarland University in Germany. The system uses large library files that you download separately and then unzip. In the same directory as tigerjython2.jar, create the subdirectory Lib (only if it does not already exist) and copy the unzipped files into it.

PROGRAMMING CONCEPTS:
Speech synthesis, artificial speech, text-to-speech system |
SPEAKING A TEXT IN 4 LANGUAGES |
In this release, MaryTTS provides different voices that speak German, English, French and Italian. You choose the voice with selectVoice(). After that you call the function generateVoice() and pass it the text to be spoken. It returns a list of the generated sound samples, which you can play back with a sound player.

    from soundsystem import *

    initTTS()

    selectVoice("german-man")
    #selectVoice("german-woman")
    #selectVoice("english-man")
    #selectVoice("english-woman")
    #selectVoice("french-woman")
    #selectVoice("french-man")
    #selectVoice("italian-woman")

    text = "Danke dass du mir eine Sprache gibst. Viel Spass beim Programmieren"
    #text = "Thank you for giving me a voice. Enjoy programming"
    #text = "Merci pour me donner une voix. Profitez de la programmation"
    #text = "Grazie che tu mi dia una lingua. Godere della programmazione"

    voice = generateVoice(text)
    openSoundPlayer(voice)
    play()
|
MEMO |
To let the program speak with a different voice, comment out the active selectVoice() and text lines and uncomment the pair you want. You must always call initTTS() first in order to prepare the speech synthesis software. You can also pass initTTS() the path to the directory containing the MaryTTS data files as a parameter. By default it is the subdirectory Lib. |
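Instead of commenting lines in and out, the voice name could be assembled from a language and a gender. The following is only a sketch: voice_name() and AVAILABLE_VOICES are hypothetical helpers, not part of the soundsystem module, and the list contains just the voice names that appear in the program above.

```python
# Voice names exactly as they appear in the example program above.
# This list is an assumption about what the release offers, not an official catalog.
AVAILABLE_VOICES = [
    "german-man", "german-woman",
    "english-man", "english-woman",
    "french-man", "french-woman",
    "italian-woman",
]

def voice_name(language, gender):
    """Return a voice name such as 'german-man' (hypothetical helper)."""
    name = language.lower() + "-" + gender.lower()
    if name not in AVAILABLE_VOICES:
        raise ValueError("No such voice: " + name)
    return name

# In TigerJython you would then write (requires the soundsystem module):
#   initTTS()
#   selectVoice(voice_name("english", "woman"))
```

Checking the name before calling selectVoice() means that a typing mistake produces a clear error message at the point where the voice is chosen.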
ANNOUNCING TODAY'S DATE AND THE CURRENT TIME |
There are numerous applications of speech synthesis. People with visual impairments can have texts read aloud to them, and navigation systems as well as announcements in trains and train stations often use synthetically generated voices. Many interactive computer games also use artificially generated voices.

    from soundsystem import *
    import datetime

    language = "german"
    #language = "english"
    #language = "french"

    initTTS()

    if language == "german":
        selectVoice("german-woman")
        month = ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
                 "August", "September", "Oktober", "November", "Dezember"]
    if language == "english":
        selectVoice("english-man")
        month = ["January", "February", "March", "April", "May", "June", "July",
                 "August", "September", "October", "November", "December"]
    if language == "french":
        selectVoice("french-man")
        month = ["Janvier", "Février", "Mars", "Avril", "Mai", "Juin", "Juillet",
                 "Août", "Septembre", "Octobre", "Novembre", "Décembre"]

    now = datetime.datetime.now()

    if language == "german":
        text = "Heute ist der " + str(now.day) + ". " \
            + month[now.month - 1] + " " + str(now.year) + ".\n" \
            + "Die genaue Zeit ist " + str(now.hour) + " Uhr " + str(now.minute)
    if language == "english":
        text = "Today is " + month[now.month - 1] + " " \
            + str(now.day) + ", " + str(now.year) + ".\n" \
            + "The time is " + str(now.hour) + " hours " + str(now.minute) \
            + " minutes."
    if language == "french":
        text = "Nous sommes le " + str(now.day) + " " \
            + month[now.month - 1] + " " + str(now.year) + ".\n" \
            + "Il est exactement " + str(now.hour) + " heures " \
            + str(now.minute) + " minutes."

    print(text)
    voice = generateVoice(text)
    openSoundPlayer(voice)
    play()
|
MEMO |
By changing which of the language lines is commented out, you choose between the German, the English and the French speaker. The function datetime.datetime.now() returns an object that holds the current date and time in its attributes year, month, day, hour, minute, second and microsecond. As you can see, you can use the backslash as a line continuation when you build long strings. |
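The attributes of a datetime object can be tried out without the speech software. The sketch below builds only the English announcement and uses a fixed date and time so that the result is reproducible; english_announcement() and MONTHS are names chosen for this illustration.

```python
import datetime

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def english_announcement(moment):
    """Build the English announcement text from a datetime object."""
    # month runs from 1 to 12, list indices from 0 to 11
    date_part = "Today is {0} {1}, {2}.".format(
        MONTHS[moment.month - 1], moment.day, moment.year)
    time_part = "The time is {0} hours {1} minutes.".format(
        moment.hour, moment.minute)
    return date_part + "\n" + time_part

# A fixed moment instead of datetime.datetime.now()
fixed = datetime.datetime(2024, 3, 5, 14, 7)
print(english_announcement(fixed))
# Today is March 5, 2024.
# The time is 14 hours 7 minutes.
```

Putting the text construction into a function that receives the datetime object as a parameter makes it easy to check the wording with any date you like before passing the text to generateVoice().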
CREATING YOUR OWN GRAPHICAL USER INTERFACE |
As you have already learned in chapter 3.13, it is quite easy to create a simple dialog window based on TigerJython's EntryDialog class. As in many programming environments, the classic controls such as text fields, push buttons, check boxes, radio buttons and sliders are modeled by software objects. These objects appear in a surrounding rectangular pane, and the dialog remains open while the program continues (such a dialog is called a modeless dialog). For comprehensive information, please consult the APLU documentation.

Your program opens a modeless dialog where you select the speaker using radio buttons. When you click the confirmation button, the text in the text field is read by a synthetic voice.
    from soundsystem import *
    from entrydialog import *

    # Radio buttons for the speaker selection; the first one is preselected
    speaker1 = RadioEntry("Mann (Deutsch)")
    speaker1.setValue(True)
    speaker2 = RadioEntry("Man (English)")
    speaker3 = RadioEntry("Homme (Français)")
    speaker4 = RadioEntry("Donna (Italiano)")
    pane1 = EntryPane("Speaker Selection",
                      speaker1, speaker2, speaker3, speaker4)
    textEntry = StringEntry("Message:", "Viel Spass am Programmieren")
    pane2 = EntryPane(textEntry)
    okButton = ButtonEntry("Speak")
    pane3 = EntryPane(okButton)
    dlg = EntryDialog(pane1, pane2, pane3)
    dlg.setTitle("Synthetic Voice")

    initTTS()

    while not dlg.isDisposed():
        # Show a matching sample text when another speaker is selected
        if speaker1.isTouched():
            textEntry.setValue("Viel Spass am Programmieren")
        elif speaker2.isTouched():
            textEntry.setValue("Enjoy programming")
        elif speaker3.isTouched():
            textEntry.setValue("Profitez de la programmation")
        elif speaker4.isTouched():
            textEntry.setValue("Godere della programmazione")

        # Speak the current text with the selected voice
        if okButton.isTouched():
            if speaker1.getValue():
                selectVoice("german-man")
            elif speaker2.getValue():
                selectVoice("english-man")
            elif speaker3.getValue():
                selectVoice("french-man")
            elif speaker4.getValue():
                selectVoice("italian-woman")
            text = textEntry.getValue()
            if text != "":
                voice = generateVoice(text)
                openSoundPlayer(voice)
                play()
|
MEMO |
The while loop runs until the dialog is closed with the title bar's close button. In every cycle you check with isTouched() whether the confirmation button has been clicked since the last call of this function. If so, you get the current values of the GUI elements by calling getValue() and turn the text in the text field into a voice as in the preceding examples. Such "tight" loops are a bit dangerous because you waste a lot of processing time on nothing more than checking whether the button was pressed. However, each call of isTouched() automatically pauses the program for a short time (1 ms), so the loop runs slightly more slowly and uses less processor time. |
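The same idea of slowing down a polling loop can be written out explicitly with time.sleep(). The following sketch does not use the TigerJython dialog at all; poll_until() is a hypothetical helper and the 1 ms interval is simply taken from the description above.

```python
import time

def poll_until(condition, interval=0.001, timeout=1.0):
    """Call condition() repeatedly until it returns True.

    Between two checks the program sleeps for 'interval' seconds so that
    the loop does not use the processor all the time. Returns True if the
    condition became true, or False if 'timeout' seconds passed first.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if condition():
            return True
        time.sleep(interval)  # give the processor a short break
    return False

# Stand-in for a button: it reports "clicked" on the third check
checks = []

def clicked_on_third_check():
    checks.append(1)
    return len(checks) >= 3

print(poll_until(clicked_on_third_check))
```

A longer interval reduces the load on the processor further, but the program then reacts to a click with a correspondingly longer delay.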
EXERCISES |
|