The Arithmancer

Voice Synthesis in Python

Updated: Aug 17, 2019

In this post, we will write some code to do speech synthesis using Google Text-to-Speech. We will then wrap that code in a function so that we can call it from our Magic Mirror code.


gTTS (Google Text-to-Speech) is a Python library to interface with Google Translate's text-to-speech API. It writes spoken mp3 data to a file which we can then play back.


Installing the gTTS Library on the Raspberry Pi

$ sudo pip3 install gTTS    

Python Function:

 
import os      # Needed for version 2 of our function
import pygame  # Needed for version 1 of our function
from gtts import gTTS

pygame.mixer.init()  # Set up the audio mixer once, at import time

def Speak(toSay, mylang):    # Version 1 uses the pygame library for playback
    tts = gTTS(text=toSay, lang=mylang)  # Build the text-to-speech request
    tts.save("/tmp/temp.mp3")            # Write the spoken mp3 data to a file
    pygame.mixer.music.set_volume(1.0)
    pygame.mixer.music.load("/tmp/temp.mp3")
    pygame.mixer.music.play()            # Returns immediately; audio plays in the background

def SpeakV2(toSay, mylang):    # Version 2 uses a system app for playback
    tts = gTTS(text=toSay, lang=mylang)
    tts.save("/tmp/temp.mp3")
    os.system("omxplayer /tmp/temp.mp3 &")      # & to run in background
 

Python Function Call:

Speak("Mirror Mirror on the Wall", "en")
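Because pygame.mixer.music.play() returns immediately, a short script that calls Speak and then exits may finish before anything is heard. Here is a minimal sketch of one way to wait for playback to end. The helper name wait_until_idle and its parameters are my own invention for illustration, not part of pygame or gTTS; it simply polls whatever "is it still busy?" function you hand it.

```python
import time

def wait_until_idle(is_busy, poll_interval=0.1, timeout=30.0):
    """Poll is_busy() until it returns a false value or timeout seconds pass.

    Returns True if playback finished, False if the timeout was reached.
    (Hypothetical helper, not part of pygame or gTTS.)
    """
    deadline = time.monotonic() + timeout
    while is_busy():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)  # Sleep briefly so we do not spin the CPU
    return True

# Intended use with version 1 (needs pygame and a working audio device):
#   Speak("Mirror Mirror on the Wall", "en")
#   wait_until_idle(pygame.mixer.music.get_busy)
```

Passing the busy check in as an argument keeps the helper independent of pygame, so it can be tried out without any audio hardware.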

The first version of the function uses the pygame library. While this gives more control over playback, the maximum volume I get from it is lower than with version 2. For that reason, I have included version 2, which uses omxplayer, a media player that comes pre-installed on the Raspberry Pi. While it is louder, the mp3 files are sometimes truncated!
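Both versions save to the same /tmp/temp.mp3, so a call made while an earlier clip is still playing overwrites the file that is in use. I have not established whether that has anything to do with the truncation mentioned above; the sketch below only shows one way to give each clip its own file. The helper name new_mp3_path is hypothetical and is not part of gTTS or pygame.

```python
import os
import tempfile

def new_mp3_path(directory=None):
    """Create an empty, uniquely named .mp3 file and return its path.

    (Hypothetical helper for illustration.) With directory=None the
    system's default temporary directory is used.
    """
    fd, path = tempfile.mkstemp(suffix=".mp3", prefix="speak_", dir=directory)
    os.close(fd)  # We only need the name; the audio data is written to it later
    return path
```

Inside Speak or SpeakV2, the returned path would replace the fixed "/tmp/temp.mp3" in both the save call and the playback call. Deleting the file once playback has finished is left out of this sketch.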


The function takes two arguments: "toSay", the sentence we would like to convert to speech, and "mylang", the language in which to pronounce that sentence. "en" is, of course, English, but many other languages are also supported.
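To see which language codes are available, gTTS provides tts_langs() in its gtts.lang module, which returns a dictionary mapping codes such as "en" to language names. Here is a small sketch that checks a code before trying to speak. The helper name check_language and its optional supported parameter are assumptions of mine for illustration, not part of gTTS.

```python
def check_language(code, supported=None):
    """Return the language name for code, or raise ValueError if it is unknown.

    (Hypothetical helper.) If supported is not given, the dictionary of
    languages is looked up from gTTS.
    """
    if supported is None:
        # Imported here so the helper can be tried without gTTS installed
        from gtts.lang import tts_langs
        supported = tts_langs()
    if code not in supported:
        raise ValueError("Unsupported language code: %r" % code)
    return supported[code]
```

Calling check_language(mylang) at the top of Speak would fail early with a clear message instead of partway through the speech request.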

