Create a text-to-speech application using Python


If you are looking for a free text-to-speech converter, you can build one yourself, and this article shows you how. We will build the application with Flask, Python, and gTTS. Flask is a micro web framework for Python. gTTS (Google Text-to-Speech) is a Python library and CLI tool that interfaces with the Google Translate text-to-speech API. The generated audio is saved as an mp3 file.

Prerequisite and Setup

Let’s set up the system for building the text-to-speech converter. As stated above, the Python library we are going to use is gTTS, and the web framework is Flask. We will build the application inside a Python virtual environment. You can check out my article on Python’s virtual environment if you don’t know how to set one up. The commands to install the required modules are given below. The file structure of the project can be seen on my GitHub.

$ mkdir text-audio
$ cd text-audio
text-audio $ virtualenv text-audio-venv
text-audio $ source text-audio-venv/bin/activate
(text-audio-venv) text-audio $ pip3 install flask
(text-audio-venv) text-audio $ pip3 install gTTS
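
Before moving on, it can help to confirm that both packages landed in the active virtual environment. The snippet below is a small sketch that uses only the standard library. The helper name missing_modules is my own, not part of Flask or gTTS.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# The pip package is called gTTS, but the module is imported as "gtts".
missing = missing_modules(["flask", "gtts"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("flask and gtts are both available")
```

If a name is reported as missing, check that the virtual environment is activated and run the pip3 commands again.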

Text to speech application

We take the input from the web interface. The user can type or paste the text and then select the language for the audio file. Note that the text must be written in the language you select. Finally, we process the text with gTTS and convert it into an audio file.
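
The flow described above can be sketched outside of Flask first. This is a minimal sketch, not the application code: the function names clean_text and text_to_mp3 and the example file name are assumptions made for illustration. The only gTTS calls used are the gTTS constructor with its lang argument and the save method.

```python
def clean_text(raw):
    """Strip surrounding whitespace and reject empty input."""
    text = (raw or "").strip()
    if not text:
        raise ValueError("no text to convert")
    return text


def text_to_mp3(raw_text, lang, out_path):
    """Convert text to speech and save it as an mp3 file (needs network access)."""
    # Imported here so clean_text can be tried without gTTS installed.
    from gtts import gTTS

    # The text should be written in the language named by `lang`.
    audio = gTTS(clean_text(raw_text), lang=lang)
    # The folder part of out_path must already exist.
    audio.save(out_path)
    return out_path


# Example call (hypothetical file name; this contacts Google Translate):
# text_to_mp3("Hello, world", "en", "hello.mp3")
```

Checking the text before calling gTTS gives a clear error for blank input instead of a failure inside the library.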

Source code of the Flask main file

The source code for the Flask file, app.py, is given below.

from flask import Flask, render_template, request
from gtts import gTTS

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def index():
    # status tells the template whether an audio file is ready (1) or not (0)
    status = 0
    if request.method == "POST":
        # Only convert when the form was submitted with the "Convert" button
        if request.form.get('mp3') == "Convert":
            text_data = request.form.get('text')
            language = request.form.get('lang')
            # Ask gTTS for speech in the selected language and save it as mp3;
            # the static/ folder must already exist
            audio = gTTS(text_data, lang=language)
            audio.save("static/your_audio.mp3")
            status = 1
    return render_template("index.html", status=status)

# Start the development server only when this file is run directly
if __name__ == "__main__":
    app.run(debug=True)
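
One thing to be aware of in the code above is that every request writes to the same file, static/your_audio.mp3, so a later conversion overwrites an earlier one. If that matters for your use, one possible approach is to generate a fresh file name per request. The sketch below uses only the standard library. The helper name new_audio_path and the audio_ prefix are my own choices for illustration.

```python
import uuid
from pathlib import Path

STATIC_DIR = Path("static")  # Flask serves files from a folder named "static" by default


def new_audio_path(static_dir=STATIC_DIR):
    """Return a fresh mp3 path inside static_dir, creating the folder if needed."""
    static_dir = Path(static_dir)
    static_dir.mkdir(parents=True, exist_ok=True)
    # uuid4 gives a random name, so two requests do not share a file.
    return static_dir / f"audio_{uuid.uuid4().hex}.mp3"
```

To use it, you would save the audio with audio.save(str(path)) and pass path.name to the template so the player and the download link point at the new file. Files created this way pile up over time, so they would need to be cleaned up separately.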

 

Source code for user interface

The source code for the user interface file, index.html, is given below. Flask’s render_template looks for it in the templates folder of the project.

<!DOCTYPE html>
<html>
<head>
	<meta charset="utf-8">
	<meta name="viewport" content="width=device-width, initial-scale=1">
	<title>Text to Speech Converter</title>
</head>
<body>
	<div style="background-color: yellow; position: absolute; top: 50%; left: 50%; transform: translate(-50%, -50%); padding: 20px; box-shadow: 10px 10px grey; text-align: center;">
		<h1 style="">Python Text to Speech(mp3) Converter</h1>
		<hr>
		<style type="text/css">
			.button{
				transition: 0.5s;
				padding: 10px;
				font-size: 15px;
				color: white;
				background-color: deepskyblue;
			}
			.button:hover{
				background-color: green;
				font-size: 15px;
				color: white;
			}
		</style>
		<form action="/" method="post">
			<label> Enter the text</label><br>
			<textarea name = "text" rows="10" cols="70" placeholder="Enter the text you want to convert to audio" required></textarea><br><br>
			<label>Select the Language</label>
			<select name="lang">
				<option value="en">English</option>
				<option value="es">Spanish</option>
				<option value="hi">Hindi</option>
				<option value="fr">French</option>
				<option value="ne">Nepali</option>
				<option value="sv">Swedish</option>
				<option value="th">Thai</option>
				<option value="zh-CN">Chinese</option>
				<option value="ko">Korean</option>
				<option value="de">German</option>
				<option value="id">Indonesian</option>
				<option value="it">Italian</option>
				<option value="ja">Japanese</option>
				<option value="ru">Russian</option>

			</select><br><br>
			<input class="button" type="submit" value="Convert" name="mp3"><br><br>
		</form>
		{% if status != 0 %}
			<table>
				<tr>
					<th>
						<audio controls>
							<source src="static/your_audio.mp3" type="audio/mpeg">
						</audio>					
					</th>
					<th>
						<a href="static/your_audio.mp3" download><button class="button">Download</button></a>	
					</th>
					<th>
						<a href="{{url_for('index')}}"><button class="button">Clear Result</button></a>
					</th>
				</tr>
			</table>	
		{% endif %}		
	</div>
</body>
</html>

The user interface looks like the picture below. You have a text area to type your text and a drop-down to select the language. I have included only a few language options, but you can add more. To list the supported languages, import the “gtts” library in a Python shell and call “gtts.lang.tts_langs()”. Then click the Convert button to generate the audio file, which will be in mp3 format.

[Image: text to speech converter interface]
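
If you add more options to the language drop-down, it may be worth checking the submitted code against what gTTS reports as supported. The following is a minimal sketch. The function names resolve_language and supported_languages are my own, and falling back to English is an assumption rather than something the application above does. The only gTTS call is gtts.lang.tts_langs(), which returns a mapping of language codes to language names.

```python
def resolve_language(code, supported, default="en"):
    """Return `code` if it is a supported language code, otherwise `default`."""
    return code if code in supported else default


def supported_languages():
    """Return the gTTS mapping of language codes to language names."""
    # Imported lazily so resolve_language can be tried without gTTS installed.
    from gtts.lang import tts_langs
    return tts_langs()


# A small hand-written mapping standing in for supported_languages():
sample = {"en": "English", "fr": "French", "ne": "Nepali"}
print(resolve_language("fr", sample))  # fr
print(resolve_language("xx", sample))  # en
```

In the Flask view, you could pass supported_languages() as the second argument and use the result as the lang value for gTTS.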

Once you click the Convert button, the page shows the result controls, where you can play the audio file, download it, or clear the result to create a new one. The picture of this interface is given below.

[Image: text to speech mp3 converter]

Live Demo

A live demo of the above application is also available.

Conclusion

You have now learned how to create a free text-to-speech converter application using Python, Flask, and gTTS. The application has one drawback: you cannot change the voice, which is female by default. It does, however, support localized accents. You can check the gTTS documentation for more information. If you like the article, do share it with your friends.