Python to Create AudioBook from PDF with Source Code

In this article, you’ll learn how to create an audiobook from a PDF using Python.

Prerequisites:

  1. Python Basics
  2. PyPDF2 module
  3. pyttsx3 module

Install Necessary Modules:

Open your command prompt (or terminal) and run the following commands one at a time:

pip install PyPDF2
pip install pyttsx3

PyPDF2 is a Python library built as a PDF toolkit. It can extract document information, split documents page by page, merge documents page by page, crop pages, merge multiple pages into a single page, encrypt and decrypt PDF files, and more.

pyttsx3 is a text-to-speech conversion library in Python. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3.
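
To give a feel for the pyttsx3 engine before the full program, here is a minimal sketch of adjusting the speaking rate and volume through `setProperty`. The helper name `configure_voice` and its default values are assumptions made for illustration; only the `rate` and `volume` property names come from pyttsx3 itself.

```python
def configure_voice(engine, rate=150, volume=0.9):
    """Set the speaking rate (words per minute) and volume (0.0 to 1.0).

    `engine` is expected to be the object returned by pyttsx3.init().
    This helper is hypothetical; it only wraps setProperty calls.
    """
    # Keep the volume inside the 0.0 to 1.0 range that pyttsx3 expects
    volume = max(0.0, min(1.0, volume))
    engine.setProperty('rate', rate)
    engine.setProperty('volume', volume)
    return engine

# Typical use (needs pyttsx3 installed and a working speech driver):
#   import pyttsx3
#   engine = configure_voice(pyttsx3.init(), rate=140)
#   engine.say('Testing the voice settings')
#   engine.runAndWait()
```

A slower rate than the default is often easier to follow for long passages such as book pages, but the right value depends on the installed voice.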

Once both modules are installed, we can import them in our Python code.

Source Code:

'''
Python Program to Create AudioBook from PDF
'''

# Import the necessary modules!
import pyttsx3
import PyPDF2

# Open our file in binary read mode ('rb'); the with block closes it when we are done
with open('demo.pdf', 'rb') as book:

    # Create a PdfReader for book and store it into pdf_reader
    # (the older PdfFileReader name no longer works in PyPDF2 3.0 and later)
    pdf_reader = PyPDF2.PdfReader(book)

    # Calculate the number of pages in our pdf from the pages list
    num_pages = len(pdf_reader.pages)

    # Initialize pyttsx3 using the init method and print a status message
    play = pyttsx3.init()
    print('Playing Audio Book')

    # Run a loop for the number of pages in our pdf file.
    # A page will get retrieved at each iteration.
    for num in range(num_pages):
        page = pdf_reader.pages[num]
        # Extract the text from our page using extract_text and store it into data.
        data = page.extract_text()

        # Queue the text with say, then call runAndWait to speak it.
        play.say(data)
        play.runAndWait()

Output:

Playing Audio Book
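
Text extracted from a PDF often contains hard line breaks, and some pages (scanned images, for example) yield no text at all. The following is a minimal sketch, not part of the original program, of tidying each page's text and skipping empty pages before speaking. The helper names `clean_page_text` and `speak_pages` are hypothetical, and the hyphen rule is a rough heuristic that would also join genuinely hyphenated words split across lines.

```python
import re


def clean_page_text(raw):
    """Collapse line breaks and repeated spaces so the speech flows naturally."""
    if not raw:
        return ""
    # Rejoin words hyphenated across a line break, e.g. "audio-\nbook"
    joined = re.sub(r"-\s*\n\s*", "", raw)
    # Replace any run of whitespace (including newlines) with one space
    return re.sub(r"\s+", " ", joined).strip()


def speak_pages(page_texts, engine):
    """Speak each non-empty page and return how many pages were spoken.

    `engine` is any object with say() and runAndWait(), such as the
    engine returned by pyttsx3.init().
    """
    spoken = 0
    for raw in page_texts:
        text = clean_page_text(raw)
        if not text:
            continue  # nothing to read on this page
        engine.say(text)
        engine.runAndWait()
        spoken += 1
    return spoken
```

To use it with the program above, pass it the text extracted from each page together with the initialized engine (`play`).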
