In this article, you’ll learn how to create an audiobook from a PDF using Python.
Install Necessary Modules:
pip install PyPDF2
pip install pyttsx3
PyPDF2 is a Python library built as a PDF toolkit. It can extract document information, split and merge documents page by page, crop pages, merge multiple pages into a single page, encrypt and decrypt PDF files, and more.
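To make the page-by-page access concrete, here is a minimal sketch of pulling readable text out of each page. The helper names `clean_text` and `iter_page_text` are hypothetical and not part of PyPDF2; the sketch assumes a reader object that exposes `pages` and per-page `extract_text()`, as `PyPDF2.PdfReader` does in recent releases. The hyphen rejoining is a simplification and will also merge genuine hyphenated compounds that happen to break across lines.

```python
import re


def clean_text(raw):
    """Tidy text extracted from a PDF page so it reads more naturally aloud."""
    # Rejoin words hyphenated across a line break: "exam-\nple" -> "example".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse remaining line breaks and repeated whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def iter_page_text(reader, start=0, stop=None):
    """Yield cleaned, non-empty text for pages start..stop-1 of a PdfReader-like object."""
    total = len(reader.pages)
    end = total if stop is None else min(stop, total)
    for index in range(start, end):
        # extract_text() may return an empty string for image-only pages.
        text = clean_text(reader.pages[index].extract_text() or "")
        if text:
            yield text
```

With PyPDF2 installed, something like `for text in iter_page_text(PyPDF2.PdfReader(open_file)): ...` would walk the document one page at a time, skipping pages that contain no extractable text.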
pyttsx3 is a text-to-speech conversion library in Python. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3.
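The default voice can sound rushed for long passages, so it may help to adjust the engine before narrating. The sketch below uses pyttsx3's `setProperty` with the `rate` (words per minute) and `volume` (0.0 to 1.0) properties; the helper names `configure_engine` and `make_narrator` and the default values are assumptions chosen for illustration.

```python
def configure_engine(engine, rate=150, volume=0.9):
    """Set speaking rate and volume on a pyttsx3-style engine and return it."""
    engine.setProperty("rate", rate)
    # Clamp volume into the 0.0 to 1.0 range that pyttsx3 expects.
    engine.setProperty("volume", max(0.0, min(1.0, volume)))
    return engine


def make_narrator(rate=150, volume=0.9):
    """Create a real pyttsx3 engine with the settings applied."""
    # Imported lazily so configure_engine can be used without pyttsx3 installed.
    import pyttsx3
    return configure_engine(pyttsx3.init(), rate, volume)
```

Calling `make_narrator(rate=130)` would return an engine that speaks a little slower than the assumed default of 150 words per minute.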
Once installed, we can import both modules in our Python code.
'''
Python Program to Create AudioBook from PDF
'''

# Import the necessary modules
import pyttsx3
import PyPDF2

# Open the PDF in binary read mode ('rb'); the with-block closes it when done
with open('demo.pdf', 'rb') as book:
    # Create a PdfReader for the open file
    # (PdfReader replaces PdfFileReader, which PyPDF2 3.x no longer supports)
    pdf_reader = PyPDF2.PdfReader(book)

    # Count the pages in the PDF
    num_pages = len(pdf_reader.pages)

    # Initialize the pyttsx3 engine
    play = pyttsx3.init()
    print('Playing Audio Book')

    # Loop over the pages, retrieving one page per iteration
    for num in range(num_pages):
        page = pdf_reader.pages[num]
        # Extract the text from the page and store it in data
        data = page.extract_text()
        # Queue the text to be spoken (skip pages with no extractable text)
        if data:
            play.say(data)

    # Speak everything that was queued
    play.runAndWait()
Output:
Playing Audio Book
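The program above only plays the speech aloud. To keep the result as a file, pyttsx3 also offers `save_to_file`, which queues the text for writing to disk instead of playback. The following is a minimal sketch; the helper name `save_audiobook` and the default file name are assumptions, and the audio format actually written depends on the speech driver of your platform.

```python
def save_audiobook(engine, page_texts, out_path="audiobook.wav"):
    """Write the narration of all page texts to out_path using a pyttsx3-style engine."""
    # Join the non-empty page texts into a single utterance.
    text = "\n".join(t for t in page_texts if t and t.strip())
    if not text:
        raise ValueError("no text to narrate")
    engine.save_to_file(text, out_path)
    # Blocks until the queued save command has been processed.
    engine.runAndWait()
    return out_path
```

With a real engine from `pyttsx3.init()` and the text of each page collected in a list, `save_audiobook(engine, texts)` would produce one audio file for the whole document.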