Online PDF to Text and Audio Translator with Python Project

Introduction

In a world brimming with digital content, the need for versatile reading and translation tools has never been greater. The Online PDF to Text and Language Translator is a Python-powered tool that extracts the text of a PDF document, translates it, and converts the result to audio, with support for English, Hindi, Marathi, and other languages.

How It Works

  • PDF to Text Conversion: Using the PyPDF2 library, the tool reads the uploaded PDF page by page and extracts its content as plain, editable text.
  • Text to Audio Translation: The googletrans library translates the extracted text into the selected language, and Google’s Text-to-Speech library (gTTS) then converts the translated text into audio in that language.
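
The extraction step can be sketched as a small helper that is independent of any particular PDF library. This is a minimal illustration, not the project's actual code: the name `collect_text` and the `PageLike` protocol are hypothetical, and the assumption is that each page object exposes an `extract_text()` method (as PyPDF2 page objects do) that may yield nothing for image-only pages.

```python
from typing import Iterable, Optional, Protocol


class PageLike(Protocol):
    """Anything with an extract_text() method, such as a PyPDF2 page object."""

    def extract_text(self) -> Optional[str]: ...


def collect_text(pages: Iterable[PageLike]) -> str:
    """Join the text of every page, skipping pages that yield no text."""
    parts = []
    for page in pages:
        # Assumption: image-only (scanned) pages may yield "" or None.
        page_text = (page.extract_text() or "").strip()
        if page_text:
            parts.append(page_text)
    # Separate pages with a blank line so paragraphs do not run together.
    return "\n\n".join(parts)
```

With PyPDF2 installed, the helper could be called as `collect_text(PdfReader("input.pdf").pages)`, where `input.pdf` is a placeholder file name. An empty result is a hint that the PDF contains scanned images rather than text.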

Key Features

  • Multilingual Support: Offers translation and vocalization in several languages including English, Hindi, Marathi, Gujarati, and more.
  • Accessibility: Enhances reading experience for visually impaired individuals and those with learning disabilities.
  • User-Friendly Interface: Simple login and upload system, making it easy for anyone to convert and translate PDFs.
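
Because the language choice arrives from a web form, it is worth validating before it is passed on to the translation and speech steps. The sketch below is one possible approach, not part of the original project: `SUPPORTED_LANGUAGES` and `resolve_language` are hypothetical names, and the two-letter codes are assumed to be the ones accepted by both googletrans and gTTS for these languages.

```python
# Hypothetical table of offered languages: code -> display name.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "hi": "Hindi",
    "mr": "Marathi",
    "gu": "Gujarati",
}


def resolve_language(code, default="en"):
    """Return a supported language code, falling back to `default`."""
    # Tolerate missing values, stray whitespace, and upper-case input.
    normalized = (code or "").strip().lower()
    return normalized if normalized in SUPPORTED_LANGUAGES else default
```

The same table could also be used to generate the options of the language drop-down, so the form and the validation cannot drift apart.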

Advantages

  • Efficiency: Quickly converts and translates large volumes of text, saving time and effort.
  • Inclusivity: Supports multiple languages, making it accessible to a global audience.
  • Versatility: Useful for various purposes, from educational learning to leisure reading.

Limitations

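  • Accuracy: The quality of the output varies with the complexity of the PDF’s language and layout. Scanned or image-only PDFs may yield little or no extractable text.
  • Dependency: Relies on the accuracy of the PDF content and on the quality and availability of the online translation and text-to-speech services that googletrans and gTTS call, so an internet connection is required.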
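
One practical consequence of depending on online services is that a very long document may be too large to send in a single request. A common workaround is to split the text into smaller pieces and process them one at a time. The helper below is a minimal sketch under stated assumptions: `split_into_chunks` is a hypothetical name, and the default of 4500 characters is an illustrative value, not a documented limit of any service.

```python
def split_into_chunks(text, max_chars=4500):
    """Split text into pieces of at most max_chars, breaking at whitespace."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be cut into pieces.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding the word would overflow, so start a new chunk with it.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be translated separately and the results joined before the audio step. Note that this sketch collapses runs of whitespace, including line breaks, into single spaces, which is usually acceptable for text that will be read aloud.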

Conclusion

The Online PDF to Text and Language Translator is more than just a tool; it’s a bridge connecting different languages and cultures. Whether for personal, educational, or professional use, it offers an efficient way to convert and understand content across the globe, breaking down language barriers one PDF at a time.

Sample Code

Setup and Requirements:

  • Python: Programming language.
  • Flask: Python web framework.
  • PyPDF2: Library for PDF manipulation.
  • googletrans: Library for translation.
  • gTTS (Google Text-to-Speech): A Python library and CLI tool to interface with Google Translate’s text-to-speech API.

pip install Flask PyPDF2 googletrans==4.0.0-rc1 gTTS

Flask Application (app.py):

from flask import Flask, request, render_template, send_file
from PyPDF2 import PdfReader
from googletrans import Translator
from gtts import gTTS
import io

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        # Check that the form actually included a file field
        if 'file' not in request.files:
            return 'No file part', 400
        file = request.files['file']
        # If the user does not select a file, the browser submits an empty file without a filename.
        if file.filename == '':
            return 'No selected file', 400

        # Read PDF content; a page without a text layer may yield nothing
        reader = PdfReader(file.stream)
        text = ''
        for page in reader.pages:
            text += page.extract_text() or ''
        if not text.strip():
            return 'No extractable text found in the PDF', 400

        # Translate text (defaults to English if no language was submitted)
        target_language = request.form.get('language', 'en')  # e.g., 'hi' for Hindi
        translator = Translator()
        translated = translator.translate(text, dest=target_language)

        # Convert to audio, kept in memory instead of being written to disk
        tts = gTTS(translated.text, lang=target_language)
        mp3_fp = io.BytesIO()
        tts.write_to_fp(mp3_fp)
        mp3_fp.seek(0)

        return send_file(
            mp3_fp,
            as_attachment=True,
            download_name='translated_audio.mp3',  # older Flask releases used 'attachment_filename'
            mimetype='audio/mpeg'
        )

    return render_template('index.html')

if __name__ == '__main__':
    # debug=True is intended for local development only
    app.run(debug=True)

HTML Template (templates/index.html):

<!doctype html>
<html>
<head>
    <title>PDF to Text and Language Translator</title>
</head>
<body>
    <h2>Upload PDF to Convert and Translate</h2>
    <form method="post" action="/" enctype="multipart/form-data">
        <input type="file" name="file" accept="application/pdf" required>
        <select name="language">
            <option value="en">English</option>
            <option value="hi">Hindi</option>
            <option value="mr">Marathi</option>
            <!-- Add other languages as needed -->
        </select>
        <input type="submit" value="Upload">
    </form>
</body>
</html>