TTS-API

Overview

This project provides a FastAPI-based backend for performing the following tasks:

- OCR (Optical Character Recognition): Extract Arabic text from images.
- Text-to-Speech (TTS): Convert text to speech and return an audio file.

Endpoints

1. `/ocr` - Extract Arabic Text from Images

Method: POST
Description: Upload an image to extract Arabic text using OCR.
Request:
- file (form-data): The image file.
- lang (form-data, optional): Language code (default: ara).
Response:
- Extracted text in JSON format.
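
As an illustration of the request described above, here is a minimal client sketch that uses only the Python standard library. The base URL assumes a local server on uvicorn's default port, and the helper names `build_ocr_request` and `extract_text` are hypothetical. The exact keys of the JSON response are not documented here, so the sketch returns the parsed JSON unchanged.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://localhost:8000"  # assumption: local uvicorn default address


def build_ocr_request(image_name, image_bytes, lang="ara"):
    """Build a multipart/form-data POST carrying the `file` and `lang` fields."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(image_name)[0] or "application/octet-stream"
    lang_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="lang"\r\n\r\n'
        f"{lang}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{image_name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = lang_part + file_header + image_bytes + closing
    return urllib.request.Request(
        f"{BASE_URL}/ocr",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def extract_text(image_path, lang="ara"):
    """Send the image to /ocr and return the parsed JSON response."""
    path = Path(image_path)
    request = build_ocr_request(path.name, path.read_bytes(), lang)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `extract_text("page.png")` would return whatever JSON object the endpoint sends back.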

2. `/tts` - Convert Text to Speech

Method: POST
Description: Convert text to speech and return an audio file.
Request:
- text (form-data): Text to convert to speech.
- voice (form-data, optional): Voice name (default: Aisha).
Response:
- Audio file in MP3 format.
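
A comparable sketch for this endpoint, again with only the standard library. It assumes the server reads `text` and `voice` as ordinary form fields, which FastAPI accepts in URL-encoded form as well as multipart. The base URL, the output filename, and the helper names `build_tts_request` and `synthesize` are assumptions made for illustration.

```python
import urllib.parse
import urllib.request
from pathlib import Path

BASE_URL = "http://localhost:8000"  # assumption: local uvicorn default address


def build_tts_request(text, voice="Aisha"):
    """Build a form-encoded POST carrying the `text` and `voice` fields."""
    data = urllib.parse.urlencode({"text": text, "voice": voice}).encode("ascii")
    return urllib.request.Request(
        f"{BASE_URL}/tts",
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", voice="Aisha"):
    """Send the text to /tts and write the returned MP3 bytes to out_path."""
    with urllib.request.urlopen(build_tts_request(text, voice)) as response:
        Path(out_path).write_bytes(response.read())
    return out_path
```

With the server running, `synthesize("مرحبا")` would save the returned audio as `speech.mp3` in the current directory.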

How to Use with Frontend

OCR:
- Use a file input to upload an image.
- Send a POST request to /ocr with the image file and optional language code.
- Display the extracted text from the response.
Text-to-Speech:
- Send a POST request to /tts with the text and optional voice name.
- Play or download the returned MP3 file.

Setup

Install dependencies:
```
pip install -r requirements.txt
```
Run the FastAPI server:
```
uvicorn main:app --reload
```
Access the API documentation at http://localhost:8000/docs to test the endpoints interactively.
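
To confirm from a script that the server is up and exposes the expected routes, one option is to read the OpenAPI schema that FastAPI serves at `/openapi.json` by default. The sketch below assumes that default path has not been changed; the helper names `fetch_schema` and `list_routes` are hypothetical.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def fetch_schema(base_url="http://localhost:8000"):
    """Download the OpenAPI schema from a running server (assumed default path)."""
    with urllib.request.urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)


def list_routes(schema):
    """Return sorted 'METHOD /path' strings for every operation in the schema."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                routes.append(f"{method.upper()} {path}")
    return sorted(routes)
```

With the server running, `list_routes(fetch_schema())` should include the `/ocr` and `/tts` POST routes described above.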

Prerequisites

Install CUDA

Ensure that CUDA is installed on your computer to enable GPU acceleration for supported libraries.

- Download CUDA from the NVIDIA CUDA Toolkit website.
- Follow the installation instructions for your operating system.
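
After installing, a quick sanity check is to see whether the usual CUDA command-line tools are on the `PATH`. This is only a rough sketch: finding `nvidia-smi` (shipped with the driver) and `nvcc` (shipped with the toolkit) suggests an installation exists, but it does not prove that the libraries this project uses can reach the GPU. The helper name `find_cuda_tools` is hypothetical.

```python
import shutil


def find_cuda_tools():
    """Map each common CUDA tool name to its path on PATH, or None if missing."""
    return {tool: shutil.which(tool) for tool in ("nvidia-smi", "nvcc")}


def report(status):
    """Format one line per tool describing where it was found."""
    return [
        f"{tool}: {path if path else 'not found on PATH'}"
        for tool, path in status.items()
    ]
```

Calling `print("\n".join(report(find_cuda_tools())))` prints one line per tool.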
