Transcription & Document Parsing, Simplified.

A FastAPI service that handles speech-to-text, document parsing, and OCR in one place. Upload any supported file and get the extracted text back in the response.

Core Features

Everything you need to process content from any source.

🎙️

Audio/Video Transcription

High-quality speech-to-text using the blazing-fast Faster-Whisper engine.

📑

Document Parsing

Extracts text from `.pdf`, `.docx`, `.xlsx`, `.csv`, and `.txt` files seamlessly.

🖼️

Image OCR

Reads text from scanned images (`.png`, `.jpg`, `.jpeg`) using Tesseract.

Supported File Types

A wide range of formats is supported out of the box.

Category	Formats
Audio / Video	`.mp3`, `.wav`, `.mp4`, `.m4a`
Documents	`.pdf`, `.docx`, `.xlsx`, `.csv`, `.txt`
Images (OCR)	`.png`, `.jpg`, `.jpeg`
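
The table above can be mirrored on the client side to check a file before uploading it. The following is a minimal sketch, not the service's own routing logic: the `SUPPORTED` mapping, the category names, and the `classify` helper are all hypothetical and only restate the extensions listed in the table.

```python
from pathlib import Path
from typing import Optional

# Extension groups copied from the table above.
# The category names are illustrative, not identifiers used by the service.
SUPPORTED = {
    "audio_video": {".mp3", ".wav", ".mp4", ".m4a"},
    "document": {".pdf", ".docx", ".xlsx", ".csv", ".txt"},
    "image": {".png", ".jpg", ".jpeg"},
}


def classify(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if its extension is not listed."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None
```

Calling `classify` before an upload lets a client skip files whose extension is not in the table, without a round trip to the server.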

Deploy Your Instance

Get the API running in minutes with Docker. Choose your preferred method below.

Docker (Local Build)

Follow these steps to build and run the Docker container on your local machine. This is ideal for development and testing.

# 1. Clone the repository
git clone https://github.com/shahzaibtkturners/transcription-api.git
cd transcription-api

# 2. Build the Docker image
docker build -t transcription-api .

# 3. Run the container
docker run -d -p 8000:8000 --name transcription-service transcription-api

Docker Compose / Portainer

Use this `docker-compose.yml` configuration to deploy with Docker Compose or as a stack in Portainer. It builds directly from GitHub.

version: '3.8'

services:
  transcription-api:
    build:
      context: https://github.com/shahzaibtkturners/transcription-api.git
      dockerfile: Dockerfile
    container_name: transcription-api
    ports:
      - "8000:8000"
    restart: unless-stopped

API Usage

Once running, send a POST request to the `/upload/` endpoint.

Example Request

Use `curl` or any HTTP client to send a `multipart/form-data` request with your file.

curl -X POST "http://localhost:8000/upload/" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/your/sample.pdf"
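
The same request can be sent from Python. This is a minimal sketch using only the standard library, assuming the service runs at `http://localhost:8000` and expects the form field `file`, as in the `curl` example. The helper names `build_multipart` and `upload` are hypothetical.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns the body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    part_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {part_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload(path: str, url: str = "http://localhost:8000/upload/") -> dict:
    """POST a local file to the /upload/ endpoint and return the decoded JSON."""
    file_path = Path(path)
    body, content_type = build_multipart("file", file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `upload("/path/to/your/sample.pdf")` sends the same file as the `curl` command. A third-party HTTP client would shorten this considerably; the standard library is used here only to avoid extra dependencies.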

Example Response

You'll receive a JSON response containing the extracted text along with the file's name and content type.

{
  "filename": "sample.pdf",
  "content_type": "application/pdf",
  "text": "This is the extracted text content..."
}
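
On the client side, the response can be loaded into a small typed structure. The sketch below assumes the response has exactly the three fields shown in the example. `UploadResult` and `parse_upload_response` are hypothetical names, not part of the service.

```python
import json
from dataclasses import dataclass


@dataclass
class UploadResult:
    filename: str
    content_type: str
    text: str


def parse_upload_response(raw: str) -> UploadResult:
    """Decode the JSON body and fail loudly if an expected field is absent."""
    payload = json.loads(raw)
    missing = [key for key in ("filename", "content_type", "text") if key not in payload]
    if missing:
        raise ValueError("response is missing fields: " + ", ".join(missing))
    return UploadResult(payload["filename"], payload["content_type"], payload["text"])
```

Checking for missing fields up front gives a clearer error than a `KeyError` raised later in the calling code.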

The interactive API documentation is also available at `/docs` on your server.
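
Because `/docs` is served by the running application, it can also work as a simple readiness probe after starting the container. The following is a minimal sketch, assuming the service listens on `localhost:8000`. The function name `wait_until_ready` and its timeout defaults are hypothetical.

```python
import time
import urllib.request


def wait_until_ready(
    url: str = "http://localhost:8000/docs",
    timeout: float = 60.0,
    interval: float = 1.0,
) -> bool:
    """Poll the URL until it answers with HTTP 200 or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers refused connections and urllib's URLError/HTTPError,
            # which are OSError subclasses; retry after a short pause.
            pass
        time.sleep(interval)
    return False
```

A deployment script could call `wait_until_ready()` after `docker run` and only start sending uploads once it returns `True`.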

Contribute & Support

This project is open-source. We welcome contributions and support from the community!

Contribute on GitHub