MultiPage Table PDF Extraction

This project provides an Azure Function App for extracting table data from multi-page PDF files. It is designed for easy deployment, testing, and extension.

Features

  • Extracts tables from multi-page PDFs
  • REST API endpoint for PDF upload and extraction
  • Includes sample PDF files and an example test client
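
A table that spans several pages is typically recovered as one fragment per page, often with the header row repeated at the top of each fragment. The sketch below shows one way such fragments could be stitched back together. It illustrates the general idea only and is not taken from the ExtractPDFDetails code; the `merge_page_tables` name and the rows-as-lists representation are assumptions made for this example.

```python
from typing import List, Optional

Row = List[str]


def merge_page_tables(pages: List[List[Row]]) -> List[Row]:
    """Concatenate per-page table fragments, keeping the header row only once.

    `pages` holds one list of rows per page. The first row of the first
    non-empty page is treated as the header (an assumption of this sketch).
    """
    merged: List[Row] = []
    header: Optional[Row] = None
    for rows in pages:
        if not rows:
            continue  # page without table rows
        if header is None:
            header = rows[0]
            merged.extend(rows)
        elif rows[0] == header:
            merged.extend(rows[1:])  # drop the repeated header
        else:
            merged.extend(rows)  # continuation page without a header
    return merged


if __name__ == "__main__":
    pages = [
        [["Item", "Qty"], ["A", "1"]],
        [["Item", "Qty"], ["B", "2"]],
        [["C", "3"]],
    ]
    for row in merge_page_tables(pages):
        print(row)
```

Comparing each page's first row against the stored header keeps the logic simple, but it will not catch headers that differ slightly between pages (for example, because of extraction noise).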

Folder Structure

  • ExtractPDFDetails/ - Azure Function code
  • test/ - Sample PDF files and test client
  • requirements.txt - Python dependencies
  • local.settings.json - Local development settings (not included in repo)
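
Because local.settings.json is not included in the repo, it has to be created before running the function locally. The sketch below writes a minimal file for a Python function app. The two keys shown are standard Azure Functions settings, but any additional settings this project may require are not documented here, so treat the contents as an assumed starting point. The helper names are hypothetical.

```python
import json
from pathlib import Path


def minimal_local_settings() -> dict:
    """Return a minimal settings document for a Python function app.

    Assumption: the project needs no settings beyond these two.
    """
    return {
        "IsEncrypted": False,
        "Values": {
            # Tells the Functions host which language worker to start.
            "FUNCTIONS_WORKER_RUNTIME": "python",
            # Points at a local storage emulator such as Azurite.
            "AzureWebJobsStorage": "UseDevelopmentStorage=true",
        },
    }


def write_local_settings(path: str = "local.settings.json") -> bool:
    """Write the file unless it already exists; return True if it was written."""
    target = Path(path)
    if target.exists():
        return False  # never overwrite existing local settings
    target.write_text(json.dumps(minimal_local_settings(), indent=2) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    print("created" if write_local_settings() else "already exists, left unchanged")
```

Run it from the repository root so the file lands next to requirements.txt.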

Getting Started

Prerequisites

  • Python 3.11+
  • Azure Functions Core Tools
  • Node.js (needed only if you install Azure Functions Core Tools through npm)
  • [Optional] VS Code with Azure Functions extension

Setup

  1. Clone the repository:
    git clone https://github.com/kathanjain/MultiPageTablePDFExraction.git
    cd MultiPageTablePDFExraction
  2. Create a virtual environment and install dependencies:
    python -m venv .venv
    .venv\Scripts\activate  # On Windows
    source .venv/bin/activate  # On macOS/Linux
    pip install -r requirements.txt
  3. Run the Azure Function locally:
    func start
  4. Test using the provided client:
    cd test
    python test_client.py
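
The contents of test_client.py are not reproduced in this README. As a rough sketch of what a call to the local endpoint might look like, the example below posts a PDF using only the standard library. Several details are assumptions: the URL combines the default `func start` port (7071) with a route guessed from the ExtractPDFDetails folder name, and the PDF is sent as the raw request body, which may differ from what the function actually expects (for example, multipart form data). The function names are hypothetical.

```python
import sys
import urllib.request

# Assumed URL: default local port plus a route guessed from the folder name.
DEFAULT_URL = "http://localhost:7071/api/ExtractPDFDetails"


def build_upload_request(pdf_bytes: bytes, url: str = DEFAULT_URL) -> urllib.request.Request:
    """Build a POST request that carries the raw PDF bytes as its body (assumed format)."""
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )


def extract_tables(pdf_path: str, url: str = DEFAULT_URL) -> str:
    """Send the PDF at pdf_path to the endpoint and return the response body as text."""
    with open(pdf_path, "rb") as f:
        request = build_upload_request(f.read(), url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Usage: python upload_sketch.py path/to/file.pdf
    print(extract_tables(sys.argv[1]))
```

Check the function's own configuration for the real route and expected request format before relying on this.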

Contributing

Pull requests are welcome! Please open an issue first to discuss major changes.

License

MIT License. See LICENSE for details.
