This project provides an Azure Function App for extracting table data from multi-page PDF files. It is designed for easy deployment, testing, and extension.
- Extracts tables from multi-page PDFs
- REST API endpoint for PDF upload and extraction
- Includes test files and client example
- ExtractPDFDetails/ - Azure Function code
- test/ - Sample PDF files and test client
- requirements.txt - Python dependencies
- local.settings.json - Local development settings (not included in repo)
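Because local.settings.json is not included in the repo, it has to be created before the function can run locally. The sketch below writes a minimal version of that file. The values are an assumption about what this project needs (a Python worker with HTTP triggers only, so no storage connection string), and `write_local_settings` is a hypothetical helper, not part of the repository.

```python
import json
from pathlib import Path

# Assumed minimal settings for a Python function app that only uses HTTP
# triggers. The project may require additional values not shown here.
MINIMAL_SETTINGS = {
    "IsEncrypted": False,
    "Values": {
        "FUNCTIONS_WORKER_RUNTIME": "python",
        "AzureWebJobsStorage": "",
    },
}


def write_local_settings(directory="."):
    """Create local.settings.json if it is absent; never overwrite an existing file."""
    path = Path(directory) / "local.settings.json"
    if not path.exists():
        path.write_text(
            json.dumps(MINIMAL_SETTINGS, indent=2) + "\n", encoding="utf-8"
        )
    return path
```

Calling `write_local_settings()` from the project root creates the file only when it does not already exist, so any settings you have customized are left alone.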
- Python 3.11+
- Azure Functions Core Tools
- Node.js (for local Azure Functions)
- [Optional] VS Code with Azure Functions extension
- Clone the repository:
git clone https://github.com/kathanjain/MultiPageTablePDFExraction.git
cd MultiPageTablePDFExraction
- Create a virtual environment and install dependencies:
python -m venv .venv
.venv\Scripts\activate  # On Windows
pip install -r requirements.txt
- Run the Azure Function locally:
func start
- Test using the provided client:
cd test
python test_client.py
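If you want to call the endpoint from your own script instead of the provided client, a minimal sketch could look like the following. It is not the repository's test_client.py. It assumes the function is served on the Core Tools default local port (7071), that the route is named after the ExtractPDFDetails/ folder, and that the endpoint accepts the raw PDF bytes as the request body. Check the function code for the actual route and upload format (for example, it may expect multipart form data). The names `build_upload_request` and `extract_tables` are hypothetical.

```python
import sys
import urllib.request
from pathlib import Path

# Assumption: default local Core Tools port and a route named after the
# ExtractPDFDetails/ folder. Adjust to match the actual function.
DEFAULT_URL = "http://localhost:7071/api/ExtractPDFDetails"


def build_upload_request(pdf_path, url=DEFAULT_URL):
    """Build a POST request carrying the raw PDF bytes as the body (assumed format)."""
    data = Path(pdf_path).read_bytes()
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/pdf"},
    )


def extract_tables(pdf_path, url=DEFAULT_URL, timeout=60):
    """Send the PDF to the endpoint and return the response body as text."""
    request = build_upload_request(pdf_path, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Requires the function to be running locally (func start).
        print(extract_tables(sys.argv[1]))
    else:
        print("usage: python client_sketch.py <file.pdf>", file=sys.stderr)
```

Only the standard library is used, so the sketch runs inside the same virtual environment without extra packages. The response is returned as plain text because the endpoint's output format is not specified here.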
Pull requests are welcome! Please open an issue first to discuss major changes.
MIT License. See LICENSE for details.