node-ts-ocr

A simple wrapper around command-line utils to assist in PDF / Image OCR (Optical Character Recognition) processing using Tesseract.

read-pdf2llm

High-performance PDF text extractor (with OCR fallback) for Node.js, optimized for LLM pipelines. Uses PDFium, Tesseract, and C++ addon.