unocr
Unified OCR library with multi-driver support for Tesseract.js and AI models, providing structured text extraction in a hast-based output format
hocr-dom
Extend HTMLDocument and Element with hOCR query and property helpers
mimeograph
CoffeeScript library for PDF OCR and text extraction