pdfreader

Read text and parse tables from PDF files. Supports tabular data with automatic column detection and rule-based parsing.

officeparser

A Node.js library to parse text out of office files. Currently supports docx, pptx, xlsx, odt, odp, ods, and pdf files.

pdf-data-parser

Parse, search and stream PDF tabular data using Node.js with Mozilla's PDF.js library.