pdf-parse
Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs. Run directly in your browser or in Node!
officeparser
A Node.js library to parse text out of office files. Currently supports docx, pptx, xlsx, odt, odp, ods, and pdf files.
pdfreader
Read text and parse tables from PDF files. Supports tabular data with automatic column detection and rule-based parsing.
pdf-data-parser
Parse, search and stream PDF tabular data using Node.js with Mozilla's PDF.js library.