pdf3json
A PDF file parser that converts PDF binaries to text-based JSON, powered by a fork of PDF.js ported to Node.js
n8n-nodes-pdf-parse
n8n community node for parsing PDF files to text, with advanced configuration options
officeparser
A Node.js library to parse text out of office files. Currently supports docx, pptx, xlsx, odt, odp, ods, and pdf files.
file-type
Detect the file type of a file, stream, or data