tika

Apache Tika bridge: text extraction, metadata extraction, MIME type detection, and language detection.

@conscia/tika

Apache Tika bridge: text extraction, metadata extraction, MIME type detection, and language detection.

textract

Extracts text from files of various types, including html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf, text/*, and various OpenOffice formats.