wink-tokenizer
Multilingual tokenizer that automatically tags each token with its type
kingchop
A text tokenizing library that splits strings into arrays of units chosen by the intended format, such as paragraphs, sentences, sub-sentences, or words.
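
The two ideas described above, tagging each token with its type and splitting text at different granularities, can be sketched in a few lines. This is an illustrative sketch only and does not use either package's API: the pattern table, the tag names, and the functions tag_tokens, split_paragraphs, and split_sentences are all hypothetical, and the sentence splitter is a naive punctuation-based rule.

```python
import re

# Hypothetical pattern table (an assumption for illustration, not taken from
# wink-tokenizer or kingchop). Order matters: earlier patterns win.
TOKEN_PATTERNS = [
    ("email", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    ("number", r"\d+(?:\.\d+)?"),
    ("word", r"[^\W\d_]+(?:'[^\W\d_]+)?"),
    ("punctuation", r"[^\w\s]"),
]

# One alternation of named groups; the name of the group that matched
# becomes the token's tag.
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def tag_tokens(text):
    """Return a list of (token, tag) pairs; whitespace is skipped."""
    return [(m.group(), m.lastgroup) for m in TOKEN_RE.finditer(text)]


def split_paragraphs(text):
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(text):
    """Naive split: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


if __name__ == "__main__":
    print(tag_tokens("Mail bob@example.com 2 times!"))
    print(split_sentences("Hi there. How are you? Fine!"))
```

A real tokenizer needs to handle abbreviations, URLs, emoji, and language-specific rules, which this sketch ignores.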