unicode-byte-truncate
Unicode-aware string truncation: given a maximum byte size, truncates the string to that size or just below it
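The idea of byte-limited truncation can be sketched as follows. This is a minimal illustration, not the package's actual API: the names `utf8Length` and `truncateToBytes` are hypothetical, and the sketch assumes the byte size is measured in UTF-8. It keeps whole code points (including surrogate pairs) intact but makes no attempt to keep combining marks attached to their base character.

```typescript
// Hypothetical helpers for illustration; not the package's actual API.

// Number of UTF-8 bytes needed to encode one code point.
function utf8Length(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Returns the longest prefix of `input` whose UTF-8 encoding fits in
// `maxBytes`, cutting only on code point boundaries.
function truncateToBytes(input: string, maxBytes: number): string {
  let bytes = 0;
  let end = 0; // cut position, in UTF-16 code units
  while (end < input.length) {
    const hi = input.charCodeAt(end);
    let units = 1;
    let size = utf8Length(hi);
    // A high surrogate followed by a low surrogate forms one code point,
    // which takes 4 bytes in UTF-8.
    if (hi >= 0xd800 && hi <= 0xdbff && end + 1 < input.length) {
      const lo = input.charCodeAt(end + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        units = 2;
        size = 4;
      }
    }
    if (bytes + size > maxBytes) break; // the next code point would not fit
    bytes += size;
    end += units;
  }
  return input.slice(0, end);
}
```

Because a multi-byte character is either kept whole or dropped, the result can come out shorter than the limit, which matches the "to or just below that size" behaviour described above.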
code-points
Get the code point of each character in a string
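A rough sketch of what extracting code points involves is shown below. The function name `toCodePoints` is hypothetical and is not taken from the package; the sketch only illustrates how a surrogate pair in a UTF-16 string is combined into a single code point.

```typescript
// Hypothetical helper for illustration; not the package's actual API.
function toCodePoints(input: string): number[] {
  const result: number[] = [];
  for (let i = 0; i < input.length; i++) {
    const hi = input.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < input.length) {
      const lo = input.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        // Combine the surrogate pair into one code point above U+FFFF.
        result.push((hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000);
        i++; // skip the low surrogate that was just consumed
        continue;
      }
    }
    // Ordinary code unit (or an unpaired surrogate, passed through as-is).
    result.push(hi);
  }
  return result;
}
```

For example, `toCodePoints("a\uD83D\uDE00")` yields two entries, 97 and 128512 (U+1F600), even though the string has a `length` of 3.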
multichar-regex
A regular expression that matches all surrogate pairs and all characters with combining marks in a string
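A simplified pattern in this spirit might look like the sketch below. It is an assumption for illustration only: the package's actual expression may differ, the names `multiCharPattern` and `findMultiChars` are hypothetical, and the combining-mark range is deliberately limited to the Combining Diacritical Marks block (U+0300 to U+036F).

```typescript
// Hypothetical pattern for illustration; the package's real expression may differ.
// First alternative: a high surrogate followed by a low surrogate.
// Second alternative: a base code unit (neither a surrogate nor a mark)
// followed by one or more marks from U+0300 to U+036F.
const multiCharPattern: RegExp =
  /[\uD800-\uDBFF][\uDC00-\uDFFF]|[^\u0300-\u036F\uD800-\uDFFF][\u0300-\u036F]+/g;

// Returns every multi-code-unit "character" found in the input.
function findMultiChars(input: string): string[] {
  const matches = input.match(multiCharPattern);
  return matches === null ? [] : matches.slice();
}
```

Applied to `"man\u0303ana \uD83D\uDE00"`, the sketch returns the decomposed "ñ" (an `n` plus U+0303) and the emoji surrogate pair; a precomposed `"\u00f1"` is a single code unit and is not matched.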