multichar-regex
A regular expression that matches all surrogate pairs and all characters carrying combining marks in a string
unicode-byte-truncate
Unicode-aware string truncation: given a maximum byte size, truncates the string to that size or just below it
code-point
Get the code point number of a character from its UTF-16 encoding
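
The three descriptions above can be illustrated with a minimal sketch in plain JavaScript. It does not use or reproduce the listed packages' APIs: the names `multiCharPattern`, `truncateToBytes`, and `codePointOf` are hypothetical. It also assumes UTF-8 as the byte encoding and a runtime that provides `TextEncoder` and Unicode property escapes in regular expressions.

```javascript
// Hypothetical names for illustration; not the API of any package listed above.

// Matches either a base character followed by one or more combining marks,
// or a single character outside the Basic Multilingual Plane (stored in
// UTF-16 as a surrogate pair). The `u` flag makes the regex work on code points.
const multiCharPattern = /\P{M}\p{M}+|[\u{10000}-\u{10FFFF}]/gu;

// Truncates `text` so its UTF-8 encoding fits in `maxBytes` (assumption: UTF-8).
// Iterating with for...of walks code points, so a surrogate pair is never split.
// Note: this sketch can still separate a base character from its combining marks.
function truncateToBytes(text, maxBytes) {
  const encoder = new TextEncoder();
  let result = "";
  let used = 0;
  for (const ch of text) {
    const size = encoder.encode(ch).length;
    if (used + size > maxBytes) break;
    result += ch;
    used += size;
  }
  return result;
}

// Returns the code point of the first character, combining a surrogate pair
// into a single number when one is present.
function codePointOf(ch) {
  return ch.codePointAt(0);
}

"a😀e\u0301".match(multiCharPattern); // ["😀", "e\u0301"]
truncateToBytes("a😀b", 4);           // "a"  (the emoji needs 4 more bytes)
truncateToBytes("a😀b", 5);           // "a😀"
codePointOf("😀");                    // 128512 (U+1F600)
```

Walking the string by code point keeps the truncation loop simple, at the cost of encoding each character separately. A production implementation would likely measure byte lengths arithmetically instead.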