I've been mulling over the problem that the host language might provide UTF-16 strings (Java or JavaScript) or UTF-8 strings (C), but you sometimes want to write code that runs efficiently in both environments.
I think that since ASCII is a subset of both encodings, and many (most?) operations only care about the ASCII characters of a string, an API that handles ASCII specially could be implemented efficiently in both languages. For example:
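Here's a minimal sketch of the kind of API I have in mind, in Java; the interface and method names are made up purely to illustrate the shape, not taken from any existing library. Every operation only compares against ASCII characters (values below 0x80), and an ASCII value never appears inside a multi-unit sequence in either encoding, so each method can be a straight scan over UTF-8 bytes or UTF-16 chars with no decoding.

```java
import java.util.List;

// Hypothetical interface -- the names are illustrative, not from any real library.
// Each method only inspects ASCII code units, so it maps directly onto a
// code-unit scan over either UTF-8 bytes or UTF-16 chars.
interface AsciiStringOps {
    // Index of the next occurrence of an ASCII character, or -1 if absent.
    int indexOfAscii(char asciiChar, int fromIndex);

    // True if the string starts with the given ASCII-only prefix.
    boolean startsWithAscii(String asciiPrefix);

    // Split on a single ASCII delimiter such as ',' or '/'.
    List<String> splitOnAscii(char asciiDelimiter);

    // Case-insensitive comparison that only folds A-Z / a-z,
    // leaving all non-ASCII content untouched.
    boolean equalsAsciiIgnoreCase(String other);
}
```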
Code that does care about Unicode will inevitably want APIs that operate on code points, which requires (inefficient) decoding regardless of whether the underlying string is UTF-8 or UTF-16.
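To make that cost concrete, here is roughly what code-point access looks like over each representation. This is a sketch with no validation of malformed input; the point is just that neither case is a simple array index.

```java
// Decoding one code point from UTF-16 code units (what String#codePointAt does in Java).
static int codePointAtUtf16(char[] units, int i) {
    char hi = units[i];
    if (Character.isHighSurrogate(hi) && i + 1 < units.length
            && Character.isLowSurrogate(units[i + 1])) {
        return Character.toCodePoint(hi, units[i + 1]); // two 16-bit units
    }
    return hi;                                          // one 16-bit unit
}

// Decoding one code point from UTF-8 bytes (no error handling for bad sequences).
static int codePointAtUtf8(byte[] units, int i) {
    int b = units[i] & 0xFF;
    if (b < 0x80) return b;                             // 1 byte: ASCII
    if (b < 0xE0) return ((b & 0x1F) << 6)              // 2 bytes
            | (units[i + 1] & 0x3F);
    if (b < 0xF0) return ((b & 0x0F) << 12)             // 3 bytes
            | ((units[i + 1] & 0x3F) << 6)
            | (units[i + 2] & 0x3F);
    return ((b & 0x07) << 18)                           // 4 bytes
            | ((units[i + 1] & 0x3F) << 12)
            | ((units[i + 2] & 0x3F) << 6)
            | (units[i + 3] & 0x3F);
}
```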
I think the key thing here is to provide operations for as many of the things people want to do as possible, while avoiding having them think about or depend on the underlying encoding. Operations like length() and indexOf() are troublesome because they are ambiguous about whether they refer to (8- or 16-bit) code units or (32-bit) code points. So providing useful operations that mostly eliminate the need to ask about the length is good. Like, we never want people doing a loop using a length and an index...
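Something like the hypothetical sketch below (names are mine, just to show the shape): no method ever hands a length or an integer index back to the caller, so callers can't accidentally write code-unit arithmetic.

```java
import java.util.Optional;
import java.util.PrimitiveIterator;

// Hypothetical encoding-agnostic string type; the names are illustrative only.
interface Str {
    // Lazily decodes code points one at a time, whatever the underlying encoding.
    PrimitiveIterator.OfInt codePoints();

    boolean isEmpty();
    boolean startsWith(Str prefix);

    // The part after the first occurrence of an ASCII delimiter, or empty if absent;
    // replaces the usual indexOf() + substring(index + 1) dance.
    Optional<Str> after(char asciiDelimiter);

    // The part before the first occurrence of an ASCII delimiter.
    Optional<Str> before(char asciiDelimiter);
}

// The pattern we want to make unnecessary, because "length" and "i" silently
// count code units (bytes or chars, depending on the host):
//
//   for (int i = 0; i < s.length(); i++) { doSomethingWith(s.charAt(i)); }
```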