Unstandard

Additions to std.utf.

License
Boost License 1.0.
Authors
Denis Shelomovskij

pure nothrow @safe bool  isContinuationByte(in char c);

Detect whether c is a UTF-8 continuation byte.
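
As a rough illustration of what such a check involves (not the library's source), a UTF-8 continuation byte is any byte whose two top bits are `10`. The name `isContinuationByteSketch` below is hypothetical.

```d
/// Hypothetical stand-in for illustration only, not the library's code.
bool isContinuationByteSketch(in char c) pure nothrow @safe @nogc
{
    // Continuation bytes have the bit pattern 10xxxxxx.
    return (c & 0xC0) == 0x80;
}

unittest
{
    // "Я" (U+042F) is encoded in UTF-8 as the two bytes 0xD0 0xAF.
    string s = "Я";
    assert(s.length == 2);
    assert(!isContinuationByteSketch(s[0])); // lead byte
    assert(isContinuationByteSketch(s[1]));  // continuation byte
    assert(!isContinuationByteSketch('a'));  // ASCII is never a continuation
}
```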


pure nothrow @safe bool  isLeadSurrogate(in wchar c);
pure nothrow @safe bool  isTrailSurrogate(in wchar c);
pure nothrow @safe bool  isValidBMPCharacter(in wchar c);

Detect whether c is a UTF-16 lead surrogate, a UTF-16 trail surrogate, or not a surrogate at all, respectively.
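
The three checks correspond to fixed UTF-16 code unit ranges. The sketch below uses hypothetical `...Sketch` names and assumes that "valid BMP character" means nothing more than "not a surrogate"; the library may apply additional rules.

```d
// Hypothetical stand-ins for illustration only, not the library's code.
bool isLeadSurrogateSketch(in wchar c) pure nothrow @safe @nogc
{
    return c >= 0xD800 && c <= 0xDBFF; // lead (high) surrogate range
}

bool isTrailSurrogateSketch(in wchar c) pure nothrow @safe @nogc
{
    return c >= 0xDC00 && c <= 0xDFFF; // trail (low) surrogate range
}

bool isValidBMPCharacterSketch(in wchar c) pure nothrow @safe @nogc
{
    // Assumption: "valid" here only means "outside the surrogate range".
    return c < 0xD800 || c > 0xDFFF;
}

unittest
{
    // U+1F600 is encoded in UTF-16 as the surrogate pair 0xD83D 0xDE00.
    wstring w = "\U0001F600"w;
    assert(w.length == 2);
    assert(isLeadSurrogateSketch(w[0]));
    assert(isTrailSurrogateSketch(w[1]));
    assert(isValidBMPCharacterSketch('Я'));
}
```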


pure nothrow bool  isSequenceStart(C)(in C c) if (isSomeChar!C);

Detect whether c is the first code unit in a sequence.
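
One plausible way to generalize the check over all three character types is sketched below: a `char` starts a sequence unless it is a continuation byte, a `wchar` unless it is a trail surrogate, and a `dchar` always does. The name `isSequenceStartSketch` is hypothetical and the actual implementation may differ.

```d
import std.traits : isSomeChar, Unqual;

// Hypothetical stand-in for illustration only, not the library's code.
bool isSequenceStartSketch(C)(in C c) pure nothrow @safe @nogc
if (isSomeChar!C)
{
    static if (is(Unqual!C == char))
        return (c & 0xC0) != 0x80;       // not a UTF-8 continuation byte
    else static if (is(Unqual!C == wchar))
        return c < 0xDC00 || c > 0xDFFF; // not a UTF-16 trail surrogate
    else
        return true;                     // every UTF-32 code unit stands alone
}

unittest
{
    string s = "Я";              // UTF-8: 0xD0 0xAF
    wstring w = "\U0001F600"w;   // UTF-16: 0xD83D 0xDE00
    assert(isSequenceStartSketch(s[0]) && !isSequenceStartSketch(s[1]));
    assert(isSequenceStartSketch(w[0]) && !isSequenceStartSketch(w[1]));
    assert(isSequenceStartSketch('\U0001F600')); // a dchar literal
}
```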


pure nothrow size_t  adjustBack(C)(in C[] str, size_t idx) if (isSomeChar!C);
pure nothrow size_t  adjustForward(C)(in C[] str, size_t idx);

Adjust idx to point at the start of a UTF sequence or at the end of str.
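
A minimal UTF-8-only sketch of the idea follows. It assumes, based on the names alone, that the "back" variant moves idx towards the start of str and the "forward" variant towards its end, and that idx <= str.length. The `...Sketch` names are hypothetical.

```d
// Hypothetical stand-ins for illustration only, not the library's code.
size_t adjustBackSketch(in char[] str, size_t idx) pure nothrow @safe @nogc
{
    // idx == str.length already denotes the end of str and is left as is.
    while (idx < str.length && idx > 0 && (str[idx] & 0xC0) == 0x80)
        --idx; // step back over continuation bytes
    return idx;
}

size_t adjustForwardSketch(in char[] str, size_t idx) pure nothrow @safe @nogc
{
    while (idx < str.length && (str[idx] & 0xC0) == 0x80)
        ++idx; // step forward over continuation bytes
    return idx;
}

unittest
{
    string s = "aЯb"; // bytes: 'a', 0xD0, 0xAF, 'b'
    assert(adjustBackSketch(s, 2) == 1);    // back to the lead byte of "Я"
    assert(adjustForwardSketch(s, 2) == 3); // forward to 'b'
    assert(adjustBackSketch(s, 1) == 1);    // already at a sequence start
    assert(adjustForwardSketch("Я", 1) == 2); // reaches the end of str
}
```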


pure nothrow size_t  minLength(To, From)(in size_t length) if (isSomeChar!To && isSomeChar!From);
pure nothrow size_t  minLength(To, From)(in From[] str);
pure nothrow size_t  maxLength(To, From)(in size_t length) if (isSomeChar!To && isSomeChar!From);
pure nothrow size_t  maxLength(To, From)(in From[] str);

Returns the minimum/maximum possible length of the result of converting a string to another Unicode Transformation Format.

Examples
import std.range;
import std.utf;

const str = "abc-ЭЮЯ";
const wlen = toUTF16(str).length;
const dlen = walkLength(str);
assert(wlen >= minLength!wchar(str) && wlen <= maxLength!wchar(str));
assert(dlen >= minLength!dchar(str) && dlen <= maxLength!dchar(str));
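
To see where such bounds can come from, consider the UTF-8 to UTF-16 direction only. For valid input, a code point takes 1 to 4 UTF-8 bytes and 1 or 2 UTF-16 code units, so n bytes yield at least ceil(n / 3) and at most n code units. The helpers below are hypothetical and the values the library actually returns may be looser.

```d
import std.utf : toUTF16;

// Hypothetical helpers for the char -> wchar direction only.
size_t minWcharLengthSketch(in size_t utf8Length) pure nothrow @safe @nogc
{
    // Fewest units per byte: 3-byte sequences, each becoming 1 code unit.
    return (utf8Length + 2) / 3;
}

size_t maxWcharLengthSketch(in size_t utf8Length) pure nothrow @safe @nogc
{
    // Most units per byte: pure ASCII, 1 byte becoming 1 code unit.
    return utf8Length;
}

unittest
{
    string str = "abc-ЭЮЯ"; // 4 ASCII bytes + 3 two-byte sequences
    assert(str.length == 10);
    const wlen = toUTF16(str).length;
    assert(wlen == 7);
    assert(minWcharLengthSketch(str.length) == 4);
    assert(maxWcharLengthSketch(str.length) == 10);
}
```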

@trusted To[]  copyEncoded(To, From)(in From[] source, To[] buff) if (isSomeChar!To && isSomeChar!From);

Copies text from source to buff, converting it to a different Unicode Transformation Format if needed.

buff must be large enough to hold the result.

Preconditions:
buff.length >= minLength!To(source)
Returns
Slice of the provided buffer buff with the copy of source.
Examples
const str = "abc-ЭЮЯ";
wchar[100] wsbuff;
assert(copyEncoded(str, wsbuff) == "abc-ЭЮЯ"w);
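
For readers who want to see the general technique, here is a rough sketch of transcoding UTF-8 into a caller-provided UTF-16 buffer using only `std.utf.encode` and the language's decoding `foreach`. The function `copyToUTF16Sketch` is hypothetical and handles a single conversion direction; it is not how the library is implemented.

```d
import std.utf : encode;

// Hypothetical stand-in for illustration only, not the library's code.
wchar[] copyToUTF16Sketch(in char[] source, wchar[] buff)
{
    size_t n = 0;
    foreach (dchar c; source) // decodes one code point per iteration
    {
        wchar[2] tmp;
        const len = encode(tmp, c);         // 1 or 2 UTF-16 code units
        buff[n .. n + len] = tmp[0 .. len]; // bounds-checked slice copy
        n += len;
    }
    return buff[0 .. n]; // slice of the caller's buffer, no allocation
}

unittest
{
    wchar[100] wsbuff;
    assert(copyToUTF16Sketch("abc-ЭЮЯ", wsbuff[]) == "abc-ЭЮЯ"w);
}
```

Unlike a function with a documented precondition, this sketch relies on array bounds checking to catch a buffer that is too small.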

pure @trusted To[]  copySomeEncoded(To, From)(ref inout(From)[] source, To[] buff) if (isSomeChar!To && isSomeChar!From);

Copies as much text from the beginning of source to buff as the latter can hold, converting it to a different Unicode Transformation Format if needed.

On return, source is set to the slice that was not copied.

Returns
Slice of the provided buffer buff with a (partial) copy of source.
Examples
import std.array: empty;

const(char)[] buff = ...;
wchar[n] wbuff = void;
while(!buff.empty)
	f(buff.copySomeEncoded(wbuff)); // `f` accepts at most `n` wide characters
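
The chunked pattern above can be tried end to end with a self-contained stand-in. The sketch below is hypothetical, covers only UTF-8 to UTF-16, and assumes that only whole code points are copied; it is built on `std.utf.decode` and `std.utf.encode` rather than on the library.

```d
import std.utf : decode, encode;

// Hypothetical stand-in for illustration only, not the library's code.
wchar[] copySomeToUTF16Sketch(ref const(char)[] source, wchar[] buff)
{
    size_t n = 0; // code units written to buff
    size_t i = 0; // bytes consumed from source
    while (i < source.length)
    {
        size_t next = i;
        const dchar c = decode(source, next); // advances next past the sequence
        wchar[2] tmp;
        const len = encode(tmp, c);
        if (n + len > buff.length)
            break; // the next code point does not fit, stop here
        buff[n .. n + len] = tmp[0 .. len];
        n += len;
        i = next;
    }
    source = source[i .. $]; // leave only the uncopied part
    return buff[0 .. n];
}

unittest
{
    const(char)[] rest = "abc-ЭЮЯ";
    wchar[3] wbuff;
    wchar[] collected;
    while (rest.length)
        collected ~= copySomeToUTF16Sketch(rest, wbuff[]);
    assert(collected == "abc-ЭЮЯ"w);
}
```

With this sketch the buffer needs room for at least two code units, otherwise a code point that requires a surrogate pair would never fit and the loop would make no progress.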