unstd.utf - Unstandard documentation

License: Boost License 1.0.
Authors: Denis Shelomovskij

pure nothrow @safe bool isContinuationByte(in char c);

Detect whether c is a UTF-8 continuation byte.
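
As a rough illustration of what this check involves: UTF-8 continuation bytes have the bit pattern 10xxxxxx, so the test comes down to a mask and a compare. The name looksLikeContinuationByte below is hypothetical and not part of unstd.utf; this is a sketch of the idea, not the library's implementation.

```d
// Hypothetical stand-in, not the unstd.utf implementation.
bool looksLikeContinuationByte(char c) pure nothrow @safe
{
    // Continuation bytes are 0x80 .. 0xBF, i.e. the top two bits are `10`.
    return (c & 0xC0) == 0x80;
}

unittest
{
    string s = "aЭ"; // 'a' is 1 byte; 'Э' (U+042D) is 2 bytes: 0xD0 0xAD
    assert(!looksLikeContinuationByte(s[0])); // ASCII
    assert(!looksLikeContinuationByte(s[1])); // lead byte 0xD0
    assert(looksLikeContinuationByte(s[2]));  // continuation byte 0xAD
}
```

The unittest block can be run with `dmd -unittest -main`.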

pure nothrow @safe bool isLeadSurrogate(in wchar c);
pure nothrow @safe bool isTrailSurrogate(in wchar c);
pure nothrow @safe bool isValidBMPCharacter(in wchar c);

Detect whether c is a UTF-16 lead surrogate, a trail surrogate, or not a surrogate at all, respectively.
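
For orientation, UTF-16 reserves 0xD800 to 0xDBFF for lead surrogates and 0xDC00 to 0xDFFF for trail surrogates. The sketch below checks those ranges directly; inLeadRange and inTrailRange are hypothetical names, and the library's functions may apply additional rules.

```d
// Hypothetical stand-ins, not the unstd.utf implementations.
bool inLeadRange(wchar c) pure nothrow @safe
{
    return c >= 0xD800 && c <= 0xDBFF; // high (lead) surrogates
}

bool inTrailRange(wchar c) pure nothrow @safe
{
    return c >= 0xDC00 && c <= 0xDFFF; // low (trail) surrogates
}

unittest
{
    // 'я' fits in one code unit; U+1F600 needs a surrogate pair.
    wstring w = "я\U0001F600"w;
    assert(w.length == 3);
    assert(!inLeadRange(w[0]) && !inTrailRange(w[0]));
    assert(inLeadRange(w[1]));  // 0xD83D
    assert(inTrailRange(w[2])); // 0xDE00
}
```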

pure nothrow bool isSequenceStart(C)(in C c) if (isSomeChar!C);

Detect whether c is the first code unit in a sequence.

pure nothrow size_t adjustBack(C)(in C[] str, size_t idx) if (isSomeChar!C);
pure nothrow size_t adjustForward(C)(in C[] str, size_t idx);

Adjust idx to point at the start of a UTF sequence or at the end of str.
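
A minimal UTF-8-only sketch of this kind of index adjustment follows, assuming that adjusting back means stepping over continuation bytes towards the start and adjusting forward means stepping over them towards the end. backToStart and forwardToStart are hypothetical names; how the library treats edge cases such as idx == str.length is not stated here, so that handling is an assumption.

```d
// Hypothetical UTF-8-only stand-ins, not the unstd.utf implementations.
size_t backToStart(const(char)[] str, size_t idx) pure nothrow @safe
{
    if (idx >= str.length)
        return str.length; // assumption: the end of str is left as is
    // Step back over continuation bytes (10xxxxxx).
    while (idx > 0 && (str[idx] & 0xC0) == 0x80)
        --idx;
    return idx;
}

size_t forwardToStart(const(char)[] str, size_t idx) pure nothrow @safe
{
    // Step forward over continuation bytes, stopping at the end of str.
    while (idx < str.length && (str[idx] & 0xC0) == 0x80)
        ++idx;
    return idx < str.length ? idx : str.length;
}

unittest
{
    string s = "aЭb"; // bytes: 'a', 0xD0, 0xAD, 'b'
    assert(backToStart(s, 2) == 1);    // continuation byte back to 0xD0
    assert(forwardToStart(s, 2) == 3); // continuation byte forward to 'b'
    assert(backToStart(s, 1) == 1);    // already at a sequence start
    assert(forwardToStart(s, 4) == 4); // end of str stays put
}
```

Either result is safe to use as a slicing boundary, since it never falls inside a multi-byte sequence of well-formed input.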

pure nothrow size_t minLength(To, From)(in size_t length) if (isSomeChar!To && isSomeChar!From);
pure nothrow size_t minLength(To, From)(in From[] str);
pure nothrow size_t maxLength(To, From)(in size_t length) if (isSomeChar!To && isSomeChar!From);
pure nothrow size_t maxLength(To, From)(in From[] str);

Return the minimum/maximum possible length of the result of converting a string to another Unicode Transformation Format.

Examples

import std.range;
import std.utf;

const str = "abc-ЭЮЯ";
const wlen = toUTF16(str).length;
const dlen = walkLength(str);
assert(wlen >= minLength!wchar(str) && wlen <= maxLength!wchar(str));
assert(dlen >= minLength!dchar(str) && dlen <= maxLength!dchar(str));
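
To make the idea of length bounds concrete, the sketch below derives bounds for UTF-8 input from code-unit counts alone: a code point takes at most 3 UTF-8 bytes per UTF-16 code unit and at most 4 bytes per UTF-32 code unit, while ASCII maps one byte to one unit. The helper names are hypothetical, it uses only std.utf, and the values unstd.utf actually returns may differ.

```d
import std.utf : toUTF16, toUTF32;

// Hypothetical bounds for well-formed UTF-8 input of a given byte length.
size_t utf16Lower(size_t utf8Len) pure nothrow @safe
{
    return (utf8Len + 2) / 3; // densest case: 3 bytes -> 1 UTF-16 unit
}

size_t utf32Lower(size_t utf8Len) pure nothrow @safe
{
    return (utf8Len + 3) / 4; // densest case: 4 bytes -> 1 UTF-32 unit
}

size_t unitsUpper(size_t utf8Len) pure nothrow @safe
{
    return utf8Len;           // ASCII: 1 byte -> 1 unit
}

unittest
{
    string str = "abc-ЭЮЯ"; // 4 ASCII bytes + 3 two-byte letters
    assert(str.length == 10);
    const wlen = toUTF16(str).length;
    const dlen = toUTF32(str).length;
    assert(wlen == 7 && dlen == 7);
    assert(utf16Lower(str.length) <= wlen && wlen <= unitsUpper(str.length));
    assert(utf32Lower(str.length) <= dlen && dlen <= unitsUpper(str.length));
}
```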

@trusted To[] copyEncoded(To, From)(in From[] source, To[] buff) if (isSomeChar!To && isSomeChar!From);

Copies text from source to buff, converting it to a different Unicode Transformation Format if needed.

buff must be large enough to hold the result.

Preconditions:

buff.length >= minLength!To(source)

Returns

Slice of the provided buffer buff with the copy of source.

Examples

const str = "abc-ЭЮЯ";
wchar[100] wsbuff;
assert(copyEncoded(str, wsbuff) == "abc-ЭЮЯ"w);
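
For readers who want to see what such a buffer-to-buffer conversion involves, here is a sketch for the UTF-8 to UTF-16 case built only on std.utf.decode and std.utf.encode. transcodeInto is a hypothetical name; it is not the library's implementation and, like the function above, it assumes the caller supplies a large enough buffer.

```d
import std.utf : decode, encode;

// Hypothetical UTF-8 -> UTF-16 stand-in, not the unstd.utf implementation.
wchar[] transcodeInto(const(char)[] source, wchar[] buff) pure @safe
{
    size_t i = 0; // read position in source
    size_t n = 0; // write position in buff
    while (i < source.length)
    {
        const dchar c = decode(source, i); // advances i past one sequence
        wchar[2] units;
        const len = encode(units, c);      // 1 or 2 UTF-16 code units
        buff[n .. n + len] = units[0 .. len];
        n += len;
    }
    return buff[0 .. n]; // slice of the caller's buffer, no allocation
}

unittest
{
    wchar[100] wsbuff;
    assert(transcodeInto("abc-ЭЮЯ", wsbuff[]) == "abc-ЭЮЯ"w);
}
```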

pure @trusted To[] copySomeEncoded(To, From)(ref inout(From)[] source, To[] buff) if (isSomeChar!To && isSomeChar!From);

Copies as much text from the beginning of source to buff as the latter can hold, converting it to a different Unicode Transformation Format if needed.

On return, source is set to the slice that was not copied.

Returns

Slice of the provided buffer buff with a (partial) copy of source.

Examples

import std.array: empty;

const(char)[] buff = ...;
wchar[n] wbuff = void;
while(!buff.empty)
	f(buff.copySomeEncoded(wbuff)); // `f` accepts at most `n` wide characters
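
The chunked pattern above can be sketched end to end for the UTF-8 to UTF-16 case using only std.utf. transcodeSome is a hypothetical stand-in that copies whole code points until the next one no longer fits; whether the library splits at the same points is an assumption. The buffer must hold at least 2 code units, otherwise a surrogate pair could never be copied and the loop would not make progress.

```d
import std.utf : decode, encode;

// Hypothetical UTF-8 -> UTF-16 stand-in, not the unstd.utf implementation.
wchar[] transcodeSome(ref const(char)[] source, wchar[] buff) pure @safe
{
    size_t i = 0; // end of the copied part of source
    size_t n = 0; // write position in buff
    while (i < source.length)
    {
        size_t next = i;
        const dchar c = decode(source, next);
        wchar[2] units;
        const len = encode(units, c);
        if (n + len > buff.length)
            break;                 // next code point does not fit
        buff[n .. n + len] = units[0 .. len];
        n += len;
        i = next;
    }
    source = source[i .. $];       // leave the uncopied tail in source
    return buff[0 .. n];
}

unittest
{
    const(char)[] text = "abc-ЭЮЯ";
    wchar[3] wbuff;
    wchar[] collected;
    while (text.length)
        collected ~= transcodeSome(text, wbuff[]); // chunks: "abc", "-ЭЮ", "Я"
    assert(collected == "abc-ЭЮЯ"w);
}
```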
