ae.utils.text

Utility code related to string and text processing.

Modules

ascii
module ae.utils.text.ascii

Simple (ASCII-only) text-processing functions, for speed and CTFE.

html
module ae.utils.text.html

ae.utils.text.html

parsefp
module ae.utils.text.parsefp

Pure D code to parse floating-point values. Adapted to nothrow/@nogc from std.conv.

Public Imports

ae.utils.text.ascii
public import ae.utils.text.ascii : ascii, DecimalSize, toDec, toDecFixed, asciiToLower, asciiToUpper;

ae.utils.array
public import ae.utils.array : contains;

Members

Aliases

CIAsciiString
alias CIAsciiString = NormalizedArray!(immutable(char), s => s.byCodeUnit.map!(std.ascii.toLower))

Case-insensitive ASCII string.
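
The normalization lambda in the alias above lowercases each code unit with std.ascii.toLower and does not decode UTF-8. A minimal sketch of the same idea using only Phobos follows; asciiCaseEqual is a hypothetical helper written for illustration and is not part of ae.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.utf : byCodeUnit;
static import std.ascii;

/// Hypothetical helper (not part of ae): compares two strings while
/// ignoring ASCII case, using the same per-code-unit lowercasing that
/// the alias above applies as its normalization step.
bool asciiCaseEqual(string a, string b)
{
    // byCodeUnit avoids autodecoding; bytes >= 0x80 pass through unchanged.
    return equal(a.byCodeUnit.map!(std.ascii.toLower),
                 b.byCodeUnit.map!(std.ascii.toLower));
}

unittest
{
    assert(asciiCaseEqual("Content-Type", "content-type"));
    assert(!asciiCaseEqual("Host", "Hosts"));
}
```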

CIUniString
alias CIUniString = NormalizedArray!(immutable(char), s => s.map!(toLower))

Case-insensitive Unicode string.

doubleToString
alias doubleToString = fpToString!double

Undocumented in source.

toLowerHex
alias toLowerHex = toHex!lowerHexDigits

Undocumented in source.

Enums

fpCFormatString
eponymous template fpCFormatString(T)

C format string to exactly format a floating-point type T.

fpFormatString
eponymous template fpFormatString(T)

Format string for an FP type that includes all necessary significant digits.

significantDigits
eponymous template significantDigits(T : real)

Number of significant decimal digits in an FP type (determined empirically; valid for all D FP types on x86/64).

Functions

UTF8ToRaw
ascii UTF8ToRaw(char[] r)

Undo rawToUTF8.

arrayFromHex
ubyte[] arrayFromHex(char[] hex)

Parses hex into an array of bytes. hex.length should be even.
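
The conversion can be sketched without the library as follows. bytesFromHex and hexDigitValue are hypothetical names used only for this illustration; the choice to throw on odd-length or invalid input is an assumption of the sketch, not documented behavior of arrayFromHex.

```d
import std.exception : enforce;

/// Hypothetical helper: numeric value of one hexadecimal digit (either case).
ubyte hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return cast(ubyte)(c - '0');
    if (c >= 'a' && c <= 'f') return cast(ubyte)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return cast(ubyte)(c - 'A' + 10);
    throw new Exception("Invalid hexadecimal digit");
}

/// Hypothetical helper: decodes pairs of hex digits into bytes.
ubyte[] bytesFromHex(const(char)[] hex)
{
    // Assumption of this sketch: reject odd lengths instead of guessing.
    enforce(hex.length % 2 == 0, "Hex string length must be even");
    auto result = new ubyte[hex.length / 2];
    foreach (i, ref b; result)
        b = cast(ubyte)((hexDigitValue(hex[i * 2]) << 4)
                        | hexDigitValue(hex[i * 2 + 1]));
    return result;
}

unittest
{
    ubyte[] expected = [0xDE, 0xAD, 0xBE, 0xEF];
    assert(bytesFromHex("DEADbeef") == expected);
}
```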

arrayFromHex
void arrayFromHex(char[] hex, ubyte[] buf)

Parses hex into the given array buf.

asText
inout(char)[] asText(inout(ubyte)[] bytes)

Like readText, but with in-memory data. Reverse of ae.utils.array.bytes (for strings).

asciiSplit
T[][] asciiSplit(T[] text)

Like std.string.split (one argument version, which splits by whitespace), but only splits by ASCII and does not autodecode.

asciiStrip
T[] asciiStrip(T[] s)

Like strip, but only removes ASCII whitespace.

camelCaseJoin
string camelCaseJoin(string[] arr)

Join an array of words into a camel-cased string.

eatLine
T[] eatLine(T[] s, bool eatIncompleteLines)

Consume an LF- or CRLF-terminated line from s. Sets s to null and returns the remainder if there is no line terminator in s.
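
A minimal sketch of the described behavior, assuming s is taken by reference. takeLine is a hypothetical stand-in written with Phobos only; it ignores the eatIncompleteLines parameter, whose exact effect is not described here.

```d
import std.string : indexOf;

/// Hypothetical stand-in (not the ae implementation): removes one line
/// from the front of `s` and returns it without its terminator.
string takeLine(ref string s)
{
    auto p = s.indexOf('\n');
    if (p < 0)
    {
        // No terminator left: hand back the remainder and null out `s`.
        auto rest = s;
        s = null;
        return rest;
    }
    auto line = s[0 .. p];
    // Treat CRLF as a single terminator.
    if (line.length && line[$ - 1] == '\r')
        line = line[0 .. $ - 1];
    s = s[p + 1 .. $];
    return line;
}

unittest
{
    string s = "one\r\ntwo\nthree";
    assert(takeLine(s) == "one");
    assert(takeLine(s) == "two");
    assert(takeLine(s) == "three");
    assert(s is null);
}
```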

fastReplace
T[] fastReplace(T[] what, T[] from, T[] to)

An implementation of replace optimized for common cases (short strings).

fastSplit
T[][] fastSplit(T[] s, U d)

An implementation of split optimized for common cases. Allocates only once.

findBestMatch
sizediff_t findBestMatch(string[] items, string target, float threshold)

Select best match from a list of items. Returns -1 if none are above the threshold.

forceValidUTF8
string forceValidUTF8(ascii s)

Lossily convert arbitrary data into a valid UTF-8 string.

formatAs
string formatAs(T obj, string fmt)

UFCS helper: formats obj using the format string fmt.

formatted
auto formatted(T values)

Lazily formatted object.

fpAsString
FPAsString!T fpAsString(T f)

Undocumented in source. Be warned that the author may not have intended to support it.

fromHex
T fromHex(const(C)[] s)

Parses s as a hexadecimal number into an integer of type T.

fromZArray
C[] fromZArray(C[n] arr)
C[] fromZArray(C[] arr)

Return the slice up to the first NUL character, or of the whole array if none is found.
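
This is typically useful for fixed-size buffers filled by C APIs. The behavior described above can be sketched as follows; sliceToNul is a hypothetical name and handles only char arrays.

```d
/// Hypothetical sketch (not the ae implementation): returns the part of
/// `arr` before the first NUL, or all of `arr` if it contains no NUL.
inout(char)[] sliceToNul(inout(char)[] arr)
{
    foreach (i, c; arr)   // iterates code units, no decoding
        if (c == '\0')
            return arr[0 .. i];
    return arr;
}

unittest
{
    char[8] buf = '\0';
    buf[0 .. 2] = "hi";
    assert(sliceToNul(buf[]) == "hi");
    assert(sliceToNul("abc") == "abc");
}
```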

hexDump
string hexDump(const(void)[] b)

Formats binary data as a hex dump (three-column layout consisting of hex offset, byte values in hex, and printable low-ASCII characters).

newlinesToSpaces
T[] newlinesToSpaces(T[] s)

Replaces runs of ASCII whitespace which contain a newline ('\n') with a single space.

normalizeWhitespace
ascii normalizeWhitespace(ascii s)

Replaces all runs of ASCII whitespace with a single space.
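
A sketch of the collapsing step using Phobos only. collapseAsciiWhitespace is a hypothetical name; whether the library function also trims leading and trailing whitespace is not stated here, and this sketch does not.

```d
import std.ascii : isWhite;

/// Hypothetical sketch: replaces every run of ASCII whitespace in `s`
/// with one space. Non-ASCII bytes are copied through untouched.
string collapseAsciiWhitespace(string s)
{
    char[] result;
    bool inRun = false;
    foreach (char c; s)   // code units, no autodecoding
    {
        if (isWhite(c))
        {
            if (!inRun)
                result ~= ' ';
            inRun = true;
        }
        else
        {
            result ~= c;
            inRun = false;
        }
    }
    return result.idup;
}

unittest
{
    assert(collapseAsciiWhitespace("a \t\n b\r\nc") == "a b c");
}
```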

nullStringTransform
string nullStringTransform(char[] s)

For use where a delegate with this signature is required.

numberToString
string numberToString(T v)

Get shortest string representation of a numeric type that still converts to exactly the same number.

parseHexDigit
ubyte parseHexDigit(char c)

Parse a single hexadecimal digit according to the policy in config.

putFP
void putFP(Writer writer, F v)

Like fpToString, but writes the result to a sink.

randomString
string randomString(int length, string chars)

Generate a random string with the given parameters. std.random is used as the source of randomness. Not cryptographically secure.
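
The following sketch shows one way such a function can be built on std.random. randomChars is a hypothetical name and its parameter types are assumptions. Like the function described above, it is not suitable for secrets such as tokens or passwords.

```d
import std.random : uniform;

/// Hypothetical sketch: builds a string of `length` characters, each
/// picked uniformly from `chars` (which must not be empty).
string randomChars(size_t length, string chars)
{
    auto buf = new char[length];
    foreach (ref c; buf)
        c = chars[uniform(size_t(0), chars.length)]; // upper bound exclusive
    return buf.idup;
}

unittest
{
    auto r = randomChars(16, "abc");
    assert(r.length == 16);
}
```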

rawToUTF8
string rawToUTF8(char[] s)

Convert any data to a valid UTF-8 bytestream, so D's string functions can properly work on it.

sarrayFromHex
void sarrayFromHex(Hex hex, ubyte[N] buf)

Parses hex into the given array buf. Fast version for static arrays of known length.

segmentByWhitespace
T[][] segmentByWhitespace(T[] s)

Splits s into a list of slices that together cover all of s, alternating between whitespace and non-whitespace segments.

selectBestFrom
string selectBestFrom(string[] items, string target, float threshold)

Select best match from a list of items. Returns null if none are above the threshold.

splitAsciiLines
T[][] splitAsciiLines(T[] text)

Like splitLines, but does not attempt to split on Unicode line endings. Only splits on "\r", "\n", and "\r\n".

splitByCamelCase
string[] splitByCamelCase(string s)

Splits out words from a camel-cased string. All-uppercase words are returned as a single word.

stringDistance
int stringDistance(string s, string t)

Simpler implementation of Levenshtein string distance.
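
For reference, Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The sketch below is a textbook two-row dynamic-programming version, not the library's code; editDistance is a hypothetical name, and it compares UTF-8 code units rather than code points.

```d
import std.algorithm.comparison : min;
import std.algorithm.mutation : swap;

/// Hypothetical sketch of Levenshtein distance over code units.
int editDistance(string s, string t)
{
    auto prev = new int[t.length + 1];
    auto cur = new int[t.length + 1];
    // Distance from the empty prefix of s to each prefix of t.
    foreach (j; 0 .. t.length + 1)
        prev[j] = cast(int) j;
    foreach (i; 0 .. s.length)
    {
        cur[0] = cast(int)(i + 1);
        foreach (j; 0 .. t.length)
        {
            int cost = s[i] == t[j] ? 0 : 1;
            cur[j + 1] = min(prev[j + 1] + 1, // deletion
                             cur[j] + 1,      // insertion
                             prev[j] + cost); // substitution or match
        }
        swap(prev, cur);
    }
    return prev[t.length];
}

unittest
{
    assert(editDistance("kitten", "sitting") == 3);
}
```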

stringSimilarity
float stringSimilarity(string string1, string string2)

Return a number between 0.0 and 1.0 indicating how similar two strings are (1.0 if identical).

toHex
void toHex(T n, char[U] buf)
char[T.sizeof * 2] toHex(T n)

Converts an integer type to a fixed-length hexadecimal string.
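
The result length depends only on the integer type (two digits per byte), which is why it can be a static array. A sketch of that idea using Phobos only follows; toFixedHex is a hypothetical name, and restricting it to unsigned types is a simplification made for this example.

```d
import std.ascii : hexDigits;
import std.traits : isUnsigned;

/// Hypothetical sketch: zero-padded uppercase hex, two digits per byte of T.
char[T.sizeof * 2] toFixedHex(T)(T n)
if (isUnsigned!T)
{
    char[T.sizeof * 2] buf;
    // Fill from the least significant digit backwards.
    foreach_reverse (i; 0 .. buf.length)
    {
        buf[i] = hexDigits[cast(size_t)(n & 0xF)];
        n >>= 4;
    }
    return buf;
}

unittest
{
    auto h = toFixedHex(cast(ushort) 0xBEEF);
    assert(h[] == "BEEF");
    auto z = toFixedHex(0x1Au); // uint, so eight digits
    assert(z[] == "0000001A");
}
```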

verbatimWrap
string verbatimWrap(string s, size_t columns, string firstIndent, string indent, size_t tabWidth)

Like std.string.wrap, but preserves whitespace at line start and between (non-wrapped) words.

Structs

FPAsString
struct FPAsString(T)

Wraps the result of fpToString in a non-allocating stringifiable struct.

HexParseConfig
struct HexParseConfig

Policy for parseHexDigit.

Templates

eatLine
deprecated template eatLine(OnEof onEof)

Undocumented in source.

fpToString
template fpToString(F)

Get shortest string representation of a FP type that still converts to exactly the same number.
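
To illustrate what "shortest representation that still converts to exactly the same number" means, the sketch below brute-forces the precision with Phobos: it formats with increasing numbers of significant digits until parsing the text gives back the original value. This is not how fpToString is implemented (that is not described here); shortestRoundTrip is a hypothetical name, and the sketch relies on std.conv parsing being accurate for the inputs used.

```d
import std.conv : to;
import std.format : format;

/// Hypothetical sketch: shortest "%g"-style text that parses back to `v`.
/// Intended for finite values.
string shortestRoundTrip(double v)
{
    foreach (int digits; 1 .. 17)
    {
        auto s = format("%.*g", digits, v);
        if (s.to!double == v)
            return s;
    }
    // 17 significant digits are enough to identify any finite double.
    return format("%.17g", v);
}

unittest
{
    assert(shortestRoundTrip(1.5) == "1.5");
    assert(shortestRoundTrip(0.25) == "0.25");
}
```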

toHex
template toHex(alias digits = hexDigits)

Conversion from bytes to hexadecimal strings.

Meta

License

This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.

Authors

Vladimir Panteleev <ae@cy.md>