
So you want to load a file and display it in your window, but the file's encoding isn't always exactly what D expects? D's char[] strings are UTF-8, so here's a way to load 8-bit text into a DFL control when the text may be ASCII, UTF-8, UTF-8 with a BOM, or ANSI.

import dfl.all, dfl.internal.utf;
import std.file, std.utf, std.stream;

// Converts 8-bit text (ASCII, UTF-8 with or without BOM, or ANSI) to UTF-8.
char[] from8Bit(void[] data)
{
        ubyte[] utf8bom = ByteOrderMarks[BOM.UTF8];
        if(data.length >= utf8bom.length
        && cast(ubyte[])data[0 .. utf8bom.length] == utf8bom)
        {
                // It starts with a UTF-8 BOM, so assume the rest is UTF-8.
                // The BOM is stripped; the remaining bytes are not validated.
                return cast(char[])data[utf8bom.length .. data.length];
        }
        char[] str;
        str = cast(char[])data;
        try
        {
                // Check if valid UTF-8 or ASCII.
                std.utf.validate(str);
        }
        catch
        {
                // Fall back to ANSI.
                str = dfl.internal.utf.fromAnsi(str.ptr, str.length);
        }
        return str;
}

Load the file into your program like so:

   textBox1.text = from8Bit(std.file.read("foo.txt"));

and you shouldn't get any invalid UTF-8 errors, unless the file starts with a UTF-8 BOM but its contents are not actually valid UTF-8 (data with a BOM is returned without being validated).

Note that when it falls back to ANSI, it inherits the usual ANSI problem: the text is decoded with whatever codepage the system is currently using, which may not be the one the file was written in. In this case, I don't think Notepad would do any better either.
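
To see why the validation step is a reasonable way to tell UTF-8 from ANSI, here is a small standalone sketch. It is not part of the function above: it is written for current D (D2) rather than the D1 used on this page, it does not need DFL, and the helper name isValidUtf8 is made up for illustration. It checks the same word encoded two ways: as UTF-8, and as a single-byte ANSI encoding (assuming Windows-1252 here).

```d
import std.utf : validate, UTFException;

// Hypothetical helper: true if the bytes form valid UTF-8.
// Plain ASCII also passes, since ASCII is a subset of UTF-8.
bool isValidUtf8(const(ubyte)[] data)
{
    try
    {
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // "café" as UTF-8: the é is the two-byte sequence C3 A9.
    assert(isValidUtf8([0x63, 0x61, 0x66, 0xC3, 0xA9]));

    // "café" as Windows-1252: the é is the single byte E9, which in
    // UTF-8 would start a multi-byte sequence that never completes.
    assert(!isValidUtf8([0x63, 0x61, 0x66, 0xE9]));

    // Pure ASCII is valid either way.
    assert(isValidUtf8([0x63, 0x61, 0x66, 0x65]));
}
```

Most ANSI text with accented characters fails validation like the second case, which is what triggers the fallback. It is still a heuristic: some ANSI byte sequences happen to be valid UTF-8 and would be displayed as UTF-8.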

Page last modified on March 04, 2007, at 09:23 PM