| Input (const Input &input) | Copy constructor (with intended "move semantics" as internal state is shared, should not rely on using the rhs after copying). |
| Input (void) | Construct empty input character sequence. |
| Input (const char *cstring) | Construct input character sequence from a NUL-terminated string. |
| Input (const std::string &string) | Construct input character sequence from a std::string. |
| Input (const std::string *string) | Construct input character sequence from a pointer to a std::string. |
| Input (const wchar_t *wstring) | Construct input character sequence from a NUL-terminated wide character string. |
| Input (const std::wstring &wstring) | Construct input character sequence from a std::wstring. |
| Input (const std::wstring *wstring) | Construct input character sequence from a pointer to a std::wstring. |
| Input (FILE *file) | Construct input character sequence from an open FILE* file descriptor, supports UTF-8 conversion from UTF-16 and UTF-32, use stdin if file == NULL. |
| Input (std::istream &istream) | Construct input character sequence from a std::istream. |
| Input (std::istream *istream) | Construct input character sequence from a pointer to a std::istream, use stdin if istream == NULL. |
| operator const char * () | Cast this Input object to string. |
| operator const wchar_t * () | Cast this Input object to wide character string. |
| operator FILE * () | Cast this Input object to file descriptor FILE*. |
| operator std::istream * () | Cast this Input object to std::istream*. |
| operator bool () |
const char * | cstring (void) | Get the remaining string of this Input object. |
const wchar_t * | wstring (void) | Get the remaining wide character string of this Input object. |
FILE * | file (void) | Get the FILE* of this Input object. |
std::istream * | istream (void) | Get the std::istream of this Input object. |
size_t | size (void) | Get the size of the input character sequence in number of ASCII/UTF-8 bytes (zero if size is not determinable from a FILE* or std::istream source). |
bool | good (void) | Check if input is available. |
bool | eof (void) | Check if input reached EOF. |
size_t | get (char *s, size_t n) | Copy character sequence data into buffer. |
void | file_encoding (short enc) | Set encoding for FILE* input to Const::plain, Const::utf16be, Const::utf16le, Const::utf32be, or Const::utf32le. File encodings are automatically detected by the presence of a UTF BOM in the file. This function may be used when a BOM is not present and file encoding is known or to override the BOM. |
short | file_encoding (void) const | Get encoding of the current FILE* input, Const::plain, Const::utf16be, Const::utf16le, Const::utf32be, or Const::utf32le. |
Input character sequence class for unified access to sources of input.
Description
The Input class unifies access to a source of input of a character sequence as follows:
- An Input object is instantiated and (re)assigned a (new) source input: either a char* string, a wchar_t* wide string, a std::string, a std::wstring, a FILE* descriptor, or a std::istream object.
- When assigned a wide string source as input, the wide character content is automatically converted to a UTF-8 character sequence when reading with get().
- When assigned a FILE* source as input, the file is checked for the presence of a UTF-8 or a UTF-16 BOM (Byte Order Mark). A UTF-8 BOM is ignored and will not appear on the input character stream (and the size is reduced by 3 bytes). A UTF-16 BOM is interpreted, so the file content is automatically converted to a UTF-8 character sequence when reading the file with get(). Also, size() gives the content size in the number of UTF-8 bytes.
- An input object can be reassigned a new source of input for reading at any time.
- An input object obeys move semantics. That is, after assigning an input object to another, the former can no longer be used to read input. This prevents adding the overhead and complexity of file and stream duplication.
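The BOM check described above can be sketched in standard C++ as follows. This is an illustrative sketch only, not the Input class's own code: the Bom enum and the detect_bom function are hypothetical names, and the sketch covers only the UTF-8 and UTF-16 BOMs named in the list (a UTF-32 BOM is not distinguished here).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical encoding tags for this sketch (not the library's constants).
enum class Bom { none, utf8, utf16be, utf16le };

// Inspect the first bytes of a buffer and report which BOM, if any, is present.
// On return, skip holds the number of BOM bytes to drop from the input.
inline Bom detect_bom(const unsigned char *data, std::size_t len, std::size_t &skip)
{
  assert(data != nullptr || len == 0);
  skip = 0;
  if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
  {
    skip = 3; // UTF-8 BOM: dropped, so the content size shrinks by 3 bytes
    return Bom::utf8;
  }
  if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
  {
    skip = 2; // UTF-16 big endian: content needs conversion to UTF-8
    return Bom::utf16be;
  }
  if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
  {
    skip = 2; // UTF-16 little endian: content needs conversion to UTF-8
    return Bom::utf16le;
  }
  return Bom::none; // no BOM: treat the content as plain ASCII/UTF-8
}
```

A caller would read the first few bytes of the file, pass them to detect_bom, and then skip the reported number of bytes before decoding the rest.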
size_t Input::get(char *buf, size_t len);
reads source input and fills buf with up to len bytes, returning the number of bytes read or zero when a stream or file is bad or when EOF is reached.
size_t Input::size(void);
returns the number of ASCII/UTF-8 bytes available to read from the source input or zero (zero is also returned when the size is not determinable). Use this function only before reading input with get(). Wide character strings and UTF-16 FILE* content are counted as the total number of UTF-8 bytes that will be produced by get(). The size of a std::istream cannot be determined.
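To illustrate why the size of a wide character source is counted in UTF-8 bytes, the following sketch converts a std::wstring to UTF-8 in standard C++; the length of the result is the byte count in question. The functions append_utf8 and wide_to_utf8 are hypothetical helpers written for this illustration and are not part of the Input class. The sketch assumes code points no larger than U+10FFFF.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append the UTF-8 encoding of code point cp (at most U+10FFFF) to out.
inline void append_utf8(std::string &out, std::uint32_t cp)
{
  assert(cp <= 0x10FFFF);
  if (cp < 0x80)
  {
    out.push_back(static_cast<char>(cp));
  }
  else if (cp < 0x800)
  {
    out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
    out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  }
  else if (cp < 0x10000)
  {
    out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
    out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  }
  else
  {
    out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
    out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
    out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  }
}

// Convert a wide string to UTF-8. UTF-16 surrogate pairs, as found where
// wchar_t is 16 bits wide, are combined into one code point first.
inline std::string wide_to_utf8(const std::wstring &ws)
{
  std::string out;
  for (std::size_t i = 0; i < ws.size(); ++i)
  {
    std::uint32_t cp = static_cast<std::uint32_t>(ws[i]);
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < ws.size())
    {
      std::uint32_t lo = static_cast<std::uint32_t>(ws[i + 1]);
      if (lo >= 0xDC00 && lo <= 0xDFFF)
      {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        ++i; // the low surrogate has been consumed
      }
    }
    append_utf8(out, cp);
  }
  return out;
}
```

For instance, a one-character wide string holding U+00A9 converts to the two bytes C2 A9, so its size counted in UTF-8 bytes is 2 rather than 1.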
bool Input::good(void);
returns true if the input is readable and has no EOF or error state. Returns false on EOF or if an error condition is present.
bool Input::eof(void);
returns true if the input reached EOF. Note that good() == ! eof() for string source input only, since files and streams may have error conditions that prevent reading. That is, for files and streams eof() implies good() == false, but not vice versa. Thus, an error is diagnosed when the condition good() == false && eof() == false holds. Note that get(buf, len) == 0 && len > 0 implies good() == false.
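The rule for telling EOF apart from an error can be sketched as a small helper. Here read_all and ReadResult are hypothetical names, and StringSource is a hypothetical stand-in that only mimics the get(), good() and eof() behavior described above so that the sketch compiles without the library; it is not the Input class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in with the get()/good()/eof() shape described above.
struct StringSource
{
  std::string data;    // remaining input
  bool failed = false; // simulated error condition

  std::size_t get(char *s, std::size_t n)
  {
    if (failed)
      return 0;
    std::size_t k = data.size() < n ? data.size() : n;
    std::memcpy(s, data.data(), k);
    data.erase(0, k);
    return k;
  }
  bool eof() const { return !failed && data.empty(); }
  bool good() const { return !failed && !data.empty(); }
};

enum class ReadResult { complete, io_error };

// Read everything from input into out, then classify how reading ended.
template<typename Source>
ReadResult read_all(Source &input, std::string &out)
{
  char buf[1024];
  std::size_t len;
  while ((len = input.get(buf, sizeof(buf))) > 0)
    out.append(buf, len);
  // get() returned zero for a nonzero length, so good() must be false here
  assert(!input.good());
  // good() == false && eof() == false means an error, not end of input
  return input.eof() ? ReadResult::complete : ReadResult::io_error;
}
```

Because read_all is a template, the same function could be applied to any source type that offers these three member functions.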
- Compile with WITH_UTF8_UNRESTRICTED to enable unrestricted UTF-8 beyond U+10FFFF, permitting lossless UTF-8 encoding of 32 bit words without limits.
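As an illustration of what UTF-8 beyond U+10FFFF can look like, the sketch below uses the original UTF-8 byte layout of RFC 2279, which defines 5- and 6-byte sequences for values up to 0x7FFFFFFF. Whether WITH_UTF8_UNRESTRICTED produces exactly this layout is an assumption made here, and encode_extended_utf8 is a hypothetical function that is not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Encode cp (at most 0x7FFFFFFF) using the RFC 2279 byte layout.
// buf must have room for 6 bytes; returns the number of bytes written (1 to 6).
inline std::size_t encode_extended_utf8(std::uint32_t cp, unsigned char *buf)
{
  assert(buf != nullptr && cp <= 0x7FFFFFFF);
  if (cp < 0x80)
  {
    buf[0] = static_cast<unsigned char>(cp);
    return 1;
  }
  // Sequence length grows with the magnitude of the value.
  std::size_t n = cp < 0x800     ? 2
                : cp < 0x10000   ? 3
                : cp < 0x200000  ? 4
                : cp < 0x4000000 ? 5
                                 : 6;
  // Fill the continuation bytes from the end, 6 payload bits each.
  for (std::size_t i = n - 1; i > 0; --i)
  {
    buf[i] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
    cp >>= 6;
  }
  // Lead byte: n high bits set, a zero bit, then the remaining payload bits.
  static const unsigned char lead[] = {0, 0, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC};
  buf[0] = static_cast<unsigned char>(lead[n] | cp);
  return n;
}
```

Under this layout, the first value past the Unicode range, 0x110000, encodes as the four bytes F4 90 80 80, and 0x7FFFFFFF encodes as the six bytes FD BF BF BF BF BF.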
Example
The following example shows how to read a character sequence in blocks from a std::ifstream:
std::ifstream ifs;
ifs.open("input.h", std::ifstream::in);
Input input(ifs); // use the open stream as the source of input
char buf[1024];
size_t len;
while ((len = input.get(buf, sizeof(buf))) > 0)
  fwrite(buf, 1, len, stdout);
if (!input.eof()) // get() returned zero before EOF was reached
  std::cerr << "An IO error occurred" << std::endl;
ifs.close();
Example
The following example shows how to buffer the entire content of a file:
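Input input(fopen("input.h", "r")); // file name is illustrative
if (!input.file())
  abort(); // no file to read from
size_t len = input.size(); // content size in UTF-8 bytes
char *buf = new char[len];
input.get(buf, len);
if (!input.eof())
  std::cerr << "An IO error occurred" << std::endl;
fwrite(buf, 1, len, stdout);
delete[] buf;
fclose(input.file());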
Files with UTF-16 content are converted to UTF-8 by get(buf, len), where size() gives the total number of UTF-8 bytes that will be produced by get(buf, len).
Example
The following example shows how to read a character sequence in blocks from a file:
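Input input(fopen("input.h", "r")); // file name is illustrative
if (!input.file())
  abort(); // no file to read from
char buf[1024];
size_t len;
while ((len = input.get(buf, sizeof(buf))) > 0)
  fwrite(buf, 1, len, stdout);
fclose(input); // relies on the operator FILE * () cast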
Example
The following example shows how to echo characters one by one from stdin (reading input from a tty):
Input input(stdin);
char c;
while (input.get(&c, 1))
  fputc(c, stdout);
Example
The following example shows how to read a character sequence in blocks from a wide character string while converting it to UTF-8:
Input input(L"Copyright ©"); // illustrative wide string; © is U+00A9, UTF-8 C2 A9
char buf[8];
size_t len;
while ((len = input.get(buf, sizeof(buf))) > 0)
  fwrite(buf, 1, len, stdout);
Example
The following example shows how to convert a wide character string to UTF-8:
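Input input(L"Copyright ©"); // illustrative wide string
size_t len = input.size(); // number of UTF-8 bytes that get() will produce
char *buf = new char[len];
input.get(buf, len);
fwrite(buf, 1, len, stdout);
delete[] buf;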
Example
The following example shows how to switch source inputs while reading input byte by byte (use a buffer as shown in other examples to improve efficiency):
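Input input("Hello"); // illustrative first source
std::string message;
char c;
while (input.get(&c, 1))
  message.push_back(c);
input = L" world! To ∞ and beyond."; // switch to a wide string source
while (input.get(&c, 1))
  message.push_back(c);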