Internationalization

This section contains notes on topics relevant to internationalization.

  1. It is highly recommended to use UTF-8 as the encoding of character data.
    Using it consistently for your locale, XML files and databases helps avoid headaches and sleepless nights.
  2. When creating a database for the Django user interface, make sure its character encoding properly supports UTF-8.
    When using MySQL, this is most easily done by setting the database parameter "default character set" to "utf8" and "default collate" to "utf8_general_ci". Another pitfall when using frePPLe with MySQL is that string comparison in some MySQL collations is case insensitive, whereas frePPLe always handles data strings in a case-sensitive way.
    When using Oracle, this is controlled through the database "character set" and "national character set".
    PostgreSQL provides the 'encoding' setting on the database.
    SQLite is Unicode-ready by default.
  3. Xerces-C will transcode the input XML data from the input encoding (typically specified with a <?xml version="1.0" encoding="UTF-8" ?> header line) to the locale of your *nix shell or Windows environment.
    Xerces-C has intrinsic support for ASCII, UTF-8, UTF-16 (big/little endian), UTF-32 (big/little endian), the EBCDIC code pages IBM037, IBM1047 and IBM1140, ISO-8859-1 (aka Latin1) and Windows-1252.
    This means that it can parse input XML files in these encodings. For more exotic encodings, a special configuration and compilation is required: see the Xerces-C documentation for more details.
  4. Internally frePPLe stores string data in the locale of your environment: see the documentation on the setlocale C function.
    For most modern Linux distributions the default setting is a UTF-8 encoded locale, meaning that every Unicode character can be represented. The environment variable LC_ALL can be used to specify a suitable locale.
    On Windows the default locale is an ANSI code page, which can represent only a limited set of characters.
  5. When exporting data out of frePPLe, no data conversion to specific encodings is done.
    All output will be in the locale of your environment.
  6. FrePPLe internally uses byte-based string manipulation routines, not character-based ones.
    For UTF-8 and the single-byte code pages this works fine, but with multi-byte encodings such as UTF-16 and UTF-32 it no longer works. Such encodings are NOT supported by frePPLe.
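
The difference between byte-based and character-based handling described in the notes above can be illustrated with a short Python sketch. This is not frePPLe code: the sample string and the helper names byte_view, report and environment_encoding are assumptions made purely for illustration. It shows how many bytes the same text occupies in several encodings, whether the encoded form contains zero bytes, and which encoding the current locale selects.

```python
import locale

# Hypothetical item name containing one non-ASCII character.
SAMPLE = "Müller"


def byte_view(text, encoding):
    """Return (character count, byte count, contains a zero byte?)."""
    raw = text.encode(encoding)
    return len(text), len(raw), b"\x00" in raw


def report(text=SAMPLE):
    """Compare encodings as a byte-oriented routine would see them."""
    encodings = ("utf-8", "latin-1", "utf-16-le", "utf-32-le")
    return {enc: byte_view(text, enc) for enc in encodings}


def environment_encoding():
    """Name of the encoding implied by the locale of the environment."""
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    print("locale encoding:", environment_encoding())
    for enc, (chars, size, has_zero) in report().items():
        print(f"{enc:10} chars={chars} bytes={size} zero_byte={has_zero}")
```

For the sample string, the UTF-8 and Latin-1 forms contain no zero bytes, while the UTF-16 and UTF-32 forms do. C-style byte-oriented string routines would typically read such a zero byte as the end of the string, which is consistent with the restriction stated in the last note.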