Character Encoding

ASCII

Advantages

  • Is the simplest way to encode characters.

  • Has no endianness problems, because each character occupies a single byte. The original version uses only 7 bits, which gives 128 different characters.

  • Takes up less storage space than Unicode encodings such as UTF-16 and UTF-32 (UTF-8 also stores ASCII characters in one byte each).
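
The points above can be checked with a minimal Python sketch. The sample string and the variable names are only illustrative, and the "-le" codec variants are chosen here so that no byte-order mark is counted.

```python
text = "Hello"
ascii_encoded = text.encode("ascii")

# One byte per character, and every byte value fits in 7 bits (0..127).
print(len(text), len(ascii_encoded))
print(all(b < 0x80 for b in ascii_encoded))

# Storage needed for the same text in three Unicode encodings.
sizes = {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
print(sizes)
```

For pure ASCII text, UTF-8 produces exactly the same bytes as ASCII, so the storage advantage only holds against UTF-16 and UTF-32.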

Disadvantages

  • Very limited capacity to represent non-English characters.

  • There are several extended 8-bit versions (for example the ISO 8859 family), and they are not compatible with one another.
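
A small Python sketch of both problems, with an arbitrary sample word and byte value chosen for illustration: a non-English character cannot be encoded as ASCII, and the same 8-bit value means different characters in different ISO 8859 parts.

```python
word = "caf\u00e9"  # "café": the last letter is outside the ASCII range

try:
    word.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as err:
    ascii_ok = False
    print("not representable in ASCII:", err.reason)

# The same byte decoded with three different ISO 8859 parts.
raw = bytes([0xE9])
decoded = {part: raw.decode(part) for part in ("iso-8859-1", "iso-8859-5", "iso-8859-7")}
print(decoded)  # Latin, Cyrillic and Greek letters respectively
```

Without knowing which extension was used to write a file, the bytes above 127 cannot be interpreted reliably.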

Unicode

Advantages

  • Version 11.0 (2018) defines more than 137 000 characters, and later versions add more.

  • That version covers 146 modern and historic scripts, as well as multiple symbol sets.
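
As a rough illustration of that coverage, the sketch below uses Python's standard unicodedata module to look up the official names of a few characters from different scripts. The sample characters are arbitrary, and the Unicode version printed depends on the Python release in use.

```python
import unicodedata

# Version of the Unicode database bundled with this Python interpreter.
print(unicodedata.unidata_version)

# One character each from Latin, Greek, Cyrillic, Hebrew, CJK, plus a symbol.
samples = ["\u00e9", "\u03a9", "\u0416", "\u05d0", "\u4e2d", "\u20ac"]
names = {ch: unicodedata.name(ch) for ch in samples}
for ch, name in names.items():
    print(f"U+{ord(ch):04X} {name}")
```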

Disadvantages

  • There are several encoding forms, so software has to know which one a given text uses:

    • UTF-8: compatible with ASCII, variable length of 1 to 4 bytes per character; used in Unix-like systems, the web (HTML), etc.

    • UTF-16: variable length of 2 or 4 bytes per character, and byte order matters; used in Windows, macOS, Java, .NET, KDE, etc.

    • UTF-32: fixed length of 4 bytes per character, and byte order matters.

  • Processing is more complex: with variable-length encodings, the number of bytes does not equal the number of characters.
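
The differences between the three encodings can be sketched in Python as follows. The helper byte_lengths and the constant ENCODINGS are hypothetical names introduced for this example, and the "-le" codecs are used so that no byte-order mark is included in the counts.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")

def byte_lengths(ch: str) -> dict:
    """Return the number of bytes one character needs in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}

# Latin letter, accented letter, euro sign, emoji (outside the 16-bit range).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X}", byte_lengths(ch))

# Byte order matters for UTF-16: the same character, two byte sequences.
print("A".encode("utf-16-be"), "A".encode("utf-16-le"))

# One code point can take two 16-bit code units (a surrogate pair) in UTF-16.
emoji = "\U0001F600"
utf16_units = len(emoji.encode("utf-16-le")) // 2
print(len(emoji), utf16_units)
```

This is where the extra processing effort comes from: code that indexes or truncates text by bytes, or by 16-bit units, can cut a character in half.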
