Windows-1252
The legacy components of Microsoft Windows in English and some other Western languages use, by default, an encoding that is a superset of ISO 8859-1 but differs from it by assigning displayable characters, rather than control characters, to the 0x80 to 0x9F range. Windows knows this encoding as code page 1252; its IANA-registered name is windows-1252. The code page also contains all the printable characters of ISO 8859-15.
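The relationship between the three encodings can be checked with a short Python sketch. It assumes Python's built-in `cp1252`, `latin-1` and `iso8859-15` codecs as stand-ins for Windows-1252, ISO 8859-1 and ISO 8859-15; the sample bytes and variable names are illustrative only.

```python
# Bytes 0x80-0x9F are where Windows-1252 and ISO 8859-1 diverge.
# The sample spells: left quote, "Hi", right quote, space, euro sign, "5".
sample = bytes([0x93, 0x48, 0x69, 0x94, 0x20, 0x80, 0x35])

as_cp1252 = sample.decode("cp1252")    # curly quotes and a euro sign
as_latin1 = sample.decode("latin-1")   # C1 control characters instead

# Outside 0x80-0x9F the two decoders agree byte for byte.
shared = bytes(range(0x00, 0x80)) + bytes(range(0xA0, 0x100))
same_outside_range = shared.decode("cp1252") == shared.decode("latin-1")

# Every printable character in the upper half of ISO 8859-15 (0xA0-0xFF)
# can be encoded in Windows-1252; an unencodable one would yield b"".
latin9_upper = bytes(range(0xA0, 0x100)).decode("iso8859-15")
latin9_fits = all(ch.encode("cp1252", errors="ignore") for ch in latin9_upper)
```

Decoding the same bytes with both codecs makes the difference visible: the result differs only where a byte falls in the 0x80 to 0x9F range.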
Many web browsers treat the MIME charset ISO-8859-1 as Windows-1252 (the extra control codes in ISO-8859-1 are forbidden in HTML anyway), so characters from the 0x80 to 0x9F range of Windows-1252 are often seen in web pages that declare their encoding as ISO-8859-1. The same is true of e-mail programs. Such characters can still cause difficulties, particularly when the recipient is using a non-Windows system such as Linux or Mac OS, which may assign this range no meaning or a different proprietary set of characters.
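The lenient behaviour described above can be imitated in a few lines of Python. This is only a sketch: the alias set `TREAT_AS_CP1252` and the function `decode_lenient` are hypothetical names invented for illustration, and real browsers and mail clients apply their own, longer label tables.

```python
# Hypothetical alias table: labels that this sketch decodes as Windows-1252.
TREAT_AS_CP1252 = {"iso-8859-1", "iso8859-1", "latin1", "windows-1252", "cp1252"}

def decode_lenient(data: bytes, declared_charset: str) -> str:
    """Decode data, substituting Windows-1252 for a declared ISO-8859-1."""
    label = declared_charset.strip().lower()
    codec = "cp1252" if label in TREAT_AS_CP1252 else label
    # Bytes that Python's cp1252 codec leaves undefined become U+FFFD here.
    return data.decode(codec, errors="replace")

# A page that declares ISO-8859-1 but contains Windows-1252 punctuation.
page = b"It\x92s 5\x80 \x96 caf\xe9"

lenient = decode_lenient(page, "ISO-8859-1")  # apostrophe, euro, en dash
strict = page.decode("iso-8859-1")            # C1 control characters
```

A strict ISO-8859-1 decoder turns bytes 0x92, 0x80 and 0x96 into invisible control characters, which is the kind of difficulty a recipient on another system may see.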
A popular misconception is that the term "ANSI code page", which is used in the Microsoft Windows documentation, is synonymous with this code page. In fact, no ANSI standard describes this code page; the closest existing ANSI standard is ANSI ISO 8859-1. Instead, the Windows documentation uses the term "ANSI code page" to refer to the system's current 8-bit GUI code page (as opposed to the OEM code page used for console apps and some other functions), because that code page was fixed to ANSI ISO 8859-1 in a very early version of Windows. Today, however, the system's ANSI code page is 1252 only in locale versions for Western European languages (English, Spanish, German, French, etc.).
The following table shows Windows-1252. It differs from ISO-8859-1 only in rows 8x and 9x (positions 0x80 to 0x9F):
Windows-1252 (CP1252)

| | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | TAB | LF | VT | FF | CR | SO | SI |
| 1x | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
| 8x | € | | ‚ | ƒ | „ | … | † | ‡ | ˆ | ‰ | Š | ‹ | Œ | | Ž | |
| 9x | | ‘ | ’ | “ | ” | • | – | — | ˜ | ™ | š | › | œ | | ž | Ÿ |
| Ax | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ |
| Bx | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
| Cx | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
| Dx | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
| Ex | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
| Fx | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
According to the information on Microsoft's and the Unicode Consortium's websites, positions 81, 8D, 8F, 90, and 9D are unused. However, the Windows API call for converting from code pages to Unicode maps these positions to the corresponding C1 control codes. The euro sign at position 80 was not present in earlier versions of this code page, nor were the letters S and Z with caron (háček).
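The two treatments of the unused positions can be compared in Python, whose `cp1252` codec rejects those five bytes in strict mode. The sketch below registers a custom error handler that falls back to the C1 control code with the same number, roughly imitating the Windows API behaviour described above. The handler name `c1fallback` and the function `decode_cp1252_with_c1` are assumptions made for this example, not part of any standard library or Windows interface.

```python
import codecs

# Positions left unassigned in the published Windows-1252 tables.
UNUSED = (0x81, 0x8D, 0x8F, 0x90, 0x9D)

def _c1_fallback(err):
    """Map each undecodable byte to the code point with the same value."""
    if isinstance(err, UnicodeDecodeError):
        bad = err.object[err.start:err.end]
        return "".join(chr(b) for b in bad), err.end
    raise err

# Hypothetical handler name, registered only for this sketch.
codecs.register_error("c1fallback", _c1_fallback)

def decode_cp1252_with_c1(data: bytes) -> str:
    """Decode Windows-1252, sending unused bytes to C1 control codes."""
    return data.decode("cp1252", errors="c1fallback")

# Strict decoding refuses an unused position...
try:
    bytes([0x81]).decode("cp1252")
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True

# ...while the fallback keeps assigned bytes and passes 0x81 and 0x9D through.
mixed = decode_cp1252_with_c1(bytes([0x80, 0x81, 0x9D, 0x9F]))
```

With the fallback, bytes 0x80 and 0x9F still decode to the euro sign and Ÿ, while 0x81 and 0x9D come out as U+0081 and U+009D.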