HOWTO: Map an ANSI keyboard to the same Unicode code-points
NOTE: This archived documentation has not been updated recently and may contain information that is no longer relevant
In some situations, a keyboard developer may wish to retain a legacy encoding for a keyboard but, for compatibility reasons, update the keyboard to use Unicode characters internally. The following table shows how to map the ANSI (Windows Western European, cp1252) characters in the range x80 to x9F to Unicode characters without changing the legacy font mapping.

Important Note: This does not turn an ANSI keyboard into a true Unicode keyboard. It does, however, provide a transition path where an application does not support ANSI input correctly. One application with reported ANSI input issues of this kind is Paratext 6.
This table is based on data from http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
ANSI Value (hex) | ANSI Value (decimal) | Unicode Value | Unicode Character Name |
---|---|---|---|
x80 | d128 | U+20AC | EURO_SIGN |
x81 | d129 | * | |
x82 | d130 | U+201A | SINGLE_LOW-9_QUOTATION_MARK |
x83 | d131 | U+0192 | LATIN_SMALL_LETTER_F_WITH_HOOK |
x84 | d132 | U+201E | DOUBLE_LOW-9_QUOTATION_MARK |
x85 | d133 | U+2026 | HORIZONTAL_ELLIPSIS |
x86 | d134 | U+2020 | DAGGER |
x87 | d135 | U+2021 | DOUBLE_DAGGER |
x88 | d136 | U+02C6 | MODIFIER_LETTER_CIRCUMFLEX_ACCENT |
x89 | d137 | U+2030 | PER_MILLE_SIGN |
x8A | d138 | U+0160 | LATIN_CAPITAL_LETTER_S_WITH_CARON |
x8B | d139 | U+2039 | SINGLE_LEFT-POINTING_ANGLE_QUOTATION_MARK |
x8C | d140 | U+0152 | LATIN_CAPITAL_LIGATURE_OE |
x8D | d141 | * | |
x8E | d142 | U+017D | LATIN_CAPITAL_LETTER_Z_WITH_CARON |
x8F | d143 | * | |
x90 | d144 | * | |
x91 | d145 | U+2018 | LEFT_SINGLE_QUOTATION_MARK |
x92 | d146 | U+2019 | RIGHT_SINGLE_QUOTATION_MARK |
x93 | d147 | U+201C | LEFT_DOUBLE_QUOTATION_MARK |
x94 | d148 | U+201D | RIGHT_DOUBLE_QUOTATION_MARK |
x95 | d149 | U+2022 | BULLET |
x96 | d150 | U+2013 | EN_DASH |
x97 | d151 | U+2014 | EM_DASH |
x98 | d152 | U+02DC | SMALL_TILDE |
x99 | d153 | U+2122 | TRADE_MARK_SIGN |
x9A | d154 | U+0161 | LATIN_SMALL_LETTER_S_WITH_CARON |
x9B | d155 | U+203A | SINGLE_RIGHT-POINTING_ANGLE_QUOTATION_MARK |
x9C | d156 | U+0153 | LATIN_SMALL_LIGATURE_OE |
x9D | d157 | * | |
x9E | d158 | U+017E | LATIN_SMALL_LETTER_Z_WITH_CARON |
x9F | d159 | U+0178 | LATIN_CAPITAL_LETTER_Y_WITH_DIAERESIS |
- The five characters x81, x8D, x8F, x90 and x9D (marked * in the table) do not have a mapping to Unicode. For compatibility reasons, you should avoid using these characters even in pure ANSI keyboards, as some applications may strip them from your data.
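The table above can be regenerated or checked programmatically. The following is a minimal sketch in Python, assuming that Python's built-in `cp1252` codec follows the same Unicode mapping file and treats the five unmapped bytes as undefined. The function names `cp1252_to_unicode` and `table_rows` are illustrative only and are not part of Keyman.

```python
import unicodedata


def cp1252_to_unicode(byte_value):
    """Return the Unicode code point for a cp1252 byte, or None if unmapped."""
    try:
        return ord(bytes([byte_value]).decode("cp1252"))
    except UnicodeDecodeError:
        # Python's cp1252 codec rejects x81, x8D, x8F, x90 and x9D.
        return None


def table_rows(start=0x80, end=0x9F):
    """Build (hex, decimal, Unicode value, name) rows like the table above."""
    rows = []
    for b in range(start, end + 1):
        cp = cp1252_to_unicode(b)
        if cp is None:
            # Unmapped byte: mark it with * as the table does.
            rows.append((f"x{b:02X}", f"d{b}", "*", ""))
        else:
            name = unicodedata.name(chr(cp)).replace(" ", "_")
            rows.append((f"x{b:02X}", f"d{b}", f"U+{cp:04X}", name))
    return rows


if __name__ == "__main__":
    for row in table_rows():
        print(" | ".join(row))
```

Rows whose third column is `*` correspond to the bytes that should be avoided.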
Example
- The following rule comes from an ANSI keyboard that uses a legacy font (the text after c is a comment):
x9A + 'm' > x9B c 'aa' + 'm' -> 'am'
- That would be converted, according to the table above, as follows:
U+0161 + 'm' > U+203A c 'aa' + 'm' -> 'am'
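For a keyboard with many rules, the same conversion could be scripted. The sketch below is one possible approach in Python, not a Keyman tool: the hypothetical helper `convert_rule` rewrites `xNN` and `dNNN` character codes on a single rule line to `U+XXXX` using Python's `cp1252` codec. It assumes that quoted strings and a trailing `c` comment should be left untouched, and it leaves the five unmapped codes unchanged so they can be reviewed by hand. It does not parse the full keyboard language, so names such as store identifiers that happen to look like `d12` would need checking manually.

```python
import re

# Alternatives, tried left to right at each position:
#   quoted strings and a trailing "c ..." comment (left untouched),
#   then hex (xNN) or decimal (dNNN) character codes.
TOKEN = re.compile(
    r"'[^']*'|\"[^\"]*\"|\bc\s.*$|\bx([0-9A-Fa-f]{2})\b|\bd(\d{1,3})\b"
)


def _replace(match):
    hex_digits, dec_digits = match.group(1), match.group(2)
    if hex_digits is None and dec_digits is None:
        return match.group(0)  # quoted string or comment: keep as-is
    value = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
    if value > 0xFF:
        return match.group(0)  # not a single ANSI byte
    try:
        ch = bytes([value]).decode("cp1252")
    except UnicodeDecodeError:
        return match.group(0)  # unmapped byte (x81, x8D, x8F, x90, x9D)
    return f"U+{ord(ch):04X}"


def convert_rule(line):
    """Rewrite ANSI character codes in one rule line as Unicode values."""
    return TOKEN.sub(_replace, line)


if __name__ == "__main__":
    print(convert_rule("x9A + 'm' > x9B c 'aa' + 'm' -> 'am'"))
```

Applied to the ANSI rule in the example, this reproduces the converted rule shown above.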
Applies to:
- Keyman Developer 5.0
- Keyman Developer Professional 6.0
- Keyman Developer Professional 6.2
- Keyman Developer Standard 6.0
- Keyman Developer Standard 6.2
- Keyman Developer Professional 7.0