Combining character

From Free net encyclopedia


Combining characters are characters intended to modify other characters. The most common combining characters in the Latin script are the combining diacritical marks (including combining accents). In Unicode, the main block of combining diacritics for European languages and the International Phonetic Alphabet is U+0300–U+036F; combining diacritical marks are also present in many other Unicode blocks. A combining diacritic always follows the base character it modifies, and several diacritics can be added to the same base character.
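As an illustration of the ordering rule, here is a minimal Python sketch using the standard unicodedata module. The variable names (base, acute, composed, stacked) are chosen for this example only and do not come from the article.

```python
import unicodedata

base = "e"
acute = "\u0301"  # COMBINING ACUTE ACCENT, from the U+0300–U+036F block

# The combining mark is placed after the base character.
composed = base + acute

# Several diacritics can follow the same base character:
# "a" + COMBINING CIRCUMFLEX ACCENT + COMBINING ACUTE ACCENT
stacked = "a" + "\u0302" + "\u0301"

# unicodedata.combining() returns a non-zero combining class for combining marks.
is_combining = [unicodedata.combining(ch) != 0 for ch in stacked]
mark_names = [unicodedata.name(ch) for ch in stacked[1:]]

print(len(composed), is_combining, mark_names)
```

The string composed is rendered as a single accented letter but still consists of two code points, which is why its length is 2.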

Unicode also contains many precomposed characters, so in many cases the same text can be written with either combining diacritics or precomposed characters, at the user's or application's choice. As a result, two Unicode strings must be normalised before they are compared, and encoding converters must be carefully designed to map every valid Unicode representation of a character to the legacy encoding, to avoid data loss. For example, Windows-1258 uses combining diacritics whereas VISCII has a large selection of precomposed characters, so a converter that simply maps code values to Unicode code points one-to-one will corrupt text when converting between the two.
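The need for normalisation can be sketched in Python with the standard unicodedata module. This is only an illustration under stated assumptions: the helper name same_text is hypothetical, NFC is just one of the available normalisation forms, and Python's built-in cp1258 codec stands in for a Windows-1258 converter (VISCII is not shown because Python ships no codec for it).

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (two code points)

def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalisation."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# A naive comparison treats the two spellings as different strings.
naive_equal = precomposed == decomposed
normalised_equal = same_text(precomposed, decomposed)

# Windows-1258 stores some Vietnamese letters as a base letter plus a combining
# mark, so its decoded text is not in precomposed form until it is normalised.
raw = "\u00ea\u0301".encode("cp1258")   # e-circumflex + combining acute, 2 bytes
decoded = raw.decode("cp1258")
recomposed = unicodedata.normalize("NFC", decoded)  # single precomposed code point

# The precomposed letter has no code value of its own in Windows-1258,
# so a one-to-one converter cannot encode it directly.
try:
    "\u1ebf".encode("cp1258")
    direct_encoding_works = True
except UnicodeEncodeError:
    direct_encoding_works = False

print(naive_equal, normalised_equal, len(decoded), len(recomposed), direct_encoding_works)
```

A converter targeting an encoding built around precomposed characters would therefore want to normalise to a composed form first, while one targeting Windows-1258 needs to split precomposed letters into the base-plus-mark pairs that encoding supports.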
