
Character Sets and Encoding in HTML

Xah Lee, 2005-12

In HTML, you can declare the character set (encoding) of the file, like this:

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
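To see why the declaration matters, here is a minimal sketch in Python (chosen only for illustration; the variable names are made up for this example). It shows that a character is stored as bytes according to an encoding, and that reading those bytes back with the wrong encoding gives garbage. A browser presumably faces the same problem when the declared charset does not match how the file was actually saved.

```python
# A character is stored on disk as bytes; the encoding decides which bytes.
text = "é"

# UTF-8 stores é as two bytes: C3 A9.
utf8_bytes = text.encode("utf-8")

# Reading the bytes back with the same encoding recovers the character.
decoded_right = utf8_bytes.decode("utf-8")

# Reading them as Latin-1 (one byte per character) produces two wrong characters.
decoded_wrong = utf8_bytes.decode("latin-1")

print(utf8_bytes)      # b'\xc3\xa9'
print(decoded_right)   # é
print(decoded_wrong)   # Ã©
```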

If you don't understand what a character set and an encoding are, see this essay: The Journey of a Foreign Character thru Internet.

Once you have declared your character set, you can use characters from that character set in your HTML file. There is a character set standard called Unicode, which contains basically all the characters of the world's languages, including the thousands of Chinese characters. Here is a sample of characters from Unicode:

€£¥ ©®™¶ † ‡“”—‘’ éåøèü θπαβγλ →←↑↓↔↗ ■□•‣♥★☆ ±≤≥≠≈ ∞∆° ℂℝℚℙℤ ∀∃ ∫∑∏≔⊂⊃⊆⊇∈ ⊕⊗ 한국어 ひらがな カタカナ العربية русский 李杀网

For more examples, see: Unicode Characters Example.

Using Character Entities

Another way to show special characters in your file is with a so-called “character entity” (in this numeric form, more precisely called a numeric character reference). For example, the bullet symbol • is Unicode character number 8226. In HTML, you can write it as “&#8226;”. Here's what your browser shows: •

The number 8226 in hexadecimal is 2022. Sometimes you only know the hexadecimal form. You can write it in hexadecimal like this: “&#x2022;”. Here's what your browser shows: •
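The relation between the decimal and hexadecimal forms can be checked with a short Python sketch. It uses the standard library's `html.unescape` to decode references roughly the way a browser would; the variable names are just for this example.

```python
import html

# Look up the Unicode character number (code point) of the bullet.
code_point = ord("•")

# Build the decimal and hexadecimal references from that number.
decimal_ref = f"&#{code_point};"     # decimal form
hex_ref = f"&#x{code_point:X};"      # hexadecimal form

# Both references decode to the same character.
decoded_decimal = html.unescape(decimal_ref)
decoded_hex = html.unescape(hex_ref)

print(code_point)        # 8226
print(decimal_ref)       # &#8226;
print(hex_ref)           # &#x2022;
print(decoded_decimal == decoded_hex == "•")   # True
```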

For some commonly used characters, HTML provides named entities. For example, the bullet character can be written as “&bull;”. Here's what your browser shows: •
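As a rough illustration, Python's standard library ships a table of the HTML named entities, so the name can be mapped back to its character number. This is only a sketch; the variable names are invented for the example.

```python
import html
from html.entities import name2codepoint

# The table maps an entity name (without & and ;) to its Unicode number.
bullet_code_point = name2codepoint["bull"]

# Decoding the named entity gives the same character as the numeric forms.
bullet = html.unescape("&bull;")

print(bullet_code_point)   # 8226
print(bullet)              # •
```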

See: List of XML and HTML character entity references.


2005-05
© 2005 by Xah Lee.