HTML Encoding (Character Sets)
Posted: Fri Oct 27, 2023 8:06 am
To display an HTML page correctly, a web browser must know which character set to use.
The HTML charset Attribute
The character set is specified in the <meta> tag:
Example
<meta charset="UTF-8">
The HTML5 specification encourages web developers to use the UTF-8 character set.
UTF-8 covers almost all of the characters and symbols in the world!
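To see why the browser needs to know the character set, here is a small Python sketch (Python is my choice for illustration, it is not part of the original tutorial). It decodes the same bytes twice, once with the right charset and once with the wrong one. The sample text is made up.

```python
# Text as the author saved it: UTF-8 bytes.
page_bytes = "café €5".encode("utf-8")

# Decoded with the charset the page was saved in, the text survives.
right = page_bytes.decode("utf-8")

# Decoded as Windows-1252 instead, each non-ASCII character turns into junk.
wrong = page_bytes.decode("cp1252")

print(right)
print(wrong)
```

The second line printed is the kind of garbage a browser would show if it guessed the wrong character set.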
The ASCII Character Set
ASCII was the first character encoding standard for the web.
It defined 128 different characters that could be used on the internet:
English letters (A-Z, a-z)
Numbers (0-9)
Special characters like ! $ + - ( ) @ < >.
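A quick way to check the 128-character limit is to ask Python's built-in "ascii" codec. This is only a sketch; the helper name fits_in_ascii is made up for the example.

```python
# Every value from 0 to 127 decodes as an ASCII character.
all_ascii = bytes(range(128)).decode("ascii")


def fits_in_ascii(text):
    """Return True if every character in text has an ASCII code (0-127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(len(all_ascii))
print(fits_in_ascii("A-Z 0-9 ! $ + - ( ) @ < >"))
print(fits_in_ascii("café"))
```

Anything outside the basic English letters, digits and symbols, such as the é in the last line, does not fit.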
The ANSI Character Set
ANSI (Windows-1252) was the original Windows character set:
Identical to ASCII for the values from 0 to 127
Special characters from 128 to 159
Identical to ISO-8859-1 and to the Unicode code points from 160 to 255 (UTF-8 stores those code points as two bytes each)
<meta charset="Windows-1252">
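The 128 to 159 range can be inspected with Python's "cp1252" codec, which is Python's name for Windows-1252. A minimal sketch, with decode_ansi as a made-up helper name:

```python
def decode_ansi(value):
    """Return the Windows-1252 character for a byte value, or None if unused."""
    try:
        return bytes([value]).decode("cp1252")
    except UnicodeDecodeError:
        return None


# Values in the special range that Windows-1252 leaves unassigned.
unused = [v for v in range(128, 160) if decode_ansi(v) is None]

print(decode_ansi(128))
print(unused)
```

In Python's version of the codec, the unassigned values are the same ones marked NOT USED in the table further down.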
The ISO-8859-1 Character Set
ISO-8859-1 was the default character set for HTML 4. This character set
supported 256 different character codes. HTML 4 also supported UTF-8.
Identical to ASCII for the values from 0 to 127
Has no printable characters from 128 to 159 (these values are control codes)
Identical to ANSI and to the Unicode code points from 160 to 255
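The points above can be checked with Python's "latin-1" codec (Python's name for ISO-8859-1). This is just a sketch; printable_latin1 is a made-up helper, and "Cc" is the Unicode category for control codes.

```python
import unicodedata


def printable_latin1(value):
    """Return the ISO-8859-1 character for a byte value, or None for control codes."""
    char = bytes([value]).decode("latin-1")
    return None if unicodedata.category(char) == "Cc" else char


# Values 128-159 that carry no printable character in ISO-8859-1.
no_printable = [v for v in range(128, 160) if printable_latin1(v) is None]

# Compare ISO-8859-1 with Windows-1252 for every value from 160 to 255.
same_as_ansi = all(
    bytes([v]).decode("latin-1") == bytes([v]).decode("cp1252")
    for v in range(160, 256)
)

print(len(no_printable), same_as_ansi)
```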
HTML 4 Example
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
HTML 5 Example
<meta charset="ISO-8859-1">
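If you want to read the declared charset out of a page yourself, both forms of the tag can be handled with Python's standard html.parser module. This is a rough sketch and not how browsers actually do it (they also look at HTTP headers and byte order marks). CharsetFinder and find_charset are names invented for this example.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Records the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # HTML5 form: <meta charset="...">
            self.charset = attributes["charset"].strip().lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # HTML 4 form: content="text/html;charset=..."
            content = attributes.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().lower()


def find_charset(html_text):
    """Return the declared charset in lowercase, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


print(find_charset('<meta charset="ISO-8859-1">'))
```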
The UTF-8 Character Set
Identical to ASCII for the values from 0 to 127
Has no printable characters from 128 to 159
Uses the same character numbers as ANSI and 8859-1 from 160 to 255, but stores each of them as two bytes
Continues from the value 256 with far more than 10 000 characters
<meta charset="UTF-8">
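UTF-8 uses a different number of bytes depending on the character number. A small Python sketch with a few sample characters of my own choosing shows this:

```python
samples = ["A", "é", "€", "😀"]

# Number of bytes UTF-8 needs for each sample character.
byte_lengths = {char: len(char.encode("utf-8")) for char in samples}

for char, length in byte_lengths.items():
    print(f"U+{ord(char):04X} takes {length} byte(s): {char.encode('utf-8').hex()}")

# Character 233 is one byte in ISO-8859-1 but two bytes in UTF-8.
latin1_bytes = "é".encode("latin-1")
utf8_bytes = "é".encode("utf-8")
```

So the numbers from 160 to 255 match the older character sets, but the bytes stored in the file do not.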
Differences Between Character Sets
The following table displays the differences between the character sets described above:
Numb  ASCII  ANSI  8859  UTF-8  Description
32                              space
33    !      !     !     !      exclamation mark
34    "      "     "     "      quotation mark
35    #      #     #     #      number sign
36    $      $     $     $      dollar sign
37    %      %     %     %      percent sign
38    &      &     &     &      ampersand
39    '      '     '     '      apostrophe
40    (      (     (     (      left parenthesis
41    )      )     )     )      right parenthesis
42    *      *     *     *      asterisk
43    +      +     +     +      plus sign
44    ,      ,     ,     ,      comma
45    -      -     -     -      hyphen-minus
46    .      .     .     .      full stop
47    /      /     /     /      solidus
48    0      0     0     0      digit zero
49    1      1     1     1      digit one
50    2      2     2     2      digit two
51    3      3     3     3      digit three
52    4      4     4     4      digit four
53    5      5     5     5      digit five
54    6      6     6     6      digit six
55    7      7     7     7      digit seven
56    8      8     8     8      digit eight
57    9      9     9     9      digit nine
58    :      :     :     :      colon
59    ;      ;     ;     ;      semicolon
60    <      <     <     <      less than
61    =      =     =     =      equals sign
62    >      >     >     >      greater than
63    ?      ?     ?     ?      question mark
64    @      @     @     @      commercial at
65    A      A     A     A      Latin A
66    B      B     B     B      Latin B
67    C      C     C     C      Latin C
68    D      D     D     D      Latin D
69    E      E     E     E      Latin E
70    F      F     F     F      Latin F
71    G      G     G     G      Latin G
72    H      H     H     H      Latin H
73    I      I     I     I      Latin I
74    J      J     J     J      Latin J
75    K      K     K     K      Latin K
76    L      L     L     L      Latin L
77    M      M     M     M      Latin M
78    N      N     N     N      Latin N
79    O      O     O     O      Latin O
80    P      P     P     P      Latin P
81    Q      Q     Q     Q      Latin Q
82    R      R     R     R      Latin R
83    S      S     S     S      Latin S
84    T      T     T     T      Latin T
85    U      U     U     U      Latin U
86    V      V     V     V      Latin V
87    W      W     W     W      Latin W
88    X      X     X     X      Latin X
89    Y      Y     Y     Y      Latin Y
90    Z      Z     Z     Z      Latin Z
91    [      [     [     [      left square bracket
92    \      \     \     \      reverse solidus
93    ]      ]     ]     ]      right square bracket
94    ^      ^     ^     ^      circumflex accent
95    _      _     _     _      low line
96    `      `     `     `      grave accent
97    a      a     a     a      Latin small a
98    b      b     b     b      Latin small b
99    c      c     c     c      Latin small c
100   d      d     d     d      Latin small d
101   e      e     e     e      Latin small e
102   f      f     f     f      Latin small f
103   g      g     g     g      Latin small g
104   h      h     h     h      Latin small h
105   i      i     i     i      Latin small i
106   j      j     j     j      Latin small j
107   k      k     k     k      Latin small k
108   l      l     l     l      Latin small l
109   m      m     m     m      Latin small m
110   n      n     n     n      Latin small n
111   o      o     o     o      Latin small o
112   p      p     p     p      Latin small p
113   q      q     q     q      Latin small q
114   r      r     r     r      Latin small r
115   s      s     s     s      Latin small s
116   t      t     t     t      Latin small t
117   u      u     u     u      Latin small u
118   v      v     v     v      Latin small v
119   w      w     w     w      Latin small w
120   x      x     x     x      Latin small x
121   y      y     y     y      Latin small y
122   z      z     z     z      Latin small z
123   {      {     {     {      left curly bracket
124   |      |     |     |      vertical line
125   }      }     }     }      right curly bracket
126   ~      ~     ~     ~      tilde
127                             DEL
128          €                  euro sign
129                             NOT USED
130          ‚                  single low-9 quotation mark
131          ƒ                  Latin small f with hook
132          „                  double low-9 quotation mark
133          …                  horizontal ellipsis
134          †                  dagger
135          ‡                  double dagger
136          ˆ                  modifier letter circumflex accent
137          ‰                  per mille sign
138          Š                  Latin S with caron
139          ‹                  single left-pointing angle quotation mark
140          Œ                  Latin capital ligature OE
141                             NOT USED
142          Ž                  Latin Z with caron
143                             NOT USED
144                             NOT USED
145          ‘                  left single quotation mark
146          ’                  right single quotation mark
147          “                  left double quotation mark
148          ”                  right double quotation mark
149          •                  bullet
150          –                  en dash
151          —                  em dash
152          ˜                  small tilde
153          ™                  trade mark sign
154          š                  Latin small s with caron
155          ›                  single right-pointing angle quotation mark
156          œ                  Latin small ligature oe
157                             NOT USED
158          ž                  Latin small z with caron
159          Ÿ                  Latin Y with diaeresis
160                             no-break space
161          ¡     ¡     ¡      inverted exclamation mark
162          ¢     ¢     ¢      cent sign
163          £     £     £      pound sign
164          ¤     ¤     ¤      currency sign
165          ¥     ¥     ¥      yen sign
166          ¦     ¦     ¦      broken bar
167          §     §     §      section sign
168          ¨     ¨     ¨      diaeresis
169          ©     ©     ©      copyright sign
170          ª     ª     ª      feminine ordinal indicator
171          «     «     «      left-pointing double angle quotation mark
172          ¬     ¬     ¬      not sign
173                             soft hyphen
174          ®     ®     ®      registered sign
175          ¯     ¯     ¯      macron
176          °     °     °      degree sign
177          ±     ±     ±      plus-minus sign
178          ²     ²     ²      superscript two
179          ³     ³     ³      superscript three
180          ´     ´     ´      acute accent
181          µ     µ     µ      micro sign
182          ¶     ¶     ¶      pilcrow sign
183          ·     ·     ·      middle dot
184          ¸     ¸     ¸      cedilla
185          ¹     ¹     ¹      superscript one
186          º     º     º      masculine ordinal indicator
187          »     »     »      right-pointing double angle quotation mark
188          ¼     ¼     ¼      vulgar fraction one quarter
189          ½     ½     ½      vulgar fraction one half
190          ¾     ¾     ¾      vulgar fraction three quarters
191          ¿     ¿     ¿      inverted question mark
192          À     À     À      Latin A with grave
193          Á     Á     Á      Latin A with acute
194          Â     Â     Â      Latin A with circumflex
195          Ã     Ã     Ã      Latin A with tilde
196          Ä     Ä     Ä      Latin A with diaeresis
197          Å     Å     Å      Latin A with ring above
198          Æ     Æ     Æ      Latin AE
199          Ç     Ç     Ç      Latin C with cedilla
200          È     È     È      Latin E with grave
201          É     É     É      Latin E with acute
202          Ê     Ê     Ê      Latin E with circumflex
203          Ë     Ë     Ë      Latin E with diaeresis
204          Ì     Ì     Ì      Latin I with grave
205          Í     Í     Í      Latin I with acute
206          Î     Î     Î      Latin I with circumflex
207          Ï     Ï     Ï      Latin I with diaeresis
208          Ð     Ð     Ð      Latin Eth
209          Ñ     Ñ     Ñ      Latin N with tilde
210          Ò     Ò     Ò      Latin O with grave
211          Ó     Ó     Ó      Latin O with acute
212          Ô     Ô     Ô      Latin O with circumflex
213          Õ     Õ     Õ      Latin O with tilde
214          Ö     Ö     Ö      Latin O with diaeresis
215          ×     ×     ×      multiplication sign
216          Ø     Ø     Ø      Latin O with stroke
217          Ù     Ù     Ù      Latin U with grave
218          Ú     Ú     Ú      Latin U with acute
219          Û     Û     Û      Latin U with circumflex
220          Ü     Ü     Ü      Latin U with diaeresis
221          Ý     Ý     Ý      Latin Y with acute
222          Þ     Þ     Þ      Latin Thorn
223          ß     ß     ß      Latin small sharp s
224          à     à     à      Latin small a with grave
225          á     á     á      Latin small a with acute
226          â     â     â      Latin small a with circumflex
227          ã     ã     ã      Latin small a with tilde
228          ä     ä     ä      Latin small a with diaeresis
229          å     å     å      Latin small a with ring above
230          æ     æ     æ      Latin small ae
231          ç     ç     ç      Latin small c with cedilla
232          è     è     è      Latin small e with grave
233          é     é     é      Latin small e with acute
234          ê     ê     ê      Latin small e with circumflex
235          ë     ë     ë      Latin small e with diaeresis
236          ì     ì     ì      Latin small i with grave
237          í     í     í      Latin small i with acute
238          î     î     î      Latin small i with circumflex
239          ï     ï     ï      Latin small i with diaeresis
240          ð     ð     ð      Latin small eth
241          ñ     ñ     ñ      Latin small n with tilde
242          ò     ò     ò      Latin small o with grave
243          ó     ó     ó      Latin small o with acute
244          ô     ô     ô      Latin small o with circumflex
245          õ     õ     õ      Latin small o with tilde
246          ö     ö     ö      Latin small o with diaeresis
247          ÷     ÷     ÷      division sign
248          ø     ø     ø      Latin small o with stroke
249          ù     ù     ù      Latin small u with grave
250          ú     ú     ú      Latin small u with acute
251          û     û     û      Latin small u with circumflex
252          ü     ü     ü      Latin small u with diaeresis
253          ý     ý     ý      Latin small y with acute
254          þ     þ     þ      Latin small thorn
255          ÿ     ÿ     ÿ      Latin small y with diaeresis
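If you want to look up a single row without scanning the table, something like the following Python sketch can rebuild it from Python's own codecs. It assumes "cp1252" stands in for ANSI and "latin-1" for 8859, and it treats control codes as empty cells. The function names cell and table_row are made up here.

```python
import unicodedata


def cell(value, codec):
    """Return the printable character a codec assigns to value, or '' if none."""
    try:
        char = bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return ""
    return "" if unicodedata.category(char) == "Cc" else char


def table_row(value):
    """Build one row: number, then the ASCII, ANSI, 8859 and UTF-8 columns."""
    # The UTF-8 column shows the Unicode character with this number.
    utf8 = "" if unicodedata.category(chr(value)) == "Cc" else chr(value)
    return (value, cell(value, "ascii"), cell(value, "cp1252"),
            cell(value, "latin-1"), utf8)


for number in (65, 128, 233):
    print(table_row(number))
```

One difference from the table: for the values from 160 to 255 this sketch leaves only the ASCII column empty, and it also returns the invisible characters 160 and 173 instead of blank cells.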
Reference: https://www.w3schools.com/html/html_charset.asp