Spaces in HTML documents

This page demonstrates the use of the various kinds of spaces which are available in plain text, HTML and Unicode.  It can also be used to demonstrate how a particular Web browser handles all of those different spaces, and how the choice of a display font within that browser affects the appearance of those spaces.  To make comparisons easy, each kind of space is demonstrated on a single line which is centered on the page.  That single line is first shown following the definition of a particular space character, and is then repeated in various comparison blocks.

The space in plain text

The "normal" space is the glyph-less (i.e., blank) character which corresponds to the space bar in all common keyboard definitions.  In fonts which are "proportionally spaced" (as all pre-typewriter fonts were), the width of the space should be the same as the width of the letter "n".  However, most designers of computer-based fonts failed to follow this typographic principle, and made the space character narrower, as shown here:

normal space > < normal space
letter "n" >n< letter "n"

In the default browser font on the computer where this page was developed, the center space above appears to be four pixels wide, while the letter "n" appears to be six pixels wide.  However, in this case the "n" is preceded by an additional pixel for inter-letter spacing, which was added by the display software, so it could also be considered as seven pixels wide.  Whichever width is observed, it is clear that the width of the space is distinctly less than the width of the "n" in the same font.
(Inter-letter spacing is normally not inserted when setting type by hand, because it is inherent in the design of each character.  Whether it happens for a particular pair of characters on the computer appears to depend on the shapes of those characters - primarily whether they would otherwise touch each other.  The effects of computerized inter-letter spacing are unimportant for this essay.)
Changing the standard display font on your browser will undoubtedly show differing amounts of disparity between the width of the space and the width of the "n".  Examples from the author's experiments are given under "Browser notes" below.

Spaces as HTML named entities

HTML entities are a type of markup language construct which is intended to avoid the problems inherent in the translation of computer codes from one character encoding system to another.  There are two types of entities - named and numeric.

An HTML named entity consists of a mixed-case name which is preceded by an ampersand (&) and followed by a semicolon (;).  Web browsers which "understand" a particular named entity will replace it with the appropriate glyph or typographic function; others will simply display it as a character string.  View this page with an old browser (v.4 or earlier) to see examples of such behavior.

The Hypertext Markup Language (HTML), as used by Web browsers, treats spaces as one of several kinds of "whitespace".  Ordinarily, browsers collapse sequences of whitespace characters into a single space, and nearly every kind of whitespace also indicates a point at which a line of text may be wrapped to fit into an open window.  An example of whitespace collapse follows:

one normal space > < one normal space
two normal spaces > < two normal spaces

The HTML entity "&nbsp;" is the glyph-less character for a non-breaking space.  The distinguishing features of this character are twofold: it is not collapsed together with adjacent whitespace, and it is not a point at which a line of text may be wrapped.

non-breaking space > < non-breaking space
two non-breaking spaces >  < two non-breaking spaces
three non-breaking spaces >   < three non-breaking spaces
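
The collapsing rule and the non-breaking exception described above can be approximated in a few lines of Python.  This is only a sketch, not a model of any particular browser: the function name collapse_whitespace is hypothetical, and the set of collapsible characters (space, tab, line feed, form feed, carriage return) is an assumption made for the illustration.

```python
import re

# Assumption for this sketch: these ASCII characters count as collapsible
# whitespace.  U+00A0 (the non-breaking space) is deliberately not included.
COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")

def collapse_whitespace(text: str) -> str:
    """Replace every run of collapsible whitespace with a single space."""
    return COLLAPSIBLE.sub(" ", text)

two_normal = "two normal spaces >  < two normal spaces"
two_nbsp = "two non-breaking spaces >\u00a0\u00a0< two non-breaking spaces"

print(collapse_whitespace(two_normal))            # the two spaces become one
print(collapse_whitespace(two_nbsp) == two_nbsp)  # True: nothing is collapsed
```

The second line of output is True because the pattern never matches the non-breaking space, which mirrors the first of the two distinguishing features listed above.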

The HTML entity "&ensp;" is the glyph-less character which corresponds to the classical "en space" of hand-set typography - a space whose width should be the same as the width of the letter "n" in the current font.

en space > < en space
letter "n" >n< letter "n"

The HTML entity "&emsp;" is the glyph-less character which corresponds to the classical "em space" of hand-set typography - a space whose width should be the same as the width of the letter "m" in the current font.

em space > < em space
letter "m" >m< letter "m"

The HTML entity "&thinsp;" is the glyph-less character which corresponds to a classical "thin space" of hand-set typography - a space whose width is clearly less than that of the en space in the current font.

thin space > < thin space
letter "m" >m< letter "m"

Comparison of HTML named entities for spaces

Whether any of these HTML space entities are actually displayed at the widths which their definitions imply is a matter of browser implementation.  If they are, then each of the following four lines should show a centered space which is different in width from the other three lines:

non-breaking space > < non-breaking space
en space > < en space
em space > < em space
thin space > < thin space

Spaces as Unicode points

Unicode is an enormous code-translation table which is intended to provide the capability for unambiguous representation of every glyph which is used in written human languages.  For the purposes of this page, it is sufficient to state that every Unicode character has a name and can be represented in HTML as a numeric entity.  (In actual practice, HTML numeric entities are almost never used for characters which have direct representation in the native character set of one's own computer, and are not often used for characters which have equivalent HTML named entities.)  Each HTML numeric entity consists of a decimal number of one to five digits, preceded by an ampersand (&) and hash mark (#) and followed by a semicolon (;).
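
As a small illustration of the numeric-entity format just described, the following Python sketch writes a character as a decimal numeric entity and decodes it again.  The helper names to_numeric_entity and from_numeric_entity are hypothetical; the decoding relies on html.unescape from the Python standard library.

```python
import html

def to_numeric_entity(char: str) -> str:
    """Write one character as a decimal HTML numeric entity, e.g. '&#8201;'."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    return f"&#{ord(char)};"

def from_numeric_entity(entity: str) -> str:
    """Decode an entity back into the character it stands for."""
    return html.unescape(entity)

thin_space = "\u2009"  # the Unicode thin space
entity = to_numeric_entity(thin_space)

print(entity)                                     # &#8201;
print(from_numeric_entity(entity) == thin_space)  # True
```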

The HTML numeric entities which correspond to various space definitions within the Unicode "General Punctuation" block, and which are not expressible as HTML named entities, are as follows:

Three-Per-Em (thick) Space (&#8196;) > < Three-Per-Em (thick) Space (&#8196;)
Four-Per-Em (mid) Space (&#8197;) > < Four-Per-Em (mid) Space (&#8197;)
Six-Per-Em Space (&#8198;) > < Six-Per-Em Space (&#8198;)
Figure Space (&#8199;) > < Figure Space (&#8199;)
Punctuation Space (&#8200;) > < Punctuation Space (&#8200;)
Hair Space (&#8202;) > < Hair Space (&#8202;)
Zero Width Space (&#8203;) >​< Zero Width Space (&#8203;)
Narrow No-Break Space (&#8239;) > < Narrow No-Break Space (&#8239;)

Not shown are the Zero Width Non Joiner (&#8204;), the Zero Width Joiner (&#8205;), the Word Joiner (&#8288;), and the Zero Width No-Break Space (&#65279;).
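
The decimal values quoted in this section can be checked against the Unicode character names with Python's standard unicodedata module.  This is a sketch for verification only: the list CODE_POINTS and the helper describe are hypothetical names, and the names printed come from whatever Unicode database ships with the Python installation in use.

```python
import unicodedata

# Decimal values taken from the list and the "not shown" note above.
CODE_POINTS = [8196, 8197, 8198, 8199, 8200, 8202, 8203, 8239,
               8204, 8205, 8288, 65279]

def describe(code_point: int) -> str:
    """Return the numeric entity, the U+ notation and the Unicode name."""
    char = chr(code_point)
    return f"&#{code_point};  U+{code_point:04X}  {unicodedata.name(char)}"

for cp in CODE_POINTS:
    print(describe(cp))
```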

The presence of question marks (?) in place of centered spaces in the list above indicates that your browser cannot handle Unicode points of the specified values.  Garbling of part of the line involving one of the centering angle brackets indicates that your browser has an incorrect

All kinds of spaces compared

The spaces described above are shown here in the order of their Unicode point number.  Those which have HTML named entities are shown not only as above but also as the equivalent HTML numeric entities, to check whether the browser treats the two forms as representing the same character.

normal space > < normal space
Space (&#32;) > < Space (&#32;)
No-Break Space > < No-Break Space
No-Break Space (&#160;) > < No-Break Space (&#160;)
En Space > < En Space
En Space (&#8194;) > < En Space (&#8194;)
Em Space > < Em Space
Em Space (&#8195;) > < Em Space (&#8195;)
Three-Per-Em (thick) Space (&#8196;) > < Three-Per-Em (thick) Space (&#8196;)
Four-Per-Em (mid) Space (&#8197;) > < Four-Per-Em (mid) Space (&#8197;)
Six-Per-Em Space (&#8198;) > < Six-Per-Em Space (&#8198;)
Figure Space (&#8199;) > < Figure Space (&#8199;)
Punctuation Space (&#8200;) > < Punctuation Space (&#8200;)
Thin Space > < Thin Space
Thin Space (&#8201;) > < Thin Space (&#8201;)
Hair Space (&#8202;) > < Hair Space (&#8202;)
Zero Width Space (&#8203;) >​< Zero Width Space (&#8203;)
Narrow No-Break Space (&#8239;) > < Narrow No-Break Space (&#8239;)
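
Whether the named and numeric forms denote the same character can also be checked outside the browser.  The sketch below uses html.unescape from the Python standard library; the table NAMED_TO_NUMERIC and the function same_character are hypothetical names, and the pairs are the ones listed above.  It says nothing about how a given browser renders either form.

```python
import html

# Named entities paired with the numeric forms shown in the list above.
NAMED_TO_NUMERIC = {
    "&nbsp;": "&#160;",
    "&ensp;": "&#8194;",
    "&emsp;": "&#8195;",
    "&thinsp;": "&#8201;",
}

def same_character(named: str, numeric: str) -> bool:
    """True if both entity forms decode to the same character."""
    return html.unescape(named) == html.unescape(numeric)

for named, numeric in NAMED_TO_NUMERIC.items():
    print(named, numeric, same_character(named, numeric))
```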

Browser notes

When this page was developed, it was tested on two Web browsers, Netscape Navigator v.4 (NN) and Internet Explorer v.5 (IE).  The default font in both was 12 point Times, a standard serif design.  Font-related effects common to both browsers are described above.  Browser-based differences were observed as follows:

The only HTML space entity which NN 4 recognizes is nbsp.  IE 5 recognizes all four of the HTML space entities.

In the comparison of one, two and three non-breaking spaces, the pixel widths of those spaces are 4, 7 and 10, respectively, in NN 4, but 7, 13 and 19 in IE 5.  Each measurement includes one pixel of inter-letter spacing, so each additional space adds three pixels in NN 4 and six pixels in IE 5.  This leads to the conclusions that (1) for NN, the actual widths of both the non-breaking space and the normal space are three pixels; (2) for IE, the actual width of the non-breaking space is six pixels, while the width of the normal space is three pixels.  So in the default font where this page was developed, the normal space width is half of an en space width.
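
The arithmetic behind these conclusions can be written out as a short Python sketch.  It assumes, as an illustration only, that each measured gap equals the space width times the number of spaces plus a constant offset (the extra pixel of inter-letter spacing); the function name space_width_and_offset is hypothetical.

```python
def space_width_and_offset(measured):
    """Fit gap widths measured for 1, 2, 3... spaces to width * n + offset."""
    steps = [b - a for a, b in zip(measured, measured[1:])]
    if len(set(steps)) != 1:
        raise ValueError("gap widths do not grow by a constant step")
    width = steps[0]               # pixels added by each extra space
    offset = measured[0] - width   # pixels left over once one space is removed
    return width, offset

print(space_width_and_offset([4, 7, 10]))   # NN 4 -> (3, 1)
print(space_width_and_offset([7, 13, 19]))  # IE 5 -> (6, 1)
```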

NN 4 displays the non-breaking space at the same width as a normal space (3+1 pixels).  IE 5 displays it at proper en-space width (6+1 pixels), but displays the en, em and thin spaces at the same width as a normal space.

NN 4 displays all HTML numeric entities as question marks.  IE 5 displays the first five HTML numeric entities as ordinary (single) spaces, displays the next two HTML numeric entities as question marks, and garbles the last one in a way which destroys four characters and throws the intended centering out of alignment.

NN 4 has the most limited capabilities with regard to additional space definitions, but it does correctly present the normal space and non-breaking space with the same width.  IE 5 has more extensive (though not complete or accurate) capabilities with regard to additional space definitions, but it incorrectly presents the normal space and non-breaking space with different widths.