The Moronicities of Typography: Hyphen, Dash, Quotation Marks, Apostrophe

By Xah Lee.

This article discusses some issues in typography, especially those related to the dash and quotation marks.

I've had some interest in typography since the early 1990s, the Mac's desktop publishing era. Basically, i avidly read books about fontography in libraries and Mac magazines such as MacUser and Macworld, and played with fonts and math typesetting in software such as Microsoft Word and Mathematica, including reading Knuth's book on typography and using his TeX system, and reading about font technology such as TrueType. So, i am generally acquainted with the concepts and issues of typography, though i have never worked in any professional area related to it.

I'll have to say, the entire typographical effort and establishment is largely a waste of time, in the same sense that some “artistic” circles chalk up photography as high art, or that grammarians and pedants have voluminous and vociferous writing style guides and guilds.

Some of the most fartful things the typography-sensitive crowd discuss or distinguish are: hyphen, en-dash, em-dash, ligature, kerning, font “design”.

In general, the function of typography is mainly about issues in printing with respect to the facilitation of reading. So, the major issues involved are: line length, line spacing, serif and sans serif fonts, margin, font sizes, and that is pretty much it. But since how things are rendered on paper does create differences in esthetics, sometimes rather pronounced ones, typography does indeed have some esthetic elements. However, this is blown out of proportion to stupendous profundity.

Hyphen and Dash

Look at these guilded morons go:

Traditionally an em dash—like so—or spaced em dash -- like so -- has been used for a dash in running text. The Elements of Typographic Style recommends the more concise spaced en dash – like so – and argues that the length and visual magnitude of an em dash “belongs to the padded and corseted aesthetic of Victorian typography”. The spaced en dash is also the house style for certain major publishers (Penguin, Cambridge University Press, and Routledge among them). However, some longstanding typographical guides such as The Chicago Manual of Style still recommend unspaced em dashes for this purpose. The Oxford Guide to Style (2002, section 5.10.10) acknowledges that …

The above is from the Wikipedia article “Dash”.

Here's my own rule regarding the use of dash: There are 2 kinds: the short dash and the long dash.

For the short one, press the “-” key on your keyboard. For the long one -- as a punctuation mark for embedded thought -- press it twice. That's it. Simple and functional. And, always include a space around them. (Personally, in my writings published on my site, i replace the double dash by an em-dash “—” only because it is prettier, but i don't consider it important.)

The character “-” you type on your keyboard is ASCII 45. (ASCII = American Standard Code for Information Interchange) [see ASCII Characters] The character is named “hyphen” in the ASCII standard, but is called “hyphen-minus” by Unicode. (This is because Unicode now has proper symbols for hyphen, figure-dash, en-dash, em-dash, (math) minus, and quite a few others. Each differs slightly in position, thickness, and length.)
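To see the distinction concretely, here is a minimal Python sketch that prints the official Unicode name of each dash-like character mentioned above. The list `DASHES` and the helper `describe` are names made up for this illustration; only the standard `unicodedata` module is used.

```python
import unicodedata

# Code points of the dash-like characters discussed above
# (this selection is an illustrative assumption, not an exhaustive list).
DASHES = [0x002D, 0x2010, 0x2012, 0x2013, 0x2014, 0x2212]

def describe(code_point: int) -> str:
    """Format a code point as 'U+XXXX <char> <UNICODE NAME>'."""
    ch = chr(code_point)
    return f"U+{code_point:04X} {ch} {unicodedata.name(ch)}"

for cp in DASHES:
    print(describe(cp))
```

The first entry is the key on your keyboard (decimal 45); the rest can only be reached thru an input method, a word processor's auto-replace, or copy and paste.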

As to the typographer's senses and sensibilities about how figure-dash should be used for numbers and en-dash for ranges and em-dash for punctuation and hyphen for word-breaking … etc, i regard them pretty much all as trifles produced by morons whose brains are inadequate to sense or tackle the depth of logic and mathematics of languages and structures, but who fell into a niche of diddling and went on to procure their efforts to heighten themselves among human animals.

Hyphen and Narrow Columns

For hyphen, as in “breaking a word for words near the margin”, my general advice is to abolish such practice. But what to do in a narrow column of text? My general advice is to abolish the practice of layout using very narrow columns.

Justification

A related concept here is typographical justification. My general advice here is to abolish the practice of justification entirely. (Leave the text jagged at one end; it is actually esthetically superior, and factually functionally superior with regard to reading-facilitation.)

Ligature

The typographic conventions of ligatures (as in adjoining certain letter combinations such as “fi” as a single glyph) should also be abolished.

Quotation Marks

Related here is the quotation mark. If you read the summary table in Wikipedia's article on Quotation mark, you'll see that there are huge variations. Here are the characters used for quotation.

glyph    Unicode name    common name
“ ”    LEFT/RIGHT DOUBLE QUOTATION MARK    curly double quote
‘ ’    LEFT/RIGHT SINGLE QUOTATION MARK    curly single quote
« »    LEFT/RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK    French double quote
‹ ›    SINGLE LEFT/RIGHT-POINTING ANGLE QUOTATION MARK
『 』    LEFT/RIGHT WHITE CORNER BRACKET    Chinese double quote
「 」    LEFT/RIGHT CORNER BRACKET
《 》    LEFT/RIGHT DOUBLE ANGLE BRACKET    Chinese title bracket
〈 〉    LEFT/RIGHT ANGLE BRACKET
〖 〗    LEFT/RIGHT WHITE LENTICULAR BRACKET    Chinese brackets
【 】    LEFT/RIGHT BLACK LENTICULAR BRACKET
‟    DOUBLE HIGH-REVERSED-9 QUOTATION MARK
„    DOUBLE LOW-9 QUOTATION MARK
‚    SINGLE LOW-9 QUOTATION MARK

Here's a list of conventions of using the double curly quotes:

Ain't it bizarre?

For some languages, such as Chinese, it is rational that they developed symbols (e.g. 『』「」《》〈〉【】〖〗) that are different from the European languages' curly quotation marks. However, among European langs, there is extreme diversity in using the curly quotation marks. Even the Americans and the English reverse the purpose of the single and double quotes. Some langs reverse the semantics of the left/right pair, some position the mark at the bottom instead of the top, some place them in opposite corners (as opposed to both on top), and some use the same symbol for both the opening and closing marker.

One interesting thing about the curly double quotation mark pair is that the two symbols are not bilaterally symmetric, but rotationally symmetric. That is, if you rotate the left one 180 degrees, you get the right one. Most other matching pair symbols ()[]{} are bilaterally symmetric (i.e. each symbol mirrors itself across a horizontal line, and the left/right symbols are mirror reflections of each other across a vertical line). The fact that the curly quotes have only rotational symmetry must have contributed significantly to the weird diversity in their roles as opening/closing mark, and in whether they are positioned level on a line or at opposite corners. (Note that the Chinese corner brackets 『』「」 also lack bilateral symmetry; however, their box-corner shape intuitively and uniquely defines their placement.)

Combinatorial Possibilities

This glyph “ (U+201C) points upper-right. This glyph can be mirrored in a vertical line or horizontal line to create the matching variation, a total of 4 possibilities (think of p q b d).

Here are the different pointing curly quotes from Unicode: “ ” „ ‟.

In Unicode, i couldn't find one that is pointing to upper-left. This is somewhat curious. (If you look at the Wikipedia article on quotation conventions, you see that actually no language use such a char.)

I created one as an image just for illustration: [image: double curly quote pointing upper-left]

The quotation mark can be placed on the upper line of the text (as in USA convention) or lower line (as in the beginning quotation mark in German convention), a total of 2 possibilities.

So, 4 choices of glyph orientation, 2 possible positions, that's 8 possibilities for the opening quote. Same for the closing quote. So, the total number of styles to use the quotation punctuation with double curly quote is 8×8=64.
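The count above can be checked by brute-force enumeration. This is just a sketch: the orientation and position labels are made up for illustration, and the only thing that matters is that there are 4 of one and 2 of the other.

```python
from itertools import product

# Hypothetical labels for the 4 mirror variants of the glyph (think p q b d)
# and the 2 vertical placements on the line of text.
ORIENTATIONS = ["upper-right", "upper-left", "lower-right", "lower-left"]
POSITIONS = ["top", "bottom"]

# Every possible single mark: one orientation combined with one position.
marks = list(product(ORIENTATIONS, POSITIONS))

# Every possible quoting style: one mark to open, one mark to close.
styles = list(product(marks, marks))

print(len(marks))   # 8
print(len(styles))  # 64
```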

It is a good thing that this hasn't been exploited.

How it should be

The function of quotation marks is to demarcate text, and as delimiters, they should be a matching pair such as ( ) [ ] { }, and the pair should have no more than a bilateral symmetry to reflect the natural one-dimensional left/right flow of written text (or, up/down in Asian langs).

If we could rewrite convention, i'd say we all just use simple left/right pairs such as ()[]{}. Since these already have a purpose, we could use ‹›«»〈〉《》【】〖〗. The French quotation marks « » ‹ › are actually the most sensible here among western langs. (Though, other countries using French quotation marks also reverse direction or use the same glyph for both opening and ending. This is idiocy gone berserk.)

But since we cannot restart history, nor do we want to break convention radically because we'd create confusion, what i do today in writings published on my website is to use the most ubiquitous convention, the American convention “like this”. (I experimented with using the French convention of «like this», but that turned out to be too in-your-face for English readers.)

Problem of Non-matching Straight Quotation Marks

It is unfortunate that, thru the historical development of the typewriter, the computer keyboard, and ASCII, our keyboard doesn't have the proper matching curly quotes, but instead has the straight quotes. Here are the symbols and their Unicode names:

"    QUOTATION MARK (U+0022)
'    APOSTROPHE (U+0027)

This creates a problem because it forces us to use the same symbol for a purpose that naturally calls for a matching pair. Using a single symbol is harder to read. Further, it causes global damage when one is missing (e.g. caused by typo or transmission error).
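The “global damage” can be shown with a small sketch. Both functions below are hypothetical helpers written for this illustration: the first pairs straight quotes by simple alternation (the only rule available when opener and closer are the same symbol), the second pairs curly quotes by their shape.

```python
def quoted_spans_straight(text: str) -> list[str]:
    """Pair straight double quotes by alternation: 1st opens, 2nd closes, ..."""
    parts = text.split('"')
    # Odd-indexed pieces are the ones that follow an opening quote.
    return parts[1::2]

def quoted_spans_curly(text: str) -> list[str]:
    """Collect the text between each opening curly quote and the next closing one.

    A stray closing quote with no opener before it is simply ignored.
    """
    spans = []
    start = None
    for i, ch in enumerate(text):
        if ch == "“":
            start = i + 1
        elif ch == "”" and start is not None:
            spans.append(text[start:i])
            start = None
    return spans

# Intact text: both conventions agree.
print(quoted_spans_straight('say "yes" or "no" or "maybe" now'))
print(quoted_spans_curly("say “yes” or “no” or “maybe” now"))

# Drop the first opening quote (a typo or transmission error).
print(quoted_spans_straight('say yes" or "no" or "maybe" now'))
print(quoted_spans_curly("say yes” or “no” or “maybe” now"))
```

With the straight quotes, losing one mark flips the inside/outside sense of everything after it, so every later “quotation” is wrong. With the matching pair, only the damaged quotation is lost and the rest are still recovered.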

It would've been better if the typewriter had been designed with a matching single curly quote. That is, have the keys ‘ ’ instead of ' ". This way, we get a matching pair, and we can also emulate the double curly quote by repeating the single one.

Because of the ambiguity problem of straight quotes, much technical writing in software follows a convention of using the backtick ` for the opening mark and the straight quote ' for the closing mark. (`like this' and ``repeated for double'') I think this convention was started or popularized by the TeX typesetting system, because that's the markup used to typeset curly quotes.

In particular, `this style' is adopted by the Free Software Foundation in their GNU Project.
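Since the backtick convention is an unambiguous opening/closing markup, turning it into real curly quotes is a mechanical substitution. Here is a minimal sketch; the function name `tex_quotes_to_curly` is made up for this illustration, and it is not how TeX itself is implemented.

```python
def tex_quotes_to_curly(text: str) -> str:
    """Convert `single' and ``double'' ASCII quoting to curly quotes."""
    # Handle the doubled forms first, so that `` is not read as two
    # separate single opening quotes.
    text = text.replace("``", "“").replace("''", "”")
    # Whatever backticks and straight quotes remain are single quotes.
    text = text.replace("`", "‘").replace("'", "’")
    return text

print(tex_quotes_to_curly("`like this' and ``repeated for double''"))
# ‘like this’ and “repeated for double”
```

Note the simplifying assumption: every remaining ' is treated as a closing quote, so an apostrophe as in «I've» also comes out curly.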

Although this workaround solves a syntactical ambiguity problem, i think it is rather unnecessary and ugly. For a workaround with the constraint of ASCII for a matching quote, i would have adopted something more symmetric such as ('this') maybe, or {'this'}, or -'this'-. But the problem with GNU is that even today, in 2007, when curly quotes have been widely available in word processors for over a decade (and Unicode has been practical and widely available for at least 5 years [see Unicode Popularity on Web by Google]), they are still using plain ASCII hacks. (In general, GNU and the Open Source morons have like a 5 to 10 year lag in adopting technology, for reasons that are inadvertently intentional and/or simple incapability.)

The Moronicity of Omission of Ending Quotation Symbol in Long Paragraphs

There is a very stupid convention used in novel printing. In novels, often a whole page consists of dialogs. Each paragraph is part of a dialog. Logically, each paragraph then should be enclosed in matching quotes, and if there is a series of such paragraphs, each and every one should be enclosed in matching quotes.

However, this is not done because it is considered repetitious. The typographic convention is to not use any ending quote if the quoted text is of paragraph length. So, we'd have a series of paragraphs that all start with an opening curly quote, but are never closed.

This is another moronicity of the typographers. Such irregular tampering starts to show its problems in the computing era. Generally speaking, it makes it difficult to process the text and creates ambiguities, both for human and for machine.
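As a sketch of the machine-processing problem: a simple per-paragraph sanity check, of the kind a proofreading script might run, flags every such dialog paragraph as an error. The function name and the sample text are made up for this illustration, and paragraphs are assumed to be separated by blank lines.

```python
def unbalanced_paragraphs(text: str) -> list[int]:
    """Return the indexes of paragraphs whose opening and closing
    curly double quote counts differ."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [
        i for i, p in enumerate(paragraphs)
        if p.count("“") != p.count("”")
    ]

# Hypothetical novel excerpt using the omitted-closing-quote convention.
novel = "“I went to town.\n\n“Then i came back.”\n\nShe nodded."

print(unbalanced_paragraphs(novel))  # [0]
```

A program cannot tell from the paragraph alone whether the missing closing quote is the convention at work or a genuine typo; it has to look ahead at the next paragraph and guess.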

Apostrophe in English: Curly vs Straight

Another moronicity of our topic is the choice of glyph for the apostrophe as a punctuation in English writing. For example «I've», «O'clock», «Mary's». This is a rather big subject to tackle, dragging in the bag of grammarians and stylists and their guilds and guides and rules and exceptions about the position of the apostrophe and possible omission of the “s” (e.g. «James'»), but i'll just focus on the typographical aspect of whether to use the 'straight' quote or the ‘curly’ one in «I've», «O'clock», «Mary's».

Normally, people use the straight version because the curly one isn't available on the keyboard. However, the curly version is used in word processors and in printing.

We should not use the curly version for the apostrophe, because the single curly quote already has a logical and conventional meaning: it is used as a matching pair for quotes.

Using the same character for both apostrophe and closing quote confounds the meaning and increases the cost of computation (e.g. «… ‘O’Clock’ …»). Also, the semantics of the apostrophe as a punctuation symbol in no way calls for a curly glyph.
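The «‘O’Clock’» case can be made concrete with a small sketch. The helper below is hypothetical; it uses the obvious rule of pairing the first opening single quote with the next closing one.

```python
def first_single_quoted(text: str):
    """Return the text between the first ‘ and the next ’, or None."""
    start = text.find("‘")
    if start == -1:
        return None
    end = text.find("’", start + 1)
    if end == -1:
        return None
    return text[start + 1:end]

# Curly apostrophe: the apostrophe is mistaken for the closing quote.
print(first_single_quoted("… ‘O’Clock’ …"))   # O

# Straight apostrophe: the quoted word is recovered whole.
print(first_single_quoted("… ‘O'Clock’ …"))   # O'Clock
```

To get the first case right, a program would need extra knowledge about English words, which is exactly the added cost of computation.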

What people really want is a slanted glyph, like this: ′. We want a slanted apostrophe because that's how we write it by hand. We write it by hand slanted, because that's easier, because most people are right handed, and a vertically straight one is too easy to be confused with I or 1.

However, the proper slanted version of the apostrophe, the Unicode char named “Prime” (′, U+2032), is not popularly available, while most word processors today do have the ‘curly single quote’. This is why, in print or on-screen, the curly one became the convention for the apostrophe.

The gist of this is that if we want to demarcate text, the symbol used should be a matching pair to indicate the opening and closing position. If the purpose of a punctuation symbol does not require a matching pair such as for opening/closing, then we should not be using a matching pair. Apostrophe is a punctuation that does not call for a matching pair. Further, preferably, each symbol should not be used for more than one purpose.