Xah Lee, 2007-10
[this article discusses some issues in typography, especially those related to the dash and quotation marks]
Today's Wikipedia readings... some excursion into typography:
I've had some interest in typography since the early 1990s, the era of Desktop publishing on the Mac. Basically, i avidly scanned books about fontography in libraries and Mac magazines such as MacUser and Macworld, and played with fonts and math typesetting in software such as Microsoft Word and Mathematica, including reading Knuth's book on typography and using his TeX system, and reading about font technology such as TrueType. So, i am generally acquainted with the concepts and issues of typography, though i have never worked in any professional area related to it.
I'll have to say, the entire typographical effort and establishment is largely a waste of time, in the same sense that some “artistic” circles chalk up photography as high art, or that grammarians and pedants have voluminous and vociferous writing style guides and guilds.
Some of the most fartful things the typography-sensitive crowd discuss or distinguish are: hyphen, en-dash, em-dash, ligature, kerning, font “design”.
In general, the function of typography is mainly about issues in printing with respect to the facilitation of reading. So, the major issues involved are: line length, line spacing, serif and sans serif fonts, margin, font sizes, and that is pretty much it. But since how things are rendered on paper does create differences in esthetics, sometimes rather pronounced ones, typography does indeed have some esthetic elements. However, this is blown out of proportion to stupendous profundity.
Look at these guilded morons go:
«Traditionally an em dash—like so—or spaced em dash — like so — has been used for a dash in running text. The Elements of Typographic Style recommends the more concise spaced en dash – like so – and argues that the length and visual magnitude of an em dash "belongs to the padded and corseted aesthetic of Victorian typography". The spaced en dash is also the house style for certain major publishers (Penguin, Cambridge University Press, and Routledge among them). However, some longstanding typographical guides such as The Chicago Manual of Style still recommend unspaced em dashes for this purpose. The Oxford Guide to Style (2002, section 5.10.10) acknowledges that ...» —excerpt from Wikipedia Dash
Here's my own rule regarding the use of dash: There are 2 kinds: the short dash and the long dash. For the short one, press the “-” key on your keyboard. For the long one — as a punctuation mark for embedded thought — press it twice. That's it. Simple and functional. (personally, in my writings published on my site, i replace the double dash by an em-dash “—” only because it is prettier, but i don't consider it important)
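The replacement of a typed double dash by an em-dash can be sketched in a few lines of Python. This is only an illustration of the rule above, not the actual process used for the site; the function name `double_hyphen_to_em_dash` is made up for this example.

```python
def double_hyphen_to_em_dash(text):
    """Replace each typed double dash with one em-dash (U+2014).

    Hypothetical helper for illustration. Single hyphens are left alone.
    """
    return text.replace("--", "\u2014")


print(double_hyphen_to_em_dash("a long one -- like so -- and a well-known short one"))
```

A run of three hyphens would become an em-dash followed by a hyphen, so text that uses `---` for other purposes would need extra handling.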
The character “-” you type on your keyboard is ASCII 45. The character is named “hyphen” in ASCII, and is renamed “hyphen-minus” in Unicode (because unicode now has proper code points for hyphen, figure-dash, en-dash, em-dash, (math) minus, and quite a few others).
As to the typographer's senses and sensibilities about how figure-dash should be used for numbers and en-dash for ranges and em-dash for punctuation and hyphen for word-breaking ... etc, i regard them pretty much all as trifles produced by morons whose brains are inadequate to sense or tackle the depth of logic and mathematics of languages and structures, but who fell into a niche of diddling and went on to use their efforts to heighten themselves among human animals.
As for hyphen, as in “breaking a word for words near the margin”, my general advice is to abolish such practice. But what to do in a narrow column of text? My general advice is to abolish the practice of layout using very narrow columns. A related concept here is typographical Justification. My general advice here is to abolish the practice of justification entirely. (leave it jagged at one end; it is actually esthetically superior, and factually functionally superior with regard to reading-facilitation)
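The code points mentioned above can be checked with Python's standard `unicodedata` module. The sketch below looks up the official names of the dash-like characters; the list `DASHES` and the helper `describe` are names chosen for this example only.

```python
import unicodedata

# The dash-like characters named in the text, written as escapes so that
# they stay distinguishable regardless of the font used to view this code.
DASHES = ["\u002D", "\u2010", "\u2012", "\u2013", "\u2014", "\u2212"]


def describe(ch):
    """Return the code point (as U+XXXX) and the Unicode name of one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for ch in DASHES:
    print(*describe(ch))
```

The first entry is the key on the keyboard: `ord("-")` is 45, and its Unicode name is HYPHEN-MINUS.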
The typographic conventions of ligatures (as in adjoining certain letter combinations such as “fi”) should also be abolished.
Related here is the quotation mark. If you read Wikipedia, you'll see that there are huge variations:
Here's some sample characters used for quotations and their unicode names.
Here's a list of conventions of using the double curly quotes:
Quite bizarre.
For some languages, such as Chinese, it is rational how it developed into using symbols (such as 〖〗『』《》) that are different from European languages' curly quotation marks. However, among european langs, there is extreme diversity in using the curly quotation marks. Even the Americans and the English reverse the purpose of the single and double quotes. Some langs reverse the semantics of the left/right pair, some position the mark at the baseline of the font instead of the top, some place them in opposite corners (as opposed to both at the top), and some use the same symbol on both sides of the quoted text.
One interesting thing about the curly double quotation mark pair is that the two symbols are not bilaterally symmetric, but rotationally symmetric. That is, if you rotate the left one 180 degrees, you get the right one. Most other matching pairs ()[]{}«»〖〗《》 are bilaterally symmetric. The fact that the curly quotes are rotationally symmetric and not bilaterally symmetric must have contributed significantly to the weird diversity in their roles as opening or closing mark, and in whether they are positioned level or in facing corners. (Note: the same can be said of the single curly quotes ‘’. Note also that the Chinese ones 「」『』 are 2-fold rotationally symmetric only; however, their box-corner shape intuitively and uniquely defines their placement.)
The left quotation mark's sharp point points to the upper right. The glyph can be mirrored about a vertical axis and/or a horizontal axis to create the matching variations, a total of 4 possibilities (think of p q b d).
In unicode, i couldn't find one that points to the upper left.
This is somewhat curious.
I created one with image here just for the illustration:
The mark can be placed on the upper baseline of the text (as in English convention) or lower baseline (as in the beginning quotation mark in German convention), a total of 2 possibilities.
Given the 4 orientations of a quotation glyph, any one can be chosen as the opening mark, and any one can be chosen as the closing mark. So, here the assigned semantics has 4x4 = 16 possibilities.
All things considered, the opening mark can have 4 orientations and 2 positions, a total of 8 possibilities, and another 8 for the closing one, so the total possible quotation mark placement using the double curly quote glyph is 8x8=64.
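The counting above can be verified by brute-force enumeration. In this sketch the orientation and position labels are placeholders invented for the example; only the counts matter.

```python
from itertools import product

# Placeholder labels: 4 orientations of the glyph, 2 vertical positions.
ORIENTATIONS = ["up-right", "up-left", "down-right", "down-left"]
POSITIONS = ["upper", "lower"]

# One mark is a choice of orientation and position.
marks = list(product(ORIENTATIONS, POSITIONS))

# Opening/closing assignments when only orientation varies.
orientation_pairs = list(product(ORIENTATIONS, repeat=2))

# Opening/closing assignments when orientation and position both vary.
full_pairs = list(product(marks, repeat=2))

print(len(marks), len(orientation_pairs), len(full_pairs))  # 8 16 64
```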
It is a good thing that this hasn't been exploited.
The function of quotation marks is to demarcate text, and as delimiters, they should be a matching pair such as ()[]{}, and should have no more than a bilateral symmetry, to reflect the natural one-dimensional (left and right) flow of written text (or, up/down in Asian langs).
If we could rewrite convention or restart history, i'd say we should all just use simple left/right pairs such as ()[]{}<>. Since these already have a purpose, we could use ‹›«»〈〉《》. But since we cannot restart history, nor do we want to break convention radically because we'd create confusion, what i do today in writings published on my website is to use the most ubiquitous convention “” and ‘’, and also «» and ‹› on occasion. (for example, when nesting is more than 2 levels deep, or in situations where starting with «» makes it clearer: e.g. The single curly quotes are these: «‘» and «’».)
It is unfortunate, thru the historical development of the typewriter and the computer keyboard and ascii, that our keyboards don't have the proper matching curly quotes, but instead have the straight quotes. Here's the symbols and their given unicode name:
This creates a problem because it forces us to use the same symbol for a purpose that naturally calls for 2 matching symbols that act as a bracket. For example, when a "text" or "code" used in "computers" has "lots" of quotation marks, it's harder to tell which part is quoted. Further, a missing quote (as a typo) causes non-local damage.
It would have been better if the typewriter had been designed with a matching pair of single curly quotes ‘’. This way, the matching problem is solved and double quotes can be created by typing twice.
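The names of the two straight quotes on the keyboard, next to the curly pairs they stand in for, can be looked up with Python's `unicodedata` module. A minimal sketch; the variable names are made up for this example.

```python
import unicodedata

# Straight quotes from the keyboard, then the curly pairs.
QUOTES = ['"', "'", "\u201c", "\u201d", "\u2018", "\u2019"]

# Map each character to (decimal code point, Unicode name).
quote_names = {ch: (ord(ch), unicodedata.name(ch)) for ch in QUOTES}

for ch, (code, name) in quote_names.items():
    print(code, name)
```

The two straight ones come out as 34 QUOTATION MARK and 39 APOSTROPHE. Neither name says left or right, which is the problem described here.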
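The non-local damage can be shown with a toy extractor. This is a deliberately naive sketch, not how any particular parser works: with a single symbol, the only way to tell inside from outside is to count, so one missing quote flips every span after it. With a distinct opening and closing symbol, the damage stays local. The function names and sample strings are invented for the example.

```python
import re


def straight_quoted(text):
    """Split on the straight quote; pieces at odd positions count as quoted."""
    return text.split('"')[1::2]


def curly_quoted(text):
    """Find spans enclosed by a left and a right curly double quote."""
    return re.findall("\u201c([^\u201c\u201d]*)\u201d", text)


good = 'say "a" then "b" then "c"'
bad = 'say a" then "b" then "c"'  # the first opening quote is missing

print(straight_quoted(good))  # ['a', 'b', 'c']
print(straight_quoted(bad))   # [' then ', ' then ', '']

bad_curly = "say a\u201d then \u201cb\u201d then \u201cc\u201d"
print(curly_quoted(bad_curly))  # ['b', 'c']
```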
A lot of documents in the computing world stick with a convention of using the back tick (ascii 96) ` for the left single curly quote and the ascii 39 ' for the right single curly quote, and repeating them for the double version. So, it's like ``this'' and `this'. In particular, this style is used by the Free Software Foundation in their GNU Project.
Although this workaround solves a semantic problem in a technical writing context, i think it is rather unnecessary and ugly. For a workaround with the constraint of ascii for a matching quote, i would have adopted something more symmetric such as ('this') maybe, or simply just {this}. But the problem with GNU is that even today, in 2007, when curly quotes have been widely available in word processors for over a decade (and unicode has been practical and widely available for at least 5 years), they are still using plain ascii hacks. (in general, GNU and the “OpenSource” morons have something like a 5 to 10 year lag in adopting technology, for reasons that are either inadvertently intentional or simply incapability)
There is a very stupid convention used in novel printing. Often, a long paragraph is entirely a character's dialogue. So, logically, the whole paragraph would be enclosed in matching quotes, and if there is a series of such paragraphs, each and every one should be enclosed in matching quotes. However, this is not done because it is considered repetitious. The typographic convention is to not use any ending quote if the quoted text is long. So, we'd have a series of paragraphs that all start with an opening quote, but are never closed.
This is another moronicity of the typographers. Such irregular tampering starts to show its problems in the computing era. Generally speaking, it makes the text difficult to process and creates ambiguities, both for humans and for machines.
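Converting this ascii convention to real curly quotes is mechanical, which is part of why the hack looks unnecessary. Below is a rough sketch using Python's `re` module. The function name `gnu_to_curly` is made up for the example, and the single-quote rule is a heuristic that would end a match early at an apostrophe inside the quoted text.

```python
import re


def gnu_to_curly(text):
    """Convert ``double'' and `single' ascii quoting to curly quotes."""
    # Double form first, so that ``x'' is not read as two nested single pairs.
    text = re.sub(r"``(.+?)''", "\u201c" + r"\1" + "\u201d", text)
    text = re.sub(r"`(.+?)'", "\u2018" + r"\1" + "\u2019", text)
    return text


print(gnu_to_curly("So, it's like ``this'' and `this'."))
```

The apostrophe in "it's" is left alone because no back tick precedes it.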
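A small sketch of what this means for a machine: a per-paragraph balance check, about the simplest validation one could write, flags every paragraph of such a dialogue. The function name and the sample paragraphs are invented for this illustration.

```python
OPEN_Q = "\u201c"   # left curly double quote
CLOSE_Q = "\u201d"  # right curly double quote


def unbalanced_paragraphs(paragraphs):
    """Return indexes of paragraphs whose opening and closing quote counts differ."""
    return [
        i
        for i, p in enumerate(paragraphs)
        if p.count(OPEN_Q) != p.count(CLOSE_Q)
    ]


sample = [
    OPEN_Q + "First paragraph of a long speech.",
    OPEN_Q + "Second paragraph of the same speech.",
    OPEN_Q + "A short reply." + CLOSE_Q,
]

print(unbalanced_paragraphs(sample))  # [0, 1]
```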
Another moronity in our subject, is the use of apostrophe as a punctuation in English writing. For example «I'd», «he's», «james'». This is a rather big subject to tackle dragging in the bag of grammarians and stylists and their guilds and guides and rules and exceptions, but i'll just focus on the typographical aspect of whether to use the straight single quote or the curly one «I’d», «he’s», «james’».
Typically, the issue is that people were using the straight version because the curly one wasn't available. However, in my opinion, we should not use the curly version (right single quotation mark, unicode U+2019) for the apostrophe. The reason is that we should not use a symbol from a matching pair for a purpose that does not call for a matching pair. (the curly version is the right half of the matching pair used for nested quoting) Such use creates ambiguity, e.g.: «“i said: ‘he’s’.”».
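The ambiguity in that example can be made concrete with a simple stack-based matcher. This is only a sketch of how a naive program would read the text; the names are invented for the example. The matcher takes the apostrophe in «he’s» as the closing single quote, and then trips over the real closing quote.

```python
PAIRS = {"\u201c": "\u201d", "\u2018": "\u2019"}  # opening quote -> closing quote
CLOSERS = set(PAIRS.values())


def stray_closers(text):
    """Return positions of closing quotes that do not match the innermost open quote."""
    stack, stray = [], []
    for i, ch in enumerate(text):
        if ch in PAIRS:
            stack.append(ch)
        elif ch in CLOSERS:
            if stack and PAIRS[stack[-1]] == ch:
                stack.pop()
            else:
                stray.append(i)
    return stray


curly_apostrophe = "\u201ci said: \u2018he\u2019s\u2019.\u201d"
straight_apostrophe = "\u201ci said: \u2018he's\u2019.\u201d"

print(stray_closers(curly_apostrophe))     # [14]
print(stray_closers(straight_apostrophe))  # []
```

Position 14 is the intended closing single quote, not the apostrophe, so the error is reported at the wrong place.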
I do not mean that we should forbid the use of any symbol that lacks bilateral symmetry for purposes that do not need a matching pair.
Page created: 2007-10. © 2007 by Xah Lee.