The 2015 Shady Characters gift guide

It’s December, and that means it’s time for the second annual Shady Characters gift guide! In no particular order, here are a few gifts to consider for the punctuation-phile or language buff in your life.


Cover of Making a Point by David Crystal
Image courtesy of Profile Books.

Last year I focused on mainly non-literary gifts; this year, happily, has seen the publication of a number of new books on punctuation. Here’s the first: David Crystal’s Making a Point: The Pernickety Story of English Punctuation is a combined history and usage guide that explores punctuation in English from medieval monasteries to the internet. I reviewed it for the Wall Street Journal and had a great time in doing so — the first part in particular, in which Crystal takes the reader on a breakneck journey through the history of English punctuation, is a joy to read. More serious than Shady Characters and less judgmental than Eats, Shoots, and Leaves, it’d make a great gift for writers, readers, and teachers.


Recently, I came across an article at CBC.ca about an “Ampersand Distilling Company” on Vancouver Island, British Columbia. Punctuation and booze, and in the Pacific Northwest, no less? I was intrigued.

A bottle of Ampersand gin
“A better gin. Period.” I see what you did there. (Image courtesy of Ampersand Distilling and Centric Photography.)

Ampersand, it turns out, is a family business run by Stephen and Jeremy Schacht, along with Jessica McLeod, which makes gin and vodka from organic ingredients harvested on the Schachts’ five-acre farm. Jessica told me more about their choice of name and emblem:

We chose the ampersand as our name and symbol because I have read that names joined with an ampersand signify a closer collaboration.* As a family business that seemed very appropriate. It’s also about bringing things together – science & art, innovation & tradition, ingredients & techniques. Also, gin being our flagship offering, we thought it worked since gin is very much an ‘&’ spirit. Gin & tonic, gin & vermouth — even just the idea of botanicals & the spirit.

The company’s latest product is called Per Se Vodka — named, of course, for the origins of the ampersand’s name in the expression “and, per se, and”. A distiller after my own heart. Ampersand’s spirits are currently available only in British Columbia, but if you’d like to learn more you can stop by the Ampersand Distilling website. A great gift for drinkers and thinkers — if you’re lucky enough to live in BC, that is.


On to the next book, and it is one that comes with a Not Safe for Work warning: Fucking Apostrophes – A Guide To Show You Where To Stick Them, written by Simon Griffin, was published recently to some acclaim from The Guardian.

Cover of "Fucking Apostrophes"
Delightful! (Image courtesy of Simon Griffin.)

As Simon explains, “I’m a freelance copywriter and based on the insight (slash massive generalisation) that Creative Directors are generally good at swearing and bad at apostrophe usage, this was the result.” And what a lovely result it is!

Here are Simon’s guides to using the apostrophe in plural names:

The same rule [as for possessive apostrophes] applies for family names, which is where many mistakes are made. Where you put the fucking apostrophe depends on how many family members you’re talking about.

Examples:

The Biebers’ behaviour was upsetting = The behaviour of the whole Bieber family was upsetting. (Plural)

Mr Bieber’s behaviour was particularly distressing = The behaviour of Mr Bieber was particularly distressing. (Singular.)

The first print run has already sold out, but Simon tells me that a second run of 400 copies is now available. Get your copy — I mean, a copy for the object of your gifting affections — here while you can.


Books, though — all that text is too small, am I right? And not fleshy enough, either. If you feel the same, then temporary tattoo store Tattly.com has you covered with their “Air Quotes” design, as created by Jonny Gotham.

Tattly’s “Air Quotes”, designed by Jonny Gotham. (Image courtesy of Tattly.com.)

Lastly, after that brief excursion beyond the printed page, it’s back to books. Glyph, written by Adriana Caneva and Shiro Nishimoto of London design studio Off-White, is an easy-to-digest stroll through a whole host of non-alphabetic characters. It’s a slim little thing — each character gets a single page of text accompanied by a page or two of images, and much of the material will be familiar to Shady Characters readers — but it would be perfect as a stocking filler for a budding typophile or lover of punctuation.

Of course, if you aren’t already familiar with the interrobang, ampersand, irony mark et al, I have the perfect solution for you: Shady Characters, the book, is still available in hardback and paperback, or for the e-reader of your choice, from a host of bookshops. Why not buy a copy as a companion to one of the other gifts here?


That’s all from me for 2015 — as ever, thank you all for your comments, emails, tweets and Facebook messages; enjoy the rest of the year, and see you all in 2016!

1.
Writers Guild of America. “FAQs”. Accessed December 9, 2015.

 

*
This is very true: as I learned recently, an ampersand in the credits for a film means that two or more writers co-wrote a script, while the word “and” indicates a looser relationship such as a re-write or subsequent revisions by a separate writer.1 

Miscellany № 67: irony’s restoration

We first met the Right Reverend John Wilkins FRS, renaissance man of the Restoration, back in 2011. A founding member of the Royal Society, brother-in-law to Oliver Cromwell and mad scientist extraordinaire, Wilkins was one of the seventeenth century’s most ardent devotees of what are now called conlangs, or constructed languages, and he expended a considerable amount of time and effort on his magnum opus on the subject, An Essay towards a Real Character and a Philosophical Language.1 His book was published to acclaim in scholarly circles though it very nearly never made it to print at all, as Wilkins himself explained in his introduction:

I have been the longer about it [the writing], partly because it required some considerable time to reduce the Collections I had by me to this purpose, into a tolerable order; and partly because when this work was done in Writing, and the Impression of it well nigh finished, it hapned (amongst many better things) to be burnt in the late dreadfull Fire; by which, all that was Printed (excepting only two Copies) and a great part of the unprinted Original was destroyed: The repairing of which, hath taken up the greatest part of my time ever since.

The “late dreadfull Fire” was, of course, the Great Fire of London that in 1666 had destroyed some 13,000 homes in the City of London, Wilkins’ own vicarage among them.2,3 Once Wilkins had gathered his wits and his notes in the wake of the fire, Essay was finally published two years later.

His readers, by and large, found it to have been worth the wait.

The first and largest part of Wilkins’ ponderous tome was a taxonomy of, well, of everything, a kind of Dewey Decimal System for classifying “all things and notions that fall under discourse”, as Wilkins put it. Following this were the “real character” and “philosophical language” of the title — an alphabet of written symbols and a vocabulary of spoken sounds, respectively, with which readers could communicate the “things and notions” that they succeeded in categorising according to the taxonomy itself.4 The sum total of all this was a finished artificial language: rules for locating things and ideas within a taxonomical framework; a written script to set those concepts down on paper; and a spoken language to give them voice. Wilkins finally united Essay’s three components in a concrete example on the four hundred and twenty-first page of his book, in the form of the Lord’s prayer. Here it is:

The Lord’s prayer, as translated by John Wilkins into his “philosophical language”. (An Essay Towards a Real Character and a Philosophical Language, page 421. Image courtesy of Google Books.)

If you’ve read the Shady Characters book, however, or my first post about irony marks, you’ll already know that there was more to Essay than an elaborate constructed language. Tacked onto the tail end of Wilkins’ description of his “real character” was a clutch of punctuation marks with no less than an irony mark among them, the oldest one I’ve yet found, and one that has echoed down the centuries until today. Until now, I’d only ever known about Wilkins’ mark in an abstract way, via the words or allusions of other writers, but a recent tweet by Coffee & Donatus inspired me to look again at this earliest of irony marks. And with Coffee & Donatus’s help, now, happily, I can bring you Wilkins’ irony mark in the words of the man himself.

Let’s dive right in. The following image is a detail of page 393 from Essay on which Wilkins lists all of the marks he felt were necessary to punctuate texts written using his invented script. In addition to the humdrum comma, colon and period (not shown here), he branched out with a double-decker hyphen, parentheses, “explication” brackets used to elucidate texts, a question mark, an exclamation, or “wonder” mark, and, finally, an inverted exclamation mark serving as an irony mark:

John Wilkins’ “other Notes to distinguish the various manners of Pronunciation”. (An Essay Towards a Real Character and a Philosophical Language, page 393. Image courtesy of Coffee & Donatus.)

Wilkins wrote of his irony mark:

Irony is for the distinction of the meaning and intention of any words, when they are understood by way of Sarcasm or scoff, or in a contrary sense to that which they naturally signifie: And though there be not (for ought I know) any note designed for this in any of the Instituted Languages, yet that is from their deficiency and imperfection: For if the chief force of Ironies do consist in Pronunciation, it will plainly follow, that there ought to be some mark for direction, when [things] are to be so pronounced.

Well put. The next time someone asks me to define irony, I’ll tell them that it is any use of words when they are understood by way of sarcasm or scoff, or in a contrary sense to that which they naturally signify. Here are Wilkins’ usage guidelines in situ on page 356, along with his descriptions of the other marks:

Wilkins’ description of the punctuation marks to be used with his “real characters”. (An Essay Towards a Real Character and a Philosophical Language, page 356. Image courtesy of Coffee & Donatus.)

Unfortunately, despite the scholarly approval that greeted Essay on its publication, Wilkins’ masterwork followed the philosophical language movement in general on an inexorable downward slope. A century after its publication the fashion for constructing languages had largely waned, and Wilkins’ irony mark had been similarly forgotten. And yet, it was not lost. As we saw last year on Shady Characters, a 1792 book entitled A clear and practical system of punctuation by one Joseph Robertson5 recapitulated the case for an irony mark in exactly the same form, even if Robertson did not credit Wilkins for the invention. Then, scarcely more than a decade ago, Josh Greenman of Slate proposed again that ‘¡’ should be used to punctuate ironic statements.6 Wilkins and his book may be long gone, but his irony mark has usefully outlived them both.


Many thanks to Coffee & Donatus for the images in this post. If you enjoyed the images here, take a look back at Miscellany № 64: let’s gnomonise, which features more from Coffee & Donatus.

1.
Wilkins, John. An Essay towards a Real Character, and a Philosophical Language. Printed for S. Gellibrand [etc.], 1668.

 

2.
Wright Henderson, P. The Life and Times of John Wilkins. Edinburgh, London: W. Blackwood and Sons, 1910.

 

3.
Encyclopaedia Britannica. “Great Fire of London”.

 

4.
Lewis, R. “The Publication of John Wilkins’s Essay (1668): Some Contextual Considerations”. Notes and Records of the Royal Society of London 56, no. 2 (2002): 133-146.

 

5.
Robertson, J. A Clear and Practical System of Punctuation : Abridged from Robertson’s Essay on Punctuation : For the Use of Schools. Boston: I. Thomas and E.T. Andrews, 1792.

 

6.

 

Shady Characters at the BBC: punctuation that failed to make its mark

An ironieteken, courtesy of Olivia Howitt.

I had the pleasure, recently, of writing another article for BBC Culture. It’s called “Punctuation that failed to make its mark” and it’s a sort of Shady Characters greatest hits, a compilation of a few of my favourite marks that tried valiantly but unsuccessfully to achieve widespread acceptance. There’s Martin K. Speckter’s evergreen interrobang, or ‘‽’, intended to punctuate an excited or rhetorical question; Bas Jacobs’ clever but ill-fated ironieteken, or irony mark, as shown above; and the excellent quasiquote, or paraphrasing mark, first sent in to Shady Characters back in 2014 by the late Ned Brooks.

Like the first article, it was fun to write; also like the first article, it was often more difficult to choose what to take out rather than what to leave in. Have a read and, as ever, let me know what you think of it!


Update: the article is now available in Spanish at La Nacion.

Miscellany № 66: catching up

Things have been frantic around here lately. Mostly, I’ve been busy reviewing the proofs of The Book, of which more soon, but I’ve also written a pair of articles for other publications, both of which were a lot of fun to address.

First up is this review for the Wall Street Journal of David Crystal’s new book, Making a Point: The Per(s)nickety Story of English Punctuation. (The optional s appears on the American edition.) David’s book combines a brief but entertaining history of punctuation in English with a series of short, pragmatic chapters on modern usage. He sticks with the standard marks — the comma, colon, full stop et al — but his anecdotes and asides make this a lively little book. Not as preachy as Eats, Shoots, and Leaves and, dare I say, playing a straighter bat than Shady Characters, I’d heartily recommend it to readers here.*

My second article is this one, published recently at Mental Floss: The Evolution of Punctuating Paragraphs Through 5 Specific Markers. It is (spoiler!) a survey of paragraph marks and typography through the ages, starting with the paragraphos and ending up at the Internet’s favourite paragraph style, the blank line, taking in the pilcrow and illuminated manuscripts along the way. Have a read, and feel free to leave any comments or questions here.


Separately, I was happy to come across an article in The Guardian about the arrival of the ellipsis in English literature. It quotes Dr Anne Toner, who I met a few months back at Punctuation in Practice, on how this odd little practice made its way from factual books into fiction. The first ellipses in fiction (in drama, in fact) were created in the 16th century using hyphens rather than dots (-‌-‌-‌-), but things moved on rapidly from there:

By the 18th century, said Toner, it “becomes very common in print, and blanking starts to be used as a means of avoiding libel laws”, with series of dots starting to be seen in English works, as well as hyphens and dashes, to mark an ellipsis.

Embraced by writers from Percy Shelley to Virginia Woolf, it was in the novel that the ellipsis “proliferated most spectacularly”, according to Toner. She points to Ford Madox Ford and Joseph Conrad’s use of ellipses more than 400 times in their 1901 novel The Inheritors. Ford said that the writers were aiming to capture “the sort of indefiniteness that is characteristic of all human conversations, and particularly of all English conversations, that are almost always conducted entirely by means of allusions and unfinished sentences”.

You can read the full article here. If that whets your appetite, Anne has just published an entire book about elision in English literature entitled Ellipsis in English Literature: Signs of Omission; and if her talk at Punctuation in Practice is anything to go by, it’ll be a thorough and thought-provoking read.

*
You can read an extract of Making a Point at The Guardian.

Logarithmical: Zipf’s Law and the mathematics of punctuation

Let’s try an experiment. If we start with some large body of text — post-war American novels, say, or twentieth-century British newspapers — and count all the occurrences of all the words in those texts, we can put together a fairly accurate list of the most popular words in English. The word “the” would be at the top, followed by “of” and then “and”. With this list of word counts in hand, you could turn to any other similar body of work — British novels or American newspapers, for example — and have a good idea of how often you’d expect to find each of the words on your list. Simple enough.

Next, imagine that you throw away those word counts. You keep only the list of words themselves, ordered from most to least common. You don’t know if “the” occurs twice as often as “of” or a hundred times more than it. It turns out that you can still predict how often you’re likely to encounter a given word: knowing only that “the” is the most common word, “of” is second most common, “and” is third, and so on, it is possible to guess with quite startling accuracy how likely you are to encounter a given word. The mathematical relationship that underpins all this is called Zipf’s Law, named for its discovery in the 1930s by George Kingsley Zipf, a professor of German at Harvard,1 and it is very simple indeed. Eric Weisstein’s excellent Mathworld site explains it as follows:

In the English language, the probability of encountering the rth most common word is given roughly by P(r) = 0.1/r for r up to 1000 or so.2

To put some numbers on it, you should encounter the word “the” around every ten words, equating to a probability of 0.1/1 = 0.1; “of” should occur every twenty words or so, from 0.1/2 = 0.05; “and” will appear once every thirty words or thereabouts, from 0.1/3; and so on. This is an instance of what is called an inverse power law, and if you plot these numbers on a logarithmic scale you get a shockingly straight line. Here’s an example of the raw numbers for the fifty most common words in the so-called Brown Corpus, a million-word collection of texts compiled between 1964 and 1979:3

Word counts (blue) in the Brown Corpus, ordered from most to least common. Also shown are the expected word counts according to Zipf’s Law (green). (Image by the author.)

Not bad, I think. I’ve overlaid the expected word counts (in green) as predicted by Zipf’s Law, and it looks fairly convincing. If we make each axis logarithmic rather than linear, we get this:

Word counts (blue) in the Brown Corpus, ordered from most to least common. Also shown are the expected word counts according to Zipf’s Law (green). Both x and y axes are logarithmic rather than linear. (Image by the author.)

Better! The maths behind this are quite involved, but the effect of viewing the data on logarithmic axes is to show the perfectly straight line predicted by Zipf’s Law. Again, our data looks good — not a perfect fit, but our actual word counts conform to the predicted values relatively closely. So far, so good. It looks as though Zipf’s Law is in full effect in our million-word test case.
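As a rough sketch of the comparison plotted above (and not the code used to draw the graphs), the following Python ranks the words in a text and sets each observed count beside the count that P(r) = 0.1/r predicts. The function names and the crude tokenising pattern are assumptions made for illustration.

```python
import math
import re
from collections import Counter

ZIPF_CONSTANT = 0.1  # numerator quoted from MathWorld above


def zipf_probability(rank, constant=ZIPF_CONSTANT):
    """Predicted probability of the word at the given rank (1 = most common)."""
    return constant / rank


def ranked_word_counts(text):
    """Return (word, count) pairs ordered from most to least common.

    Assumption: a 'word' is a run of ASCII letters or apostrophes, lower-cased.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


def compare_with_zipf(text, top=10):
    """Yield (rank, word, observed count, count predicted by Zipf's Law)."""
    ranked = ranked_word_counts(text)
    total = sum(count for _, count in ranked)
    for rank, (word, count) in enumerate(ranked[:top], start=1):
        yield rank, word, count, zipf_probability(rank) * total


def log_log_slope(rank_a, rank_b):
    """Slope of the predicted line between two ranks on log-log axes."""
    rise = math.log10(zipf_probability(rank_b)) - math.log10(zipf_probability(rank_a))
    run = math.log10(rank_b) - math.log10(rank_a)
    return rise / run
```

For the idealised law, log_log_slope comes out at -1 (to within floating-point error) for any pair of distinct ranks, which is why the predicted line is straight on logarithmic axes. Real counts will only approximate it.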

Now the weird thing about Zipf’s Law is that it can be arrived at only by observation. There are no verbs, conjunctions or definite articles out there in nature, waiting for their physical properties to be discovered; our ancestors made them up as they went along and yet somehow we have constructed a language that adheres uncannily to an abstract mathematical idea. Why should the word “the” occur twice as often as “of”, three times as often as “and”, and so on? No-one really knows.

What is even odder is that inverse power laws crop up again and again in what should, by rights, be entirely random groups of things; Zipf’s Law is to words what Benford’s Law is to digits, and Benford’s Law is absolutely everywhere. The distribution of digits in house numbers, prime numbers, the half-lives of radioactive isotopes, and even the lengths in kilometres of the world’s rivers all follow inverse power laws, with the digit 1 being most prevalent by far and the others falling off behind it. Benford’s Law is so reliable that economists use it to detect fraud: if they don’t see a logarithmic distribution of digits in a given set of accounts, with 1 enthroned at the top, they know that someone has been doctoring the figures.4
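For reference, the usual statement of Benford’s Law is that a leading digit d turns up with probability log10(1 + 1/d), or about 30% of the time for the digit 1. The Python below is a minimal sketch that compares observed leading digits with that prediction. The dataset, the first thousand powers of two, is chosen here purely for illustration and does not come from the sources cited above, and the function names are likewise hypothetical.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected proportion of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit_proportions(values):
    """Observed proportion of each leading digit among positive integers."""
    firsts = [int(str(value)[0]) for value in values]
    counts = Counter(firsts)
    return {digit: counts[digit] / len(firsts) for digit in range(1, 10)}


# Illustrative dataset (an assumption for this sketch): the first 1,000 powers of two.
powers_of_two = [2 ** n for n in range(1000)]
observed = leading_digit_proportions(powers_of_two)

# Print each digit with its observed and expected proportions side by side.
for digit in range(1, 10):
    print(digit, round(observed[digit], 3), round(benford_expected(digit), 3))
```

A set of figures whose leading digits stray far from the expected column is the kind of anomaly that a Benford-style fraud check is looking for.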

My thought, then, was this: does punctuation follow some variant of Zipf’s Law? If we count all the marks of punctuation in some suitably large dataset of English texts, do we see a logarithmic distribution in them? There are many fewer unique punctuation marks than there are words, of course, but then Benford’s Law works quite happily with only ten digits to play with. It’s intriguing to wonder: were the writers and editors who invented the comma, full stop and apostrophe moved by the same inexplicable law that governs baseball statistics, the Dow Jones index and the size of files on your PC? I wrote a computer program to find out.
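The program itself isn’t reproduced here, but the counting step might be sketched in Python along the following lines. The set of marks and the function name are assumptions made for illustration; folding the opening and closing forms of paired marks into a single entry follows the approach described in the footnote to this post.

```python
import string
from collections import Counter

# Assumption: opening and closing forms of a paired mark count as one mark.
PAIRED = {
    "\u201c": '"', "\u201d": '"',   # curly double quotation marks
    "\u2018": "'", "\u2019": "'",   # curly single quotation marks and apostrophes
    ")": "(", "]": "[", "}": "{",
}
# Assumption: ASCII punctuation plus dashes, the ellipsis and curly quotes.
MARKS = set(string.punctuation) | set(PAIRED) | {"\u2014", "\u2013", "\u2026"}


def count_punctuation(text):
    """Return (mark, count) pairs ordered from most to least common."""
    counts = Counter(PAIRED.get(ch, ch) for ch in text if ch in MARKS)
    return counts.most_common()
```

One caveat: a plain-text file cannot tell an apostrophe from a closing single quotation mark, so this sketch lumps them together. Separating them would need some extra rule, such as checking whether the mark sits between two letters.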

I started by looking at the Brown Corpus, but given that it contains a paltry million words or so there aren’t all that many punctuation marks to be found. I turned instead to Project Gutenberg, which makes out-of-copyright books available in a variety of formats, and downloaded twelve of the most popular works.* Next, I counted the occurrences of all marks of punctuation and plotted them twice: first as raw counts against rank, and then as a log-log graph of those same values. Here’s the equivalent of our first graph, only for marks of punctuation rather than words:

Punctuation mark counts (blue) in a selection of works from Project Gutenberg, ordered from most to least common. Also shown are the projected counts (red). (Image by the author.)

Well then. This looks familiar.

We’ll come to the red line in a moment, but let’s stick with the blue line for now. It represents the number of times that each of the marks of punctuation along the x-axis occurred in my ad hoc Project Gutenberg corpus, with the comma in pole position and the full stop around 50% behind it. There’s a bit of a jump down to the paired quotation mark, but the fact that the quotation mark is up there at all is doubtless to be expected from the dialogue-heavy novels that make up the bulk of the works I analysed. The semicolon is in fourth position, likely because my texts are predominantly of the nineteenth century, and the apostrophe follows it in fifth.

Now to the red line. If you remember, Zipf’s Law says that the probability P of encountering a word with ranking r is given by P(r) = 0.1/r. Guessing that there’s a similar distribution for punctuation marks, I played around with a variety of different values for the numerator of the fraction, eventually settling on 0.3 as a reasonable proposition. The red line, then, is my predicted distribution of punctuation marks, as given by the equation P(r) = 0.3/r. Enter Houston’s Law, I guess…? Not great, but not terrible either; a larger corpus and some more sophisticated mathematics would likely produce a better number.
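As for that “more sophisticated mathematics”, one modest step up from trial and error is a least-squares fit. The Python below is a sketch under stated assumptions rather than the method used for the graphs here: it takes a ranked list of counts, converts them to probabilities, and finds the numerator c that minimises the squared error against c/r. It also estimates the slope of the best-fit line on log-log axes. The function names are hypothetical and the test data is synthetic, not taken from the Project Gutenberg corpus.

```python
import math


def to_probabilities(counts):
    """Convert counts (ordered from most to least common) to probabilities."""
    total = sum(counts)
    return [count / total for count in counts]


def fit_numerator(probabilities):
    """Least-squares estimate of c in the model P(r) = c / r."""
    ranks = range(1, len(probabilities) + 1)
    numerator = sum(p / r for p, r in zip(probabilities, ranks))
    denominator = sum(1 / r ** 2 for r in ranks)
    return numerator / denominator


def fit_log_log_slope(probabilities):
    """Slope of the best-fit straight line through (log r, log P(r)).

    Assumption: every probability is greater than zero.
    """
    xs = [math.log10(r) for r in range(1, len(probabilities) + 1)]
    ys = [math.log10(p) for p in probabilities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic counts that follow 1/r exactly, purely to exercise the functions.
synthetic_counts = [2520 // r for r in range(1, 11)]
probabilities = to_probabilities(synthetic_counts)
print(fit_numerator(probabilities), fit_log_log_slope(probabilities))
```

For data that follow 1/r exactly, the fitted numerator is simply one over the sum 1 + 1/2 + … + 1/n, which comes to roughly 0.34 for ten marks, and the log-log slope is -1. Real punctuation counts would give different figures, and how far they fall from these is one way to put a number on the quality of the fit.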

If we play the same trick as above, making both x- and y-axes logarithmic to smooth out the curve, this is what we see:

Punctuation mark counts (blue) in a selection of works from Project Gutenberg, ordered from most to least common. Also shown are the projected counts (red). Both axes are plotted on a logarithmic scale. (Image by the author.)

The first ten punctuation marks, then, follow a Zipfian distribution in a quite striking way. The unhelpful behaviour of the last few marks (from ‘*’ to ‘%’) may well be because they’re either logograms or non-standard marks of punctuation; why the colon is under-represented, however, I’m not sure. Even so, this is all rather startling. Punctuation marks are Zipfian to a large degree, just like words; the frequency with which we use them obeys the same eerily ubiquitous inverse power law distribution, and I am none the wiser as to why. If ever there was a time to weigh in, commenters, this is it! What’s going on here, and why?

1.

 

2.
Weisstein, Eric W. “Zipf’s Law”. MathWorld. Accessed October 4, 2015.

 

3.
Francis, Nelson, and Henry Kucera. “Brown Corpus”. The Internet Archive.

 

4.
Weisstein, Eric W. “Benford’s Law”. MathWorld. Accessed October 4, 2015.

 

*
I picked the following works from Project Gutenberg’s list of most downloaded titles:

  1. A Modest Proposal
  2. A Tale of Two Cities
  3. Alice’s Adventures in Wonderland
  4. Frankenstein; Or, The Modern Prometheus
  5. Grimms’ Fairy Tales
  6. Metamorphosis
  7. Moby Dick; Or, The Whale
  8. Pride and Prejudice
  9. The Adventures of Sherlock Holmes
  10. The Adventures of Tom Sawyer
  11. The Count of Monte Cristo
  12. The Picture of Dorian Gray

 

It’s worth noting here that I chose to consider the paired marks — single and double quotation marks, parentheses and so on — as single marks for the purposes of this analysis, but there’s certainly room to look at them as two distinct units.