We moved from London to Birmingham a couple of years ago now, and one of the first things I noticed when we arrived was the street signs: extravagant, cast-iron behemoths far removed from London’s restrained licence plates for buildings. Above is a typical street sign in Edgbaston, our then-new neighbourhood; below is an old-style enamelled sign from Wandsworth, our previous one.
Granted, Birmingham’s modern street signs, as used in much of the rest of the city, are significantly less interesting than the black-and-white battleship above, but then the same is also true of London. Birmingham once had standards to maintain; London didn’t.
Anyway, back to the two signs above: useful, legible both. But only the Brummie sign packs in an abbreviation, a tilde and two commas, all while bellowing “God save Queen Victoria!!1!!111” with foam-flecked lips, and for that it is my pick for the coveted Best Street Sign I Have Seen in the Past Two Years award.
In the midst of the ongoing SARS-CoV-2 pandemic, Twitter user @talkporty* got in touch to ask:
Dear @shadychars, you are the only one I can turn to in this situation. I am being hounded by EM-DASHES! Help! How has #covidー19uk become a trending hashtag? Nobody types them in.
And crazier still: paste it in and it seems it is [the] chōonpu symbol instead!1
It isn’t often that international health crises and punctuation intersect, but these are the times in which we find ourselves.
First things first: I tapped on the #covidー19uk hashtag to discover that it is indeed a valid hashtag, and also that a lot of people are using it. Next, I did some copy-and-pasting of my own to confirm that “ー”, a character that looks very much like an em dash, is not, in fact, an em dash. As @TalkPorty said, it is the chōonpu — AKA the “long sound symbol”,2 AKA Unicode’s KATAKANA-HIRAGANA PROLONGED SOUND MARK3 — a dash-like mark used to indicate long vowels in Japan’s katakana and hiragana syllabaries.
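For anyone who wants to repeat the copy-and-paste check without squinting at glyphs, here is a minimal Python sketch that asks the standard library's `unicodedata` module what a character really is. The helper name `describe` is my own invention for illustration; the code points and names come from Unicode itself.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point, Unicode name and general category of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} (category {unicodedata.category(ch)})"


# The character from the hashtag, a real em dash, and the keyboard hyphen.
for ch in ("\u30fc", "\u2014", "-"):
    print(describe(ch))
```

The category is the interesting part: the chōonpu is filed under "Lm" (a modifier letter), whereas the em dash and the hyphen-minus are both "Pd" (dash punctuation).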
How did a dash-like non-dash end up in one of the most common hashtags at a time of global crisis?
At first, I assumed that it must have been a cut-and-paste error. As @TalkPorty suggested, rare is the person who knows how to enter an em dash on the computer’s keyboard, and I wondered if the hashtag’s original creator had perhaps browsed a list of Unicode characters until they found a likely-looking candidate. But that didn’t seem entirely plausible: you have to stray pretty far from Unicode’s Latin alphabet and its accompanying marks before you reach katakana and its punctuation. The idea that this was accidental, or coincidental, didn’t quite fit.
Next, I wondered if it could have been a typo caused by a smartphone’s software keyboard. Perhaps a Twitter user hunting for an em dash alighted on a visually similar mark by mistake. Probably not, I thought, for the same reason as before: if you have your phone set to display a QWERTY keyboard for a Western alphabet, you almost certainly won’t have ready access to the chōonpu. It takes a deliberate effort to switch languages and go hunting to find one.
In summary, someone must have chosen this character deliberately, though I was none the wiser as to why they had done so. In the end, I blundered into what I thought was a plausible solution. Here are my original tweeted replies to @TalkPorty:
Well, that’s weird. I can only imagine that some cut-and-paste or soft keyboard error has gone viral (sorry) along with the hashtag.
If I tap on the hashtag in Twitter’s Android app, I’m given the option to compose a tweet containing that same hashtag. I’d imagine that’s how it’s spreading. #covid-19uk (with a hyphen-minus) is also doing fairly well.
Wait! I lie. #covid-19uk isn’t a valid hashtag! Presumably someone has figured out that KATAKANA-HIRAGANA PROLONGED SOUND MARK can be used to “hyphenate” rather than break apart a hashtag. Very clever.
In other words, it seemed very much as if some savvy tweeter had used the chōonpu — a character that looks like a dash but works more like a letter — to construct a hyphenated term that sneaked past Twitter’s rules for valid hashtags. I left it at that.
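To make the "looks like a dash, works like a letter" idea concrete, here is a rough Python sketch of a hashtag scanner. It rests on a loudly stated assumption: that a hashtag continues for as long as it meets letters, combining marks, numbers or underscores, and stops at anything else. Twitter's real rules are more involved and I have not reproduced them here; `hashtag_body` is a hypothetical helper, not anything from Twitter's own code.

```python
import unicodedata


def hashtag_body(text: str) -> str:
    """Return the run of word-like characters that follows a leading '#'."""
    if not text.startswith("#"):
        return ""
    body = []
    for ch in text[1:]:
        # Assumption: letters (L*), marks (M*), numbers (N*) and '_' extend
        # a hashtag; any other character, punctuation included, ends it.
        if ch == "_" or unicodedata.category(ch)[0] in "LMN":
            body.append(ch)
        else:
            break
    return "".join(body)


print(hashtag_body("#covid-19uk"))       # hyphen-minus: stops after "covid"
print(hashtag_body("#covid\u30fc19uk"))  # chōonpu: the whole tag survives
```

Under that simplified rule, a hyphen or an em dash cuts the tag short at "covid", while the chōonpu, being classed as a letter, is carried along with everything after it.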
As I was writing this post, though, I couldn’t help but wonder why the chōonpu in particular had been used. I’m not a Unicode expert, but it seemed unlikely that there was only one dash-like character among its 143,000 code points that could have been used to pull off this piece of hashtag hacking. Why did this Japanese mark end up in a hashtag otherwise composed of Latin characters?
Now, some Japanese computer keyboards have a QWERTY layout, where roman letters are mapped to Japanese symbols in a system called romaji, and so, out of curiosity, I installed Google’s Japanese keyboard4 on my Android phone and took a look at its romaji mode. Right there, beside the ‘L’, was a chōonpu. Could #covidー19uk have been created by a native Japanese speaker with access to a romaji keyboard?
Well, what do you know? The earliest tweet to use the chōonpu in a #covidー19 hashtag was posted by a Japanese speaker with the Twitter username @spreadnewsxxx. Google’s mostly intelligible translation is as follows:
Indeed, it can handle version upgrades (mutations) the following year. Is it normal to give a name that seems to follow the disease name?
Twitter reports that @spreadnewsxxx posted their tweet with an iPhone, whose Japanese keyboard I’m not familiar with, and so I can’t know whether they used the chōonpu for convenience or for its aforementioned hashtag friendliness. Either way, we now have a Patient Zero, if you’ll forgive the expression, in the form of the first use of the hashtag that has been plaguing @TalkPorty. The mystery is solved, or at least diminished.
I’d love to know if any readers have encountered the chōonpu in non-Japanese texts. Is this a common usage? Are its Twitter-defying powers commonly known? Drop me a line in the comments below!
Well, ✨that was fun✨❗ For now, though, it’s time to work our way through some of the punctuation-related links and news articles that have cropped up during our stay in emojiland. Stick around; there’s some great stuff to come.
First, via the always fascinating Language Hat, comes word of a paper entitled “Pull out all the stops: Textual analysis via punctuation sequences”.1 In it, Darmon, Bazzi et al ask the question: is it possible to identify individual writers using only their punctuation? That is, if you remove the words from a piece of writing, can you mathematically fingerprint the writer by the marks that remain? The answer is a firm “sort of”. I’ll leave you to read the full paper to find out more.
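If you would like a feel for what "removing the words" leaves behind, here is a toy Python sketch. To be clear, this is not the method used in the paper: it simply keeps every character that Unicode classes as punctuation and tallies how often each mark appears. The function names are mine, chosen for illustration.

```python
import unicodedata
from collections import Counter


def punctuation_sequence(text: str) -> list:
    """Keep only the characters whose Unicode category is punctuation (P*)."""
    return [ch for ch in text if unicodedata.category(ch).startswith("P")]


def mark_frequencies(text: str) -> dict:
    """Return each mark's share of all punctuation in the text."""
    marks = punctuation_sequence(text)
    if not marks:
        return {}
    counts = Counter(marks)
    return {mark: count / len(marks) for mark, count in counts.items()}


sample = "Hello, world! Is it fun? Yes; very."
print("".join(punctuation_sequence(sample)))
print(mark_frequencies(sample))
```

Comparing such tallies between two writers is about the crudest fingerprint imaginable, but it shows the raw material that this kind of analysis has to work with.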
Next up, Russell Harper, editor of the Chicago Manual of Style’s “Shop Talk” blog, delves into breaks in fiction. You know the ones I mean — more significant than a new paragraph but less significant than a new chapter, and typically separated by blank lines, asterisms (* * *), bullets (• • •), or other typographical flourishes. Russell investigates the means by which various authors (or the typographers who designed their books, or perhaps both in concert) chose to set off breaks in their novels with varying degrees of success.
On the subject of significant breaks, I am extremely late in bringing to your attention an intriguing tweet published by Katie Henry back in 2018, but it is too good not to share:
If you ever feel self-conscious about your writing, please know that in 1802, a man named Timothy Dexter published a 9,000-word book with seemingly arbitrary capitalization and literally ZERO punctuation.2
And it gets better. The book in question is called A pickle for the knowing ones, or, Plain truths in a homespun dress,3 and it was self-published by the aforementioned Timothy Dexter as a gift for his friends.4,5 The lucky recipients must have been bamboozled by its contents: Dexter used unpredictable capitalisation and a spelling system of his own invention. What he did not use was punctuation: neither a comma nor a full stop was there to interrupt his stream of thought.5 Here, for reference, is the opening passage, cut off at what seems to be the end of a sentence:
Ime the first Lord in the younited States of A mericary Now of Newburyport it is the voise of the peopel and I cant Help it and so Let it goue Now as I must be Lord there will foller many more Lords pretty soune for it dont hurt A Cat Nor the mouse Nor the son Nor the water Nor the Eare then goue on all is Easey Now bons broaken all is well all in Love Now I be gin to Lay the corner ston and the kee ston with grat Remembrence of my father Jorge Washington the grate herow 17 sentreys past before we found so good a father to his shildren and Now gone to Rest6
Well, alright then. As Randy Nelson explains in The Almanac of American Letters, “Literary historians have never been able to decide whether this little book is hoax, lunacy or avant-garde.”5 Certainly, Dexter might have harboured any or all of those motivations. Born a commoner, Dexter styled himself a lord and made a fortune through a series of improbable business deals — among others, he exported coal to the mining mecca of Newcastle during a miners’ strike, and repurposed bed-warming pans as cooking pots for molasses and sold them in the Caribbean. With his money he established a retinue comprising a fortune-teller, a simpleton and a pornographer, and housed them in a mansion surrounded by wooden statues of notable figures. Himself included, of course.4
For the second edition of Pickle (eight would be published), Dexter acknowledged that some readers might have had trouble with his lack of punctuation. Accordingly, he wrote,
the Nowing ones complane of my book the fust edition had no stops I put in A Nuf here and thay may peper and solt it as they plese
Here is that very same peper and solt as it appeared in the fourth edition:
Lastly, reader John Chulick brought to my attention Adam O’Fallon Price’s essay on the em dash, or ‘—’, published in 2018 at The Millions.7* Price explores em dashes in the works of Vladimir Nabokov, Emily Dickinson and others, and I’ll leave it to him to explain why the mark is worth our attention:
For me, there is no punctuation mark as versatile and appealing as the em dash. I love the em dash in a way that is difficult to explain, which is, probably, the motivation of this essay. And my love for it is emphasized by the fact that many writers never, or rarely, use it — even disdain it. It is not, so to speak, an essential punctuation mark, the same way commas or periods are essential. You can get along without it and most people do. I don’t remember being taught to use it in elementary, middle, or high school English classes; I’m not even sure I was aware of it then, and I have no clear recollection of when or why I began to rely on it, yet it has become an indispensable component of my writing.7
On a related note, Stan Carey, another must-follow at Sentence First and Strong Language, also wrote about dashes in 2018. Stan focused on the em dash’s abbreviated sibling, the en dash, and specifically its use to hyphenate compound terms. I remember being tickled at the concept upon learning of it in the Chicago Manual (it is as subtle as it is useful), and Stan more than does it justice at Sentence First. Well worth a read!
A couple of months ago, in the midst of writing my emoji series, I took some time out to have a chat with Glenn Fleishman for his new podcast series, the Tiny Typecast. Glenn is an old friend of the blog and is astonishingly well-informed about books, typography and all things related: we talked about books and book history for what felt like a few minutes, but turned out to be the better part of an hour. Glenn is easy to talk to and, if you check out our conversation on Apple Podcasts or at Glenn’s blog, you’ll find that he’s easy to listen to, too. (The jury is still out on yours truly.)