Bombs, books and generative AI: a blunt and obvious parable

1. Loomings

You will be aware, by now, of the cresting wave of excitement about generative AI. “Artificial intelligence”, in general terms, means any computer program which displays humanlike abilities such as learning, creativity, and reasoning,* while “generative AI” applies to AI systems that can turn textual prompts into works of — well, not art, at any rate, but maybe “content” better captures the quality and quantity of the resultant material. Generative AI can produce text (in the case of, for example, OpenAI’s ChatGPT), images (Midjourney being the poster child here) or even music (like Suno).

Broadly speaking, generative AI works like this. A layered collection of mathematical functions called an artificial neural network is adjusted from its default state — “trained”, in the parlance — by a succession of sample inputs such as texts or images. Once trained, this network can respond to new inputs by performing (and I am over-simplifying here to a criminal degree) a kind of “autocomplete” operation that creates statistically likely responses informed by its training data. As such, the works created by genAI systems are simultaneously novel and derivative: an individual text or image produced by your favoured AI tool may be new, but such outputs cannot escape the gravity of their corresponding training data. For AI, there truly is nothing new under the sun.
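
To make that “autocomplete” analogy a little more concrete, here is a deliberately tiny sketch in Python. It is not how any real model is built (production systems use neural networks with billions of parameters rather than a simple word-pair table), but it shows how a program “trained” only on sample text can produce output that is new in its particulars yet can never stray beyond recombinations of what it was fed.

    import random
    from collections import defaultdict

    # "Training": record which word follows which in the sample texts.
    corpus = [
        "the cat sat on the mat",
        "the dog sat on the rug",
        "the cat chased the dog",
    ]
    following = defaultdict(list)
    for sentence in corpus:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            following[current].append(nxt)

    # "Generation": starting from a prompt word, repeatedly pick a
    # statistically likely next word. Each run is novel-ish, but every
    # fragment of it comes straight from the training data.
    def generate(start, length=6):
        word, output = start, [start]
        for _ in range(length):
            candidates = following.get(word)
            if not candidates:
                break
            word = random.choice(candidates)
            output.append(word)
        return " ".join(output)

    print(generate("the"))  # e.g. "the cat sat on the rug"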

Layers within a deep neural network
Input, output, and hidden layers in an artificial neural network. (CC BY-SA 4.0 image courtesy of BrunelloN.)

Despite the mundane underpinnings of most AI tools (“wait, it’s all just data and maths?”), many of them are genuinely good at certain tasks. This is doubly true when a network can be trained on a large enough set of inputs — say, for example, a slice of the billions of publicly-available web pages, a newspaper’s archive, or a well-stocked library. All of the most successful genAI systems have benefitted from opulent training datasets so that, for instance, ChatGPT is an eager, helpful and exceptionally knowledgeable conversationalist. Midjourney and Suno, too, can surprise with their ability to create pictures and songs that are at least halfway convincing as simulacra of human works. From a certain point of view, and perhaps with a dab of Vaseline on the lens, the promise of AI has already been fulfilled.

2. Pitfalls

Yet many avenues of criticism remain open to the generative AI sceptic. Most of them are valid, too.

To start at the beginning, training is a slow and intensive process. A neural network, or “model”, for text generation might have to consume billions of pieces of text to arrive at a usable state. Meta’s “Llama 3” system, for instance, which the company has made publicly available, was trained on 15 trillion unique fragments of text.3 Llama 3 comes in different flavours, but its largest version, the so-called “405B” model, contains more than 400 billion individual mathematical parameters.4 One paper estimates that OpenAI has spent at least $40 million training the model behind ChatGPT and that Google has spent $30 million on Gemini, while the training of bigger models in the future may run to billions.6

Making use of those trained models is expensive, too. An OECD report on the subject says that a single AI server can consume more energy than a family home, and guzzles water into the bargain: a single conversation with ChatGPT is thought to use up around 500ml of water for power generation and cooling alone.7 When you converse with an AI model, you might as well have poured a glass of water down the drain.

An image generated by Microsoft Designer in response to the prompt “Videogame character that resembles Mario the plumber”.

GenAI’s thirst for training data begets another problem: those enormous training datasets often contain copyrighted works that have been used without their owners’ permission.§ The New York Times and other papers are currently suing Microsoft and OpenAI for copyright infringement, to pick only one prominent example.8 AI companies protest that simply training a model on a copyrighted work cannot infringe its copyright, since the work does not live on as a direct copy in the trained model — except that researchers working for those very same AI companies have debunked that assertion. Employees of Google and DeepMind, among others, have successfully tricked image-generation tools into reproducing some of the images on which they were trained,9 and it isn’t very hard to convince AI tools to blatantly infringe on copyrighted works.

This leads to a related criticism. For the likes of Google to have been ignorant of the training data hiding in their AI models is a sign that it is, at a fundamental level, really difficult to comprehend what is going on inside these things.10 The basic operating principles of a given model will be understood by its developers, but once that model’s billions of numerical parameters have been massaged and tweaked by the passage of trillions of training inputs, there is too much data in play for anyone to truly understand how it is all being used. In my day job I work with medical AI tools, and the doctors who use them worry acutely about this issue. The computer can tell them that a patient may have cancer, but it cannot tell them why it thinks that. The “explainability” of AI still has a long way to go.

Then there are the hallucinations. This evocative term came out of computer vision work in the early 2000s, where it referred to the ability of AI tools to add missing details to images. Then, hallucination was a good thing, since this was what these tools were trying to accomplish in the first place. More recently, however, “hallucination” has come to mean the way that genAI will sometimes exhibit unpredictable behaviour when faced with otherwise reasonable tasks.11 Midjourney and other image generators were, for a long time, prone to giving people extra fingers or limbs.12 An AI tool trialled by McDonald’s put bacon on top of ice cream and mistakenly ordered hundreds of chicken nuggets.13 OpenAI’s speech-to-text system, Whisper, fabricated patients’ medical histories and medications in an alarming number of cases.14 True, these are not hallucinations in the human sense of the word. It’s more accurate to call them ghosts in the probabilistic machine — paths taken through the neural net which reveal that mathematical sense does not always equate to common sense. But whatever they are called, the results can be startling at best and dangerous at worst.

There are also philosophical objections. At issue is this: humans, it turns out, want to work. And not only to work, but to do good, useful work. Back in the nineteenth century, the Luddites worried about skilled workers being dispossessed by steam power.15 The proponents of the fin de siècle Arts and Crafts movement, too, epitomised by the English poet and textile designer William Morris,|| agitated for a world where people made things with their own two hands rather than submit to the stultifying, repetitive labour of the production line. Whether for economic or spiritual reasons, both recognised the value of human craft.

More recently, the anarchist and anthropologist David Graeber expressed similar sentiments in a widely-read 2013 essay, “On the Phenomenon of Bullshit Jobs”. In it, he notes that the twentieth century’s relentless increases in productivity have gifted us not fewer working hours or a boom in human happiness but rather a proliferation of low-paid, aimless jobs, in which workers twiddle their thumbs in make-work roles of whose pointlessness they are acutely aware.16

Neither David Graeber nor his historical counterparts were complaining about AI, but the point stands. If corporations view genAI as a means to replace expensive human workers (and to be clear, that is exactly how they see it,17 just like mill owners and car manufacturers before them), then we are in for yet more “bullshit jobs”.

None of these complaints should be all that surprising to us. All computers eat energy and data and turn them into heat and information, so if we force-feed them with gigantic quantities of the former then we will get commensurately more of the latter. And if history has taught us anything, it is that technologies are tools rather than panaceas, and that work-free utopias are far less durable than, say, exploitative labour laws and unequal distributions of wealth.

3. The bomb pulse

To backtrack to an earlier and more optimistic point, in many cases AI tools can be remarkably good at their jobs. A friend of mine likened the large language models that underpin most text-based genAI systems to an intern who has read the entire internet: they will not always make the same deduction or inference that you would have done, but their breadth of knowledge is astounding. It is hard to argue that we have not built something incredible here. If we can stomach the cost in power and in water, then genAI promises not only to solve problems for us, but to do so at a ferocious rate — and here is where a new and interesting class of AI problems arises. To see why, let me take you back to the heyday of another morally ambiguous technological marvel.

Nuclear weapons testing, in which the likes of the USSR, USA and UK exploded hundreds of atomic bombs in the air and underground, reached a peak in 1961-62, after which a test ban treaty led to a dramatic slowdown.18 But the spike in tests before the ban, which contaminated Pacific atolls and the Kazakh steppe alike, sent vast clouds of irradiated particles into the atmosphere that would leave a lasting mark on every living thing on earth.19,20

That mark was the work of a massive rise in the amount of atmospheric carbon-14, a gently radioactive isotope of carbon that filters into human and animal food chains through photosynthesis and the exchange of carbon dioxide between the air and the ocean. Carbon-14 is essential to radiocarbon dating, because it allows researchers to match the amount of carbon-14 in a material (taking into account the rate at which it decays away into nitrogen) with a calibration curve that shows how much carbon-14 was floating around in earth’s biosphere at any particular point in the past.21 It is a powerful and useful tool, with everyone from archaeologists to forensic criminologists using it to determine the age of organic matter.
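
For the arithmetic-minded, the decay side of that calculation is simple enough to sketch. The snippet below is a bare-bones illustration only: it assumes plain exponential decay with a half-life of roughly 5,730 years and a known starting level, and it ignores the calibration curve entirely.

    import math

    HALF_LIFE_YEARS = 5730  # approximate half-life of carbon-14

    def estimate_age(fraction_remaining):
        """Years elapsed for a given surviving fraction of carbon-14,
        assuming simple exponential decay from a known starting level."""
        return -HALF_LIFE_YEARS / math.log(2) * math.log(fraction_remaining)

    print(round(estimate_age(0.5)))   # ~5730 years: half of the carbon-14 left
    print(round(estimate_age(0.25)))  # ~11460 years: a quarter left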

A graph showing the "bomb pulse" of atmospheric carbon-14
The “bomb pulse” of atmospheric carbon-14. (Public domain image courtesy of Hokanomono.)

The so-called bomb pulse of carbon-14, then, which was measured and quantified to the minutest detail, made the calibration curve much, much more accurate. It became possible to date just about every living organism born after 1965 to an accuracy of just a few years — a dramatic improvement on the pre-atomic testing era, when carbon-14 concentrations ebbed and flowed across centuries and millennia in a confusing and sometimes contradictory manner.22 If a mushroom cloud could ever be said to have had a silver lining, it was the bomb pulse.
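
For samples from the bomb-pulse era the logic is rather different: instead of waiting for appreciable decay, you match a sample’s measured carbon-14 level against the steep atmospheric curve itself. Here is a toy version of that lookup; the numbers are invented stand-ins rather than real data, and genuine work relies on published atmospheric records such as those compiled by Hua and colleagues, cited above.

    # Invented values standing in for the real atmospheric record. Levels
    # are expressed relative to a pre-bomb baseline of 1.0.
    bomb_curve = {
        1955: 1.05, 1960: 1.25, 1964: 1.90,  # the peak of the pulse
        1970: 1.55, 1980: 1.30, 1990: 1.15, 2000: 1.08, 2010: 1.04,
    }

    def candidate_years(measured_level, tolerance=0.05):
        """Return every year whose atmospheric level matches the sample.
        Because the curve rises and then falls, there are usually two
        candidates: one on the rising limb and one on the falling limb."""
        return [year for year, level in sorted(bomb_curve.items())
                if abs(level - measured_level) <= tolerance]

    print(candidate_years(1.25))  # [1960, 1980]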

Just as nuclear testing pushed carbon-14 through the roof, generative AI is dramatically increasing the quantity of artificially generated texts and images in circulation on the web. For instance, a recent study found that almost half of all new posts on Medium, the blogging platform, may have been created using genAI.23 Another report estimated that billions of AI-generated pictures are now being created every year.24 We are living through a bomb pulse of AI content.

Unlike its nuclear counterpart, however, the explosion of AI content is making it harder, not easier, to make observations about our world. Because AI apps are incapable of meaningfully transcending their training data, and because a non-trivial proportion of AI outputs can fairly be described as gibberish, the rise in AI-generated material is polluting our online data rather than enriching it.

To wit: Robyn Speer, the maintainer of a programming tool that looks up word frequencies in different languages, mothballed the project in September 2024, writing that “the Web at large is full of slop generated by large language models, written by no one to communicate nothing.”25 Elsewhere, Philip Shapira, a professor at the University of Manchester in the UK, connected an inexplicable rise in the popularity of the word “delve” with ChatGPT’s tendency to overuse that same word.26,27 There is already enough AI “slop” out there to distort online language.
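
Shapira’s observation is easy to reproduce in spirit, if not in rigour: count how often a tell-tale word appears in a corpus, year by year, and look for a discontinuity. A minimal sketch follows; the “corpus” here is a pair of made-up snippets rather than real data.

    import re
    from collections import Counter

    # Hypothetical corpus: a few documents per year, for illustration only.
    corpus_by_year = {
        2021: ["we examine the data", "results are reported below"],
        2024: ["we delve into the data", "let us delve deeper still"],
    }

    def per_thousand_words(word, documents):
        """Occurrences of `word` per thousand words across `documents`."""
        counts, total = Counter(), 0
        for doc in documents:
            tokens = re.findall(r"[a-z']+", doc.lower())
            counts.update(tokens)
            total += len(tokens)
        return 1000 * counts[word] / total if total else 0.0

    for year, docs in sorted(corpus_by_year.items()):
        print(year, round(per_thousand_words("delve", docs), 1))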

Indeed, there is now so much publicly available AI-generated data that training datasets are being contaminated by synthetic content. And if that feedback loop occurs too often — if AI models are repeatedly trained on their own outputs, or those from other models — then there is a risk that those models will eventually “collapse”, losing the breadth and depth of knowledge that makes them so powerful. “Model Autophagy Disorder”, as this theoretical pathology is called, is mad cow disease for AI.28 Where herds of cattle were felled by malformed proteins, AI models may be felled by malformed information.
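
The feedback loop itself is easy to caricature. In the toy simulation below, a “model” is nothing more than a mean and a standard deviation, and each generation is fitted only to samples drawn from the previous generation’s fit. This is a crude stand-in for retraining on synthetic data, not a claim about any real system, but the estimated spread tends to drift downwards over the generations, which is the flavour of collapse that researchers worry about: variety gradually leaks away.

    import random
    import statistics

    random.seed(0)

    def fit(samples):
        """'Train' a trivial model: just a mean and a standard deviation."""
        return statistics.mean(samples), statistics.pstdev(samples)

    # Generation zero is fitted to "real" data from a known source.
    mean, spread = fit([random.gauss(0, 1) for _ in range(30)])

    # Every later generation is fitted only to the previous generation's
    # own output: the feedback loop described above.
    for generation in range(1, 51):
        synthetic = [random.gauss(mean, spread) for _ in range(30)]
        mean, spread = fit(synthetic)
        if generation % 10 == 0:
            print(f"generation {generation}: spread = {spread:.3f}")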

4. Burning books

And so, if I may be permitted (encouraged, even) to bring things to a point, I will submit as concluding evidence a story published by the tech news outlet 404 Media. Emanuel Maiberg’s piece, “Google Books Is Indexing AI-Generated Garbage”, reports that, well, Google Books appears to be indexing AI-generated garbage. (Never change, SEO.) Searching for the words “as of my last knowledge update” — a tell, like “delve”, of OpenAI’s ChatGPT service — Maiberg uncovered a number of books bearing ChatGPT’s fingerprints.29 That was in April this year, and Maiberg noted with cautious optimism that the AI books were mostly hidden farther down Google’s search results. The top results, he said, were mostly books about AI.
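
Maiberg’s technique boils down to searching for stock phrases that large language models habitually leave behind. The same idea can be applied to any text you happen to have on hand, as in the sketch below; the phrase list pairs the tell quoted in his article with another commonly cited piece of chatbot boilerplate, and a match is suggestive rather than conclusive.

    TELL_PHRASES = [
        "as of my last knowledge update",
        "as an ai language model",
    ]

    def suspicious_phrases(text):
        """Return any stock chatbot phrases found in the text. Crude: a
        missing tell proves nothing, and quoting one proves little."""
        lowered = text.lower()
        return [phrase for phrase in TELL_PHRASES if phrase in lowered]

    sample = "As of my last knowledge update, the library holds 10,000 books."
    print(suspicious_phrases(sample))  # ['as of my last knowledge update']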

In writing this piece I tried Maiberg’s experiment for myself. In the six months, give or take, since his article was published, the situation has reversed itself: on the first page of my search results, every entry bar one was partly or wholly the work of ChatGPT. (The lone exception was, as might be hoped, a book on AI.) Most results on the second page were also the products of AI. The genAI bomb pulse would appear to be gathering pace.

I’ll be honest: I had been ambivalent about generative AI until I hit the “🔍” button at Google Books. I’ve tried ChatGPT and Google Gemini as research tools, but despite their breadth of knowledge they sometimes missed details that I felt they should have known. I’m also one of those humans who enjoys doing their own work, so delegating writing or programming tasks to AI has never really been on the cards. If I need help with a problem as I write a book or design a software system, I’ll run a web search or ask a human for help rather than an environmentally-ruinous autocomplete algorithm on steroids. Not everyone has the same qualms, though, or the same needs, and that’s fine. But this AI-powered dilution of Google Books still feels like a step beyond the pale.

For millennia, books were humanity’s greatest information technology. Portable, searchable, self-contained — but more than that, authoritative in some vague but potent way, each one guarded by a phalanx of editors, copyeditors, proofreaders, indexers and designers. Certainly, none of my own books could have come to pass without many other books written before them: I’ve lost count of the facts in them that rely upon some book or another found after a trawl through Google Books, the Internet Archive, or a physical library.

Books have never been perfect, of course. No artefact made by human hands ever could be. They can be banned or burned or ignored. They can go out of date or out of print. They can pander to an editor’s personal whims or a publisher’s commercial constraints, or be axed entirely on a lawyer’s recommendation. Google Books can be criticised too — ironically, it is accused mostly of the same lax approach to copyright as many AI vendors — yet it is an incredibly useful tool, acting as an omniscient meta-index to many of the world’s books. It would take a brave protagonist, I think, to argue that books have not been, on balance, a positive force.

None of this is to criticise AI itself. Costs will decrease as we figure out how to manage larger and larger datasets. Accuracy will increase as we train AI models more smartly and tweak their architectures. Even explainability will improve as we pry open AI models and inspect their internal workings.

Ultimately, the problem lies with us, as it always does. Can we be relied upon to use AI in a responsible way — to avoid poisoning our body of knowledge with the informational equivalent of malformed proteins? To remember that meaningful, creative work is one of the joys of human existence and not some burden to be handed off to a computer? In short, to spend a moment evaluating where AI should be used rather than where it could be used?

What’s left for me at the end of all this is a creeping, somewhat amorphous unease that we are unprepared for the ways in which generative AI will change books and our ability to access them. I don’t think AI will kill books in any real sense, just as ebooks have not killed physical books and Spotify has not killed radio, but unless we can restrain our worst instincts there is a risk that books — and by extension, knowledge — will emerge cheapened and straitened from the widening bomb pulse of generative AI.

1.
Mirzadeh, Iman, Keivan Alizadeh, Hooman Shahrokhi, Oncel Tuzel, Samy Bengio, and Mehrdad Farajtabar. “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models”. arXiv, October 7, 2024. https://doi.org/10.48550/arXiv.2410.05229.

 

2.
Doshi, Anil R., and Oliver P. Hauser. “Generative AI Enhances Individual Creativity But Reduces the Collective Diversity of Novel Content”. Science Advances 10, no. 28 (July 12, 2024): eadn5290. https://doi.org/10.1126/sciadv.adn5290.

 

3.

 

4.

 

5.
Meta Llama. “Llama 3.2”. Accessed October 25, 2024.

 

6.
Cottier, Ben, Robi Rahman, Loredana Fattorini, Nestor Maslej, and David Owen. “The Rising Costs of Training Frontier AI Models”. arXiv, May 31, 2024. https://doi.org/10.48550/arXiv.2405.21015.

 

7.

 

8.
Robertson, Katie. “8 Daily Newspapers Sue OpenAI and Microsoft Over A.I.” The New York Times, sec. Business.

 

9.
Carlini, Nicholas, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, and Eric Wallace. “Extracting Training Data from Diffusion Models”. arXiv, January 30, 2023. https://doi.org/10.48550/arXiv.2301.13188.

 

10.
Burt, Andrew. “The AI Transparency Paradox”. Harvard Business Review.

 

11.
Maleki, Negar, Balaji Padmanabhan, and Kaushik Dutta. “AI Hallucinations: A Misnomer Worth Clarifying”. arXiv, January 9, 2024. https://doi.org/10.48550/arXiv.2401.06796.

 

12.

 

13.

 

14.
Koenecke, Allison, Anna Seo Gyeong Choi, Katelyn X. Mei, Hilke Schellmann, and Mona Sloane. “Careless Whisper: Speech-to-Text Hallucination Harms”. arXiv, May 3, 2024. https://doi.org/10.48550/arXiv.2402.08021.

 

15.
Conniff, Richard. “What the Luddites Really Fought Against”. Smithsonian Magazine.

 

16.
Graeber, David. “On the Phenomenon of Bullshit Jobs”. STRIKE! Magazine.

 

17.
Manyika, James, and Kevin Sneader. “AI, Automation, and the Future of Work: Ten Things to Solve for”. McKinsey & Company.

 

18.
“Radioactive Fallout”. In Worldwide Effects of Nuclear War.

 

19.
Nuclear Museum. “Marshall Islands”. Accessed October 25, 2024.

 

20.
Kassenova, Togzhan. “The lasting toll of Semipalatinsk’s nuclear testing”. Bulletin of the Atomic Scientists (blog).

 

21.
Hua, Quan. “Radiocarbon Calibration”. Vignette Collection. Accessed October 25, 2024.

 

22.
Hua, Quan. “Radiocarbon: A Chronological Tool for the Recent past”. Quaternary Geochronology, Dating the Recent Past, 4, no. 5 (October 1, 2009): 378-390. https://doi.org/10.1016/j.quageo.2009.03.006.

 

23.
Knibbs, Kate. “AI Slop Is Flooding Medium”. Wired.

 

24.

 

25.
Speer, Robyn. “Wordfreq/SUNSET.Md”. GitHub.

 

26.
Shapira, Philip. “Delving into ‘delve’”. Philip Shapira (blog).

 

27.

 

28.
Nerlich, Brigitte. “From contamination to collapse: On the trail of a new AI metaphor”. Making Science Public (blog).

 

29.

 

*
Although, as a new paper demonstrates, generative AI tools are almost certainly not performing any sort of robust logical reasoning. Researchers at Apple found that insignificant changes to input prompts can result in markedly different (and markedly incorrect) responses.1 
†
Indeed, the research bears this out. Novels written with the help of AI were found to be better, in some senses, than those without — but they were also markedly more similar to one another.2 
‡
That said, there are many smaller models out there. Llama’s own “1B” model can be run on a smartphone.5 
§
I am entirely sure that my writing has been used to train sundry AI models, and if you have ever published anything online or in print, yours likely has been too. 
||
Eric Gill, author of An Essay on Typography, was another leading light. 
¶
An information technology so influential, one might even consider writing a book about it.

Miscellany № 103: calculators!

Having dispatched punctuation and book news, we’re on to pocket calculators! Incredibly, half a century or more after their appearance, there is still news to be had on the subject.


In the introduction to Empire of the Sum, I mention that, like some other animals, ravens and other corvids are known to be able to count. Not only that, but they understand the concept of zero, which is something that humans struggled with for quite some time.1 Now, though, a study in Science shows that not only can crows count, they can count out loud. In the words of the paper’s authors,

crows can flexibly produce variable numbers of one to four vocalizations in response to arbitrary cues associated with numerical values.

That is, trained crows could caw a number of times that corresponded to a visual symbol representing the same number. Astonishing, no? Our own ability to count led ultimately to writing, math, electronics and calculators. We should check in with our corvid neighbours in a few million years to see how their own evolutionary story is coming along.2,3


The first pocket electronic calculators relied on microchips designed especially for them — chips that could add, subtract, multiply and divide decimal numbers, but little else. That changed with Busicom’s exceptionally beige 141-PF,4 a desktop calculator that arrived on the Japanese market in 1971, and which was, for the first time, powered by a programmable CPU. That chip was Intel’s 4004, and it kickstarted a revolution in computing that shows no signs of running out of steam.

A Busicom 141-PF desktop calculator, imported to the USA and rebranded as a "Unicom 141P"
A Busicom 141-PF, imported to the USA and rebranded as a “Unicom 141P”. (Image courtesy of Michael Holley.)

Outwardly, the 141-PF was, and I cannot emphasise this enough, a very boring calculator. Rather than use an electronic display, it printed its calculations onto a reel of paper tape, a feature beloved of accountants, but that was the extent of its novelty. Its plastic casing was redolent of cash registers and mundanity.

Yet the 141-PF opened the door for programmable CPUs to be used in more exotic calculators. American readers of a certain age will remember the TI-81 graphing calculators they encountered in high school math class, which benefited from a CPU called the Zilog Z80.5 The Z80 had been designed by Federico Faggin, one of the 4004’s creators, and it was a roaring success: Z80 chips went on to be used not only in calculators, but also games consoles, home computers (such as the Radio Shack TRS-80 and the Sinclair ZX Spectrum), and a host of other electronic devices.6,7

The Z80 was so successful that it stayed in production for almost half a century; only now, forty-eight years after its development, is manufacturing being wound up.7 In recent years, Z80s were often put to use in “embedded” systems — that is, as general-purpose chips to drive special-purpose devices such as MP3 players, home appliances and aircraft electronics — but this less glamorous occupation should not distract us from the Z80’s staggering longevity. This is a CPU from the dawn of modern computing; a dinosaur that managed to escape extinction. It stayed in production longer than the Ford Model T or the Boeing 707, two other pioneers fêted in their spheres, and, indeed, longer than any of the calculators and computers that it powered.

There isn’t necessarily a pithy anecdote or lesson to be learned here. But if I take away anything from the Z80’s long and productive life, it’s that we live at a time where it is possible to view a sliver of silicon overlaid with microscopically fine circuitry as a commodity — a nugget of logic and memory and wiring ready to be dropped into this gadget and that one without worrying too much about the mechanical and scientific advancements that let us fabricate it in the first place. We are living in a time of wonders, in other words, but it is easy to forget it.


Neatly (?) bookending this post, towards the end of Empire of the Sum I write about what happened to the calculator after its time in the sun. The answer, broadly, is computers. Computers happened, and software happened, and spreadsheets happened. Yet the calculator did not die.

If you pick up your smartphone, you’re very likely to find a calculator app at your fingertips.* The same goes for Windows PCs and Apple’s Macs — in fact, a calculator application of one sort or another has been present on most computers since 1970 or before, when Unix’s dc, or “desk calculator”, program was written for the PDP-11 minicomputer.8

Even so, there is at least one notable calculation desert in today’s computing landscape: at the time of writing, Apple’s popular iPad line of tablet computers lacks a built-in calculator app. Happily, however, this is soon to be remedied. According to the MacRumors website, Apple will incorporate a calculator app in the next big update to its iPadOS operating system, due around September this year.

Is the calculator dead? No, not by a long shot. And if you happen to own an iPad, it is even undergoing something of a resurrection.

1.
Kirschhock, Maximilian E., Helen M. Ditz, and Andreas Nieder. “Behavioral and Neuronal Representation of Numerosity Zero in the Crow”. Journal of Neuroscience 41, no. 22 (June 2, 2021): 4889-4896. https://doi.org/10.1523/JNEUROSCI.0090-21.2021.

 

2.
Starr, Michelle. “Crows Can Actually Count Out Loud, Amazing New Study Shows”. ScienceAlert, sec. nature.

 

3.
Liao, Diana A., Katharina F. Brecht, Lena Veit, and Andreas Nieder. “Crows ‘count’ the Number of Self-Generated Vocalizations”. Science 384, no. 6698 (May 24, 2024): 874-877. https://doi.org/10.1126/science.adl0984.

 

4.
IPSJ Computer Museum. “Busicom 141-PF”. Accessed October 8, 2021.

 

5.
Woerner, Joerg. “Texas Instruments TI-81”. Datamath Calculator Museum.

 

6.
“Chip Hall of Fame: Zilog Z80 Microprocessor”. IEEE Spectrum, June 2017.

 

7.

 

8.
Ritchie, D. M. “The UNIX System: The Evolution of the UNIX Time‐sharing System”. AT&T Bell Laboratories Technical Journal 63, no. 8 (1984). https://doi.org/10.1002/j.1538-7305.1984.tb00054.x.

 

*
You may not own a pocket calculator, in other words, but it is a fair bet that you have one in your pocket anyway. 

Miscellany № 102: books!

In the second of this miniseries of post-deadline catch-ups (the first dealt with punctuation), I’ve collected some links on the subject of books.


First is a recent exhibition at Harvard’s Houghton Library, called “Marks in Books”, which has, sadly, run its course. But John Overholt, a curator of early books and manuscripts at Houghton, writes to say that the exhibit was adapted from a 1984 exhibition on the same subject and that the catalogue of that earlier incarnation is available online.

And that catalogue, my goodness. The introduction, penned by a Harvard librarian named Roger E Stoddard, reads as if it was the prologue to one of MR James’s celebrated ghost stories,* which, for the uninitiated, often begin with a hapless academic uncovering some cryptic historical artefact, and which are invariably told with a kind of gentle stuffiness that belies their unsettling plots:

In the spring of 1973 during my bi-annual acquisitions trip I visited in London the premises of E. P. Goldschmidt Ltd. as two generations of Harvard librarians had before me. Pulling down and leafing through books is one of the disciplines of acquisitions work, so when I came upon a small folio bound in reverse calf, I took it and opened it even though its binding signified business or law rather than my desiderata, arts and sciences. Opening up the book revealed the most intense patterns of decoration and annotation that I had ever encountered in a sixteenth-century book.

In James’s hypothetical story, the protagonist would now find himself transported to early modern Germany, or beset by some hideous ancient creature, but Stoddard is luckier: he has found a 1509 work decorated with paragraph marks, rubricated capital letters, manicules and copious reader’s notes — glosses, marginalia, summaries and index words.

If Stoddard’s introduction doesn’t hold your attention, the rest of the catalogue will. There are examples here of every kind of post-hoc mark in a book that you might ever expect to meet: pilcrows, asterisks, obeli, hyphens of all sorts, catchwords, illustrations, comments, notes, signatures, stamps, and translations. I wish I’d known about it while writing Shady Characters! Have a read, if you can; it’s the next best thing to having seen the exhibition itself.

A spread of two pages from a printed book (Apple-Blossoms: Verses of Two Children, 1879) decorated by hand with flowers, grasses and bees.
This hand-painted spread comes from a book (Apple-Blossoms: Verses of Two Children, 1879) that formed part of the Houghton Library’s “Marks in Books” exhibition. (Image courtesy of the Houghton Library.)

And if you manage to consume the “Marks in Books” catalogue and still have a yearning for marginalia afterwards, consider Micheline White’s illuminating (I’m sorry) 2013 essay, published by Cambridge University Press, on illuminations and other marks in books distributed by Katherine Parr, Henry VIII’s last wife. I am in no way an expert on the quote-late Henrician court-unquote, but White’s paper nevertheless contains lots of intriguing details about one particular decorated book and has a host of manicules to ogle, too.


You wait ages for an article on the advent of paper in medieval bookbinding and then two come along at once. In November and December last year, Yungjin Shin, a conservator at the Thaw Conservation Center of New York’s Morgan Library, published a pair of articles on how bookbinders managed the transition from parchment to paper by means of reinforcing parchment strips. Read them here and here; Shin wears her expertise lightly, and, yet again, I wish I’d had access to her articles as I wrote The Book!


In “Case-endings and Calamity” at the London Review of Books, Erin Maglaque reviews a new book on the life of Aldus Manutius, the Venetian printer whose books established many of the traditions still followed by book designers today. Her review is a great read, and, if it is anything to go by, Aldus Manutius: The Invention of the Publisher by Oren Margolis should be excellent too.


As a postscript to my last post, I was saddened to read that the St Andrews bookshop J&G Innes has closed its doors for the final time. Quite apart from the striking inscription above the door and its Arts and Crafts design (have a look at them here, in an earlier post), it was, apparently, the town’s oldest independent bookseller. It’s a pity to see such a venerable bookshop close up shop — a reminder that we need to support our local businesses if we don’t want them to suffer the same fate.

*
Standard Ebooks, a volunteer-run website that produces some of the best ebook editions of classic works, has what looks to be a fairly comprehensive collection of James’s work. I urge you to read it! 
†
“Reverse calf” being calf leather used with the flesh side outward that has been roughened slightly to resemble suede. 

Miscellany № 101: back to our scheduled programming

And you’re back in the room!

I recently submitted the manuscript for my next book, Face with Tears of Joy: a Natural History of Emoji, to my editor, Brendan Curry, at W. W. Norton. This one was a bit of a whirlwind: Empire of the Sum was published less than a year ago, so writing time has been short. Add in a recent relocation from Birmingham, England to Linlithgow, Scotland (the birthplace of Mary Queen of Scots, no less), along with all of the attendant upheaval with jobs and schools and houses, and it has not been a restful few months 😅

That said, Face with Tears of Joy was a fun book to write, harking back, in some ways, to the style and content of Shady Characters. You could even call it a sequel, if you like. One thing the two books definitely shared was a reliance on a community of enthusiasts who were ready to help whenever I needed to dive into some obscure aspect or another of the subject at hand. Emoji are not as old as punctuation (or are they?), but there are already strong traditions of emoji historiography, lexicography and archaeology — to the extent that I was rewriting the first few chapters of the book until the last day before my deadline, as older and older emoji were uncovered.*

All of this is to say that I’m simultaneously happy to have had the opportunity to write Face with Tears of Joy while also very glad to have some down time on the book-writing front.

Now that I have some spare time to play with, I’m going to try to get through some of my backlog of links. We kick off this week with a few recent stories and articles on punctuation. I’ll post similar round-ups for books and calculators in the coming weeks, but for now, off we go!


Here in the UK, North Yorkshire council has announced that it will no longer use apostrophes when making new street signs. The Guardian presents the example of “St. Mary’s Walk”, and says that the change is supposed to make it easier to handle street name searches within their databases.

We saw a very similar story play out back in 2013. Then, it was Mid Devon council doing the apostrophe down, and, with only three Mid Devon street names actually in possession of an apostrophe, I was broadly on their side. A decade on, and I am minded to take the opposite view. As a software engineer by trade, I am perplexed by North Yorks’s justification for dropping apostrophes: yes, a rogue apostrophe can play havoc with a carelessly written computer program, but dealing with punctuation in English language texts is just part of the cost of doing business. If you’re going to forgo apostrophes, do it for principled reasons, not because they are moderately annoying for the IT crowd.
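
To put a little flesh on that claim: the standard way to stop an apostrophe from playing havoc with a database query is simply to pass the street name as a parameter rather than pasting it into the SQL text. A minimal sketch using Python’s built-in sqlite3 module, with an invented table for the purposes of illustration:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE streets (name TEXT)")

    # The apostrophe needs no special treatment when the value is passed
    # as a parameter instead of being spliced into the query string.
    conn.execute("INSERT INTO streets (name) VALUES (?)", ("St. Mary's Walk",))

    row = conn.execute(
        "SELECT name FROM streets WHERE name = ?", ("St. Mary's Walk",)
    ).fetchone()
    print(row[0])  # St. Mary's Walk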

But perhaps I shouldn’t worry too much. Within days of that 2013 episode, Mid Devon’s council leader had reversed the council’s decision, saying he found it “[un]acceptable that incorrect grammar was being used on the council’s street signs.” And in 2014, Cambridge city council would institute and then roll back a similar apostrophe ban after a public outcry. I have my fingers crossed that North Yorks will soon see the light.


Strong Language, the blog on sweary language edited by Stan Carey, is always a good read. In this entry from last year, Nancy Friedman explores the cr**t*ve use of ast*r*sks at a zeitgeisty style newsletter called “Blackbird Spyplane”. Friedman’s article is worth your time, as is dipping your toe into the cheery linguistic brutalism of Blackbird Spyplane itself, but here is the scoop behind Blackbird’s asterisk usage, as told to Friedman by Blackbird’s editor, Jonah Weiner:

1. I learned from listening to [the podcast] Time Crisis with Ezra Koenig that a good-natured upbeat conversation that’s full of bleeped curses is just funny on a formal level, and I wanted to try to simulate that in print, and relatedly 2. I tend to use so many curses that it might risk feeling a bit too harsh in aggregate without the asterisks softening the effect.

There you have it.


In writing Face with Tears of Joy, I spent a lot of time in the weeds of the Unicode standard, the document that governs how computers encode and exchange text. It is, believe it or not, a fascinating subject, and that is at least partly because of the sheer scale of the endeavour: to encode every character that humans habitually write with, or habitually wrote with. That ambition leads to occasional anomalies: emoji are one, for reasons that I cover in my book, and Unicode’s Japanese “ghost characters”, where seemingly meaningless characters have somehow made it into the standard, are another. Paul O’Leary McCann explains how it happened.


Back in March 2023, I visited St Andrews in Fife, Scotland, and was charmed by the faux-medieval premises of J&G Innes, booksellers. I was also intrigued by a sign above their door:

A picture of a painted sign that reads: "Here stood the house of Bailie Bell, who, before 1744, was an eager co-worker with Alexander Wilson, the father of Scottish type-founding, and John Baine in whose type-foundry in Philadelphia the first $ sign was cast in 1797."
The sign above the door at J&G Innes, St Andrews. (Photo by the author.)

That sign took me down a historical rabbit-hole to discover how this shop in Fife could be related to the earliest printed American dollar sign. (The full story is in that earlier article.) One thing I wasn’t able to do was pin down the origins of the dollar sign itself, but long-time Shady Characters reader Alex Jay has since got in touch with an origin story that I hadn’t known about at the time.

Alex pointed me towards an article in The Business Man’s Magazine, November 1906, in which writer E. L. Wilson discusses a book published in 1797 — the same year as the first printed dollar sign, and barely two decades after the nascent USA had declared independence. The author of that book, called the “American Accomptant”, described how the States’ many and varied currencies could be reconciled with one another, and called out the symbols that should be used to denote the resultant “federal currency”. Those marks were one slash (/) for cents, two slashes for dimes and an ‘S’ overlaid by two slashes for a dollar. The slashes, Wilson theorises, may have come from Britain’s habit of separating shillings and pence with a slash (“5/6”), while the ‘S’, rather more tenuously, is speculated to be a way of distinguishing American dollars from British pounds, which were usually abbreviated ‘L’ after the Roman libra pondo, or “pound in weight”.

Was Wilson, writing in 1906, onto something here? Answers in the comments, please!


Finally, I must say thank you to all of the people who have written in about Empire, and who continue to write in about The Book and Shady Characters. It is always a treat to hear about readers’ experiences with calculators, books, punctuation and typography. Information technologies all, as I realised the other day, each in its own unorthodox way. Please keep them coming!

*
A special hat-tip to Matt Sephton for some juicy tidbits on that front. 
†
Never let it be said we don’t love a street sign here at Shady Characters

Shady Characters on Alan Alda’s Clear and Vivid podcast

I’m still pinching myself, but recently I had the distinct pleasure of appearing on the Clear and Vivid podcast, hosted by the great Alan Alda. I knew of Alan’s work as an actor and writer from the likes of M*A*S*H*, of course, but I hadn’t known that in recent years he has moved into the world of science communication, not least with the foundation of the Alan Alda Center for Communicating Science at Stony Brook University, New York.

Clear and Vivid is the podcast arm of Alan’s science communication work, and I was very happy to be able to contribute on the subjects of punctuation and writing. We even took a little detour into the history of counting. Have a listen here, and let me know what you think!

*
There’s a title for connoisseurs of unusual typographical marks, if ever I saw one.