Shady Characters on AMSEcast: a podcast about calculators

I’ve had nuclear energy on the mind recently — a product of watching Oppenheimer, perhaps, and also the UK government’s newfound interest in nuclear power in the interest of combatting climate change. Apropos of all that, then, I was happy to appear on a recent episode of AMSEcast, the podcast of the American Museum of Science and Energy in Oak Ridge, Tennessee. The episode was hosted by the museum’s director, the genial Alan Lowe, who was kind enough to let me rabbit on at length on the subjects of counting, calculators, and computers. I really enjoyed talking to Alan, and I hope you enjoy listening too!

Miscellany № 105: blog questions challenge

There is something interesting happening with blogging. For a long time, blogs like this one were the way to opine, to share, to bloviate. Then social media came along and stole blogging’s thunder, with the average blogger gravitating towards long threads on Twitter (RIP; † ⚰; 💀; etc., etc.) or photo-heavy Instagram posts. Next came newsletters — blogs delivered by email, essentially — which finally broke the social media hegemony.

And yet, neither social media nor newsletters have ever had quite the same vibe as blogging. If you rely on social media, your posts live or die by how well they attract outrage or sympathy. If you rely on newsletters, you may be inadvertently rubbing shoulders with Nazis. In either case, the continued existence of your “platform” — your social media posts, your newsletters — depends on the whims of a company with its interests at heart, not yours.

All this is to say that I am very happy to see that blogging is having a bit of a moment, as exemplified by the so-called Blog Questions Challenge. This is a kind of internet chain letter, started by Scott Boms* on his own blog, “Documenting”, and to which I am rudely attaching myself without having been invited. The idea is that bloggers answer a few questions about where their blogs came from, how they work, and where they’re going. As such, I present to you the Shady Characters edition of the Blog Questions Challenge.


Why did you start blogging in the first place?

I wanted to write. I’m not entirely sure why, but I read a lot as a kid and books seemed to be important in some slightly mysterious way.

I eventually had an idea that there might be something interesting about the more unusual typographical marks I sometimes came across — ¶, @, *, †, § and others — which led me to write what I hoped might become the constituent chapters of a book on the subject.

Having written those “chapters”, though, I didn’t really know what to do with them. Email a literary agent? A publisher? That seemed very forward, so I started Shady Characters instead and have been writing here ever since. (The agent and the publisher came along later, so it all turned out alright in the end.)

What platform are you using to manage your blog and why did you choose it?

I’ve used WordPress all this time — almost exactly thirteen years now. It seemed like the best option at the time, with endless scope for customisation and a robust network of supporters from whom to get help and inspiration.

WordPress comes in open source and commercial flavours. The first, where you have to install and administer WordPress yourself, is what I use. The second, where a commercial company such as WordPress.com or WPEngine handles all of that for you, always seemed like a very expensive way to go about things.

Now, though, WordPress’s open source and commercial faces are coming into conflict. Matt Mullenweg, who co-founded WordPress in 2003 and has maintained the air of a benevolent dictator since then, seems to be suffering from early-onset tech leader derangement. (Perhaps it’s catching.) Mullenweg has argued that for-profit companies (his own aside, of course) which benefit from WordPress’s freely available source code should be contributing more to that same source code, despite there being no legal compulsion to do so. The resulting ructions in the WordPress world have not been reassuring.

Have you blogged on other platforms before?

I used Google’s Blogspot for a personal diary a long, long time ago.

When do you feel most inspired to write?

I am not, it is fair to say, a natural writer. I have to treat it more like a job: set up a schedule and stick to it as closely as I can, family and other obligations notwithstanding. To fuel my miscellany posts, I keep a list of interesting websites, news stories and other articles as inspiration. Occasionally, though, something will pop up that I need to write about. A recent post on generative AI was one instance of that; an exploration of the statistical frequency of punctuation marks was another.

Do you publish immediately after writing, or do you let it simmer a bit as a draft?

Occasionally, I’ll still be trying to figure out what a post is actually about as I’m in the middle of writing it. In those cases, there will be an extended period of writing and rewriting. Most other posts I publish as soon as I’ve finished them.

What are you generally interested in writing about?

I thought about this recently. For a long time, the header on the Shady Characters home page told readers to expect “unusual marks of punctuation, books and book history, and everything in between”. Now, though, with the publication of Empire of the Sum behind me and Face with Tears of Joy coming up this summer, things aren’t so clear-cut. For the moment, I’ve settled on this: “unorthodox information technologies”. I’m not sure it conveys exactly what I want it to, but it’s close enough.

Who are you writing for?

Hmm. Hmm. I would like to say that I’m writing for posterity — to help collate and collect stories, facts and other bits of information that deserve to be shown to a wider audience. But if I’m honest with myself, I’m writing for me — I’m writing because I enjoy the craft and the habit of it, and because each word written by a human being is another blow struck against the entropy of the universe. (I’m a lapsed physicist, in case it isn’t obvious.)

What’s your favorite post on your blog?

I honestly don’t know! Have a look at the Contents page and let me know what your favourite post is in the comments.

Any future plans for your blog? Maybe a redesign, a move to another platform, or adding a new feature?

In writing terms, I’d like to get back to a more regular cadence, which will be easier once the kids are a little older.

In design terms, I’ll be sticking with this design for a while longer. You can see the original one on archive.org; it lasted for around six years, and the current design is now pushing eight. Even so, I’m quite proud of it and I have no plans to change it any time soon.

The one thing I would like to change is the WordPress software that underpins the blog. It’s written in a programming language called PHP that I don’t especially enjoy using, and the shenanigans at the top of the WordPress community do not inspire confidence in WordPress itself. Grupetto.cc, my (very) occasional cycling blog, uses a system called Eleventy. It’s simpler and more flexible than WordPress, and I’ve been plotting a move to it for Shady Characters for a while. Time will tell when that happens.

Next?

Might I tag in Glenn Fleishman or Doug Wilson to give us their answers to these questions?

Thanks for reading!

*
Here are a few more examples from Jon Hicks, Rachel Andrew, Jasper Tandy and Aegir Hallmundur.

Miscellany № 104: new year, new miscellany

Hello, and welcome to 2025. Is it that time already?


The possessive apostrophe (or rather, the abuse of the possessive apostrophe) is a recurrent guest star here at Shady Characters, but usually in the English language. Recently, though, the Guardian reported that unneeded apostrophes are infecting German, too. The so-called Deppenapostroph, or “idiot’s apostrophe”, appears when an apostrophe is used to mark a German possessive — despite the fact that standard German adds an “s” on its own rather than “’s”.

Compare and contrast with the summer kerfuffle chronicled at Language Log, in which Mark Liberman summarises a spat over how to add the possessive to the surnames of those on the Democratic Party’s erstwhile presidential ticket. Is it “Harris’” or “Harris’s”? “Walz’” or “Walz’s”? All happy languages are alike, one might say; each unhappy language is unhappy in its own way.

Head to Language Log to get Liberman’s professional (and sensible) take on the matter.


In Face with Tears of Joy (available now to preorder at Amazon and Bookshop.org!), I write a little about the mysterious, blank-faced Unicode characters (□, � and others) that sometimes pop up when a computer or smartphone doesn’t support the latest emoji. I was happy to see an in-depth treatment of those same characters pop up at the website of Thomas Phinney, a typographer and font expert.

If Thomas’s name is familiar, it’s because he has in the past helped detect fraud by means of minute inspections of printed text and the application of a detailed knowledge of different fonts’ features and quirks. There’s lots to read on that subject and others at his website!


Although it didn’t start life as an emoji, the fact that the peace sign (☮️) has been inducted into Unicode’s hallowed emoji halls is an indication of how potent a symbol it is. Or perhaps, how potent a symbol it was. At the New York Times, Michael Rock isn’t sure that ☮️ carries the same weight it once did. What’s your take? Are we in danger of losing this once-contentious, once-ubiquitous symbol?


That’s all for this week! Happy new year, and may your 2025 be filled with information technologies of the most unorthodox sort.


The 2024 Shady Characters gift guide

It’s that time of the year again!

You: a discerning reader of books about unconventional information technologies (unusual marks of punctuation, say, or pocket calculators). Your friends and family: the same, naturally. But what gifts to give them this holiday season?

I am here to help. And because I am still very much on a calculator jag, we will be concentrating on some of the best pocket calculators out there — the cleverest, the longest-lived, and even just the hands-down–shucks–goshdarnit best looking models available to buy.

1. Braun ET66 reissue

A Braun ET66 calculator
A Braun ET66 calculator. (CC BY-SA 2.0 image courtesy of Pinot Dita at Wikimedia Commons.)

Dieter Rams, Braun’s totemic lead designer from 1962 until 1995, created many of the German company’s most recognisable products, from hi-fi gear to cigarette lighters and alarm clocks. But Rams also ventured into the world of the pocket calculator. From 1977 to 1987, and in partnership with a designer named Dietrich Lubs, Rams created a line of pocket calculators that embodied his motto of “less, but better”.1

The ET66 of 1987 was perhaps the apogee of the company’s calculators. To glance at one in passing is to be less than impressed — the ET66 does a very good impression of being a very average pocket calculator — but there is a consistency of shape and colour in its different elements that, on a closer look, elevates it from being merely a pocket calculator to something closer to the pocket calculator.2

In fact, so clean and logical is the ET66’s design that Apple Computer once sold them as part of the “Apple Collection”, a mail-order catalogue of third-party products that were deemed worthy to be associated with the famously exacting computer company. (Apple’s ET66 came with an Apple logo emblazoned on its top-right corner.)3 More recently, Jony Ive, Apple’s erstwhile chief designer, has cited Rams as one of his main influences — and, not coincidentally, the calculator application that ships with Apple’s iPhone is a very clear homage to Rams’s calculators.4

The original Braun calculators are collector’s items by now, but the good news is that you can buy a reissued one that faithfully reproduces the ET66 for just €59.

2. HP-12c

I’ve talked a lot about Hewlett-Packard’s seminal HP-35, both here on the blog and in Empire of the Sum, and with good reason. It was the world’s first pocketable scientific calculator, wrapped in a sensible and usable package — perhaps not quite as polished as Rams’s efforts, but distinctive and pleasing nonetheless — and there is a strong argument for it being the first “must have” electronic calculator. Perhaps even the first “must have” consumer electronic device.5

A 25th anniversary edition of the HP-12c calculator
A 25th anniversary edition of the HP-12c calculator. (CC BY 2.0 image courtesy of “striegel” on Flickr.)

Today, though, we set aside the HP-35 for another of HP’s most celebrated pocket calculators: the HP-12c of 1981 and beyond. The 12c forsook the 35’s scientific bent for the business world, helping its user to make interest rate, bond, and other financial calculations. And rather than the 35’s portrait form factor, the 12c distinguished itself with a none-more-’80s landscape layout, echoing the tiny credit-card calculators that were all the rage in the decade of its birth.6

If the 12c doesn’t have quite the same mythical reputation as its more famous sibling, it has nevertheless outlived it by a significant margin. Forty-three years after its release, the 12c is still on sale. For $49.99, you can buy a modern-day HP-12c whose colour scheme, dimensions, button layout and features are identical to those of its earliest incarnations.7 And I heartily encourage you to do so. The accountant, Rotary Club treasurer or merchant banker in your life will thank you for it — if they don’t already own one.

3. Literally any slide rule

Slide rule owned by Sally Ride, the first American woman in space.
Slide rule owned by Sally Ride, the first American woman in space. (CC0 image courtesy of the National Air and Space Museum.)

To be clear, I am not suggesting that you break into the Smithsonian and steal Sally Ride’s slide rule in particular. But slide rules in general are irresistible to any mathematically inclined human being. Compact, clever, tactile and collectible, they are also, most helpfully, almost completely unknown to anyone under the age of fifty. Once, science fiction writers wrote admiringly of the slide rule as the key to mathematical enlightenment (Robert A. Heinlein’s Have Space Suit — Will Travel8 being the most widely cited example); now, they litter antique stores and desk drawers, their magic intact but unappreciated.

Where to start when buying a slide rule? eBay is awash with slide rules large and small, ranging from common-or-garden educational varieties to more esoteric examples such as nuclear fallout calculators and agricultural fertiliser slide rules. They start at a few dollars or pounds each and go as high as you like. I’m partial to a simple bamboo slide rule, as were made by the truckload by Hemmi Slide Rule Co., Ltd, of Japan, and rebranded by many sellers around the world. Below is a very, very bad photograph of my “Post”-branded Hemmi, bought for me by my father-in-law for just $3.

A "Post"-branded slide rule made by Hemmi of Japan.
A “Post”-branded slide rule made by Hemmi of Japan. (Picture by the author.)

Believe me when I tell you that a slide rule will be the best, most unexpected gift that someone receives this year.

4. Casio S100X calculator

There have always been high-end calculators. The Busicom Handy-LE of 1971, for instance, considered to be the first true pocket calculator, was available in a sybaritic gold-plated variant.9 The slim, potentially explosive Sinclair Executive was aimed at a similarly rarefied clientele. That said, pocket calculators more generally were subject to a relentless race to the bottom. Chips got smaller and cheaper; calculator manufacturers vertically integrated or died; and prices tumbled year on year until only the most ruthlessly efficient companies remained.

Casio S100X-BU desktop calculator
The Casio S100X-BU. (Image courtesy of Casio.)

Casio, in the calculator business since the very beginning, is one of those survivors. And despite the average Casio being a plasticky denizen of the maths classroom, Casio’s line of calculators is topped by a far more refined model. Enter the S100X: a desktop calculator machined from solid aluminium, with diamond cut edges, a brushed finish, and an engraved serial number that ensures that your £359.99 copy is one of a kind. (For that kind of money, a unique serial number would be the least of my demands.)

Now, the S100X is not especially advanced in terms of its ability to actually calculate things. This is a four-function calculator with percentages, square roots and tax rates bolted on, then wrapped in a shiny aluminium case. Yet I think it says something about the calculator’s enduring status as a kind of mathematical avatar that Casio went to the trouble to dress up such a mundane device in such an overwrought package. If money were no object, the S100X would make an excellent gift.

5. Empire of the Sum

The sun rises behind a pocket calculator, whose display reads "07734"
The cover of Empire of the Sum.

Okay, okay, I am cheating. Empire of the Sum is a book, not a calculator. Even so, I hope you will consider giving it this year as a gift. Book sales are essential to keep all of this going — the blog, the books — and every copy sold helps. And if this post hasn’t convinced you that the pocket calculator is a subject worthy of your time, other books are available! Whichever one you buy, or you give, I hope its reader enjoys it.

1.
“Dieter Rams”. Design Museum. Accessed December 14, 2024.

2.
Rams, Dieter, and Dietrich Lubs. ET66 Calculator. 1987. Victoria & Albert Museum Furniture and Woodwork Collection.

3.

4.
Rams Foundation. “Jonathan Ive”. Accessed December 14, 2024.

5.
Hughes, Jim. “The HP-35”. Codex99.

6.
Hicks, David G. “HP-12C”. The Museum of HP Calculators. Accessed November 25, 2021.

7.
hp.com. “HP 12C English Calculator”. Accessed December 15, 2024.

8.
Heinlein, Robert Anson. Have Space Suit—Will Travel. New York: Pocket Books, 2005.

9.
Calcuseum. “Busicom LE120GA”. Accessed December 15, 2024.


Bombs, books and generative AI: a blunt and obvious parable

1. Loomings

You will be aware, by now, of the cresting wave of excitement about generative AI. “Artificial intelligence”, in general terms, means any computer program which displays humanlike abilities such as learning, creativity, and reasoning,* while “generative AI” applies to AI systems that can turn textual prompts into works of — well, not art, at any rate, but maybe “content” better captures the quality and quantity of the resultant material. Generative AI can produce text (in the case of, for example, OpenAI’s ChatGPT), images (Midjourney being the poster child here) or even music (like Suno).

Broadly speaking, generative AI works like this. A layered collection of mathematical functions called an artificial neural network is adjusted from its default state — “trained”, in the parlance — by a succession of sample inputs such as texts or images. Once trained, this network can respond to new inputs by performing (and I am over-simplifying here to a criminal degree) a kind of “autocomplete” operation that creates statistically likely responses informed by its training data. As such, the works created by genAI systems are simultaneously novel and derivative: an individual text or image produced by your favoured AI tool may be new, but such outputs cannot escape the gravity of their corresponding training data. For AI, there truly is nothing new under the sun.
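
For the terminally curious, the “statistically likely autocomplete” idea can be sketched in a few lines of Python. This is my own toy illustration, not how any real genAI system is built — a genuine model replaces these raw word-pair counts with billions of learned neural-network parameters — but the principle of predicting a plausible next token from training data is the same.

```python
import random
from collections import defaultdict

# "Training": count which word follows which in a (tiny) corpus.
corpus = "the cat sat on the mat and the cat ate the fish".split()
follows = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current].append(nxt)

def generate(start, length=5, seed=0):
    """Autocomplete in miniature: repeatedly sample a statistically
    likely next word, given only the word that came before it."""
    random.seed(seed)
    words = [start]
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:  # nothing ever followed this word in training
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(generate("the"))  # emits a plausible, derivative little phrase
```

Every phrase the sketch produces is “novel” in the weak sense that it may never appear verbatim in the corpus, and yet it can never escape the corpus’s gravity — the same novel-but-derivative quality, writ very small.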

Layers within a deep neural network
Input, output, and hidden layers in an artificial neural network. (CC BY-SA 4.0 image courtesy of BrunelloN.)

Despite the mundane underpinnings of most AI tools (“wait, it’s all just data and maths?”), many of them are genuinely good at certain tasks. This is doubly true when a network can be trained on a large enough set of inputs — say, for example, a slice of the billions of publicly-available web pages, a newspaper’s archive, or a well-stocked library. All of the most successful genAI systems have benefitted from opulent training datasets so that, for instance, ChatGPT is an eager, helpful and exceptionally knowledgeable conversationalist. Midjourney and Suno, too, can surprise with their ability to create pictures and songs that are at least halfway convincing as simulacra of human works. From a certain point of view, and perhaps with a dab of Vaseline on the lens, the promise of AI has already been fulfilled.

2. Pitfalls

Yet many avenues of criticism remain open to the generative AI sceptic. Most of them are valid, too.

To start at the beginning, training is a slow and intensive process. A neural network, or “model”, for text generation might have to consume billions of pieces of text to arrive at a usable state. Meta’s “Llama 3” system, for instance, which the company has made publicly available, was trained on 15 trillion unique fragments of text.3 Llama 3.1 comes in different flavours, but its largest version, the so-called “405B” model, contains more than 400 billion individual mathematical parameters.4 One paper estimates that OpenAI has spent at least $40 million training the model behind ChatGPT and Google $30 million on Gemini, while training of bigger models in the future may run to billions.6

Making use of those trained models is expensive, too. An OECD report on the subject says that a single AI server can consume more energy than a family home, and it guzzles water into the bargain: a single conversation with ChatGPT is thought to use up around 500ml of water for power generation and cooling alone.7 When you converse with an AI model, you might as well have poured a glass of water down the drain.

An image generated by Microsoft Designer in response to the prompt “Videogame character that resembles Mario the plumber”.

GenAI’s thirst for training data begets another problem: those enormous training datasets often contain copyrighted works that have been used without their owners’ permission.§ The New York Times and other papers are currently suing Microsoft and OpenAI for copyright infringement, to pick only one prominent example.8 AI companies protest that simply training a model on a copyrighted work cannot infringe its copyright, since the work does not live on as a direct copy in the trained model — except that researchers working for those very same AI companies have debunked that assertion. Employees of Google and DeepMind, among others, have successfully tricked image-generation tools into reproducing some of the images on which they were trained,9 and it isn’t very hard to convince AI tools to blatantly infringe on copyrighted works.

This leads to a related criticism. For the likes of Google to have been ignorant of the training data hiding in their AI models is a sign that it is, at a fundamental level, really difficult to comprehend what is going on inside these things.10 The basic operating principles of a given model will be understood by its developers, but once that model’s billions of numerical parameters have been massaged and tweaked by the passage of trillions of training inputs, there is too much data in play for anyone to truly understand how it is all being used. In my day job I work with medical AI tools, and the doctors who use them worry acutely about this issue. The computer can tell them that a patient may have cancer, but it cannot tell them why it thinks that. The “explainability” of AI still has a long way to go.

Then there are the hallucinations. This evocative term came out of computer vision work in the early 2000s, where it referred to the ability of AI tools to add missing details to images. Then, hallucination was a good thing, since this was what these tools were trying to accomplish in the first place. More recently, however, “hallucination” has come to mean the way that genAI will sometimes exhibit unpredictable behaviour when faced with otherwise reasonable tasks.11 Midjourney and other image generators were, for a long time, prone to giving people extra fingers or limbs.12 An AI tool trialled by McDonald’s put bacon on top of ice cream and mistakenly ordered hundreds of chicken nuggets.13 OpenAI’s speech-to-text system, Whisper, fabricated patients’ medical histories and medications in an alarming number of cases.14 True, these are not hallucinations in the human sense of the word. It’s more accurate to call them ghosts in the probabilistic machine — paths taken through the neural net which reveal that mathematical sense does not always equate to common sense. But whatever they are called, the results can be startling at best and dangerous at worst.

There are also philosophical objections. At issue is this: humans, it turns out, want to work. And not only to work, but to do good, useful work. Back in the nineteenth century, the Luddites worried about skilled workers being dispossessed by steam power.15 The proponents of the fin de siècle Arts and Crafts movement, too, epitomised by the English poet and textile designer William Morris,|| agitated for a world where people made things with their own two hands rather than submit to the stultifying, repetitive labour of the production line. Whether for economic or spiritual reasons, both recognised the value of human craft.

More recently, the anarchist and anthropologist David Graeber expressed similar sentiments in a widely-read 2013 essay, “On the Phenomenon of Bullshit Jobs”. In it, he notes that the twentieth century’s relentless increases in productivity have gifted us not fewer working hours or a boom in human happiness but rather a proliferation of low-paid, aimless jobs, in which workers twiddle their thumbs in make-work roles of whose pointlessness they are acutely aware.16

Neither David Graeber nor his historical counterparts were complaining about AI, but the point stands. If corporations view genAI as a means to replace expensive human workers (and to be clear, that is exactly how they see it,17 just like mill owners and car manufacturers before them), then we are in for yet more “bullshit jobs”.

None of these complaints should be all that surprising to us. All computers eat energy and data and turn them into heat and information, so if we force-feed them with gigantic quantities of the former then we will get commensurately more of the latter. And if history has taught us anything, it is that technologies are tools rather than panaceas, and that work-free utopias are far less durable than, say, exploitative labour laws and unequal distributions of wealth.

3. The bomb pulse

To backtrack to an earlier and more optimistic point, in many cases AI tools can be remarkably good at their jobs. A friend of mine likened the large language models that underpin most text-based genAI systems to an intern who has read the entire internet: they will not always make the same deduction or inference that you would have done, but their breadth of knowledge is astounding. It is hard to argue that we have built something incredible here. If we can stomach the cost in power and in water, then genAI promises to not only solve problems for us, but to do so at a ferocious rate — and here is where a new and interesting class of AI problems arises. To see why, let me take you back to the heyday of another morally ambiguous technological marvel.

Nuclear weapons testing, where the likes of the USSR, USA and UK exploded hundreds of atomic bombs in the air and underground, reached a peak in 1961-62 after which a test ban treaty led to a dramatic slowdown.18 But the spike in tests before the ban, which contaminated Pacific atolls and Kazakh steppe alike, sent vast clouds of irradiated particles into the atmosphere that would leave a lasting mark on every living thing on earth.19,20

That mark was the work of a massive rise in the amount of atmospheric carbon-14, a gently radioactive isotope of carbon that filters into human and animal food chains through photosynthesis and the exchange of carbon dioxide between the air and the ocean. Carbon-14 is essential to radiocarbon dating, because it allows researchers to match the amount of carbon-14 in a material (taking into account the rate at which it decays into nitrogen-14) with a calibration curve that shows how much carbon-14 was floating around in earth’s biosphere at any particular point in the past.21 It is a powerful and useful tool, with everyone from archaeologists to forensic criminologists using it to determine the age of organic matter.
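
For the quantitatively minded, the decay arithmetic at the heart of radiocarbon dating fits in a handful of lines. The sketch below (my own, simplified) computes a conventional, uncalibrated radiocarbon age from a measured carbon-14 fraction using the standard Libby half-life of 5,568 years; real-world dating then maps that raw age onto a calendar date via the calibration curve.

```python
import math

# Conventional radiocarbon ages are defined using the Libby
# half-life of 5,568 years, giving a mean life of roughly 8,033 years.
LIBBY_MEAN_LIFE = 5568 / math.log(2)  # ≈ 8033 years

def radiocarbon_age(fraction_modern):
    """Raw (uncalibrated) age in years, from the fraction of
    carbon-14 remaining relative to the modern reference level."""
    return -LIBBY_MEAN_LIFE * math.log(fraction_modern)

# A sample with half its carbon-14 left dates to one Libby half-life:
print(round(radiocarbon_age(0.5)))   # 5568
print(round(radiocarbon_age(0.25)))  # 11136 -- two half-lives
```

The calibration curve’s job is to turn that raw figure into a calendar date — and the sharper the curve, the sharper the date.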

A graph showing the "bomb pulse" of atmospheric carbon-14
The “bomb pulse” of atmospheric carbon-14. (Public domain image courtesy of Hokanomono.)

The so-called bomb pulse of carbon-14, then, which was measured and quantified to the minutest detail, made the calibration curve much, much more accurate. It became possible to date just about every living organism born after 1965 to an accuracy of just a few years — a dramatic improvement on the pre-atomic testing era, when carbon-14 concentrations ebbed and flowed across centuries and millennia in a confusing and sometimes contradictory manner.22 If a mushroom cloud could ever be said to have had a silver lining, it was the bomb pulse.

Just as nuclear testing pushed carbon-14 through the roof, generative AI is dramatically increasing the quantity of artificially-generated texts and images in circulation on the web. For instance, a recent study found that almost half of all new posts on Medium, the blogging platform, may have been created using genAI.23 Another report estimated that billions of AI-generated pictures are now being created every year.24 We are living through a bomb pulse of AI content.

Unlike its nuclear counterpart, however, the explosion of AI content is making it harder, not easier, to make observations about our world. Because AI apps are incapable of meaningfully transcending their training data, and because a non-trivial proportion of AI outputs can be objectively quantified as gibberish, the rise in AI-generated material is polluting our online data rather than enriching it.

To wit: Robyn Speer, the maintainer of a programming tool that looks up word frequencies in different languages, mothballed the project in September 2024, writing that “the Web at large is full of slop generated by large language models, written by no one to communicate nothing.”25 Elsewhere, Philip Shapira, a professor at the University of Manchester in the UK, connected an inexplicable rise in the popularity of the word “delve” with ChatGPT’s tendency to overuse that same word.26,27 There is already enough AI “slop” out there to distort online language.

Indeed, there is now so much publicly available AI-generated data that training datasets are being contaminated by synthetic content. And if that feedback loop occurs too often — if AI models are repeatedly trained on their own outputs, or those from other models — then there is a risk that those models will eventually “collapse”, losing the breadth and depth of knowledge that makes them so powerful. “Model Autophagy Disorder”, as this theoretical pathology is called, is mad cow disease for AI.28 Where herds of cattle were felled by malformed proteins, AI models may be felled by malformed information.

4. Burning books

And so, if I may be permitted (encouraged, even) to bring things to a point, I will submit as concluding evidence a story published by the tech news outlet 404 Media. Emanuel Maiberg’s piece, “Google Books Is Indexing AI-Generated Garbage”, reports that, well, Google Books appears to be indexing AI-generated garbage. (Never change, SEO.) Searching for the words “as of my last knowledge update” — a tell, like “delve”, of OpenAI’s ChatGPT service — Maiberg uncovered a number of books bearing ChatGPT’s fingerprints.29 That was in April this year, and Maiberg noted with cautious optimism that the AI books were mostly hidden farther down Google’s search results. The top results, he said, were mostly books about AI.

In writing this piece I tried Maiberg’s experiment for myself. In the six months, give or take, since his article was published, the situation has reversed itself: on the first page of my search results, every entry bar one was partly or wholly the work of ChatGPT. (The lone exception was, as might be hoped, a book on AI.) Most results on the second page were also the products of AI. The genAI bomb pulse would appear to be gathering pace.

I’ll be honest: I had been ambivalent about generative AI until I hit the “🔍” button at Google Books. I’ve tried ChatGPT and Google Gemini as research tools, but despite their breadth of knowledge they sometimes missed details that I felt they should have known. I’m also one of those humans who enjoys doing their own work, so delegating writing or programming tasks to AI has never really been on the cards. If I need help with a problem as I write a book or design a software system, I’ll run a web search or ask a human for help rather than an environmentally ruinous autocomplete algorithm on steroids. Not everyone has the same qualms, though, or the same needs, and that’s fine. But this AI-powered dilution of Google Books still feels like a step beyond the pale.

For millennia, books were humanity’s greatest information technology. Portable, searchable, self-contained — but more than that, authoritative in some vague but potent way, each one guarded by a phalanx of editors, copyeditors, proofreaders, indexers and designers. Certainly, none of my own books could have come to pass without many other books written before them: I’ve lost count of the facts in them that rely upon some book or another found after a trawl through Google Books, the Internet Archive, or a physical library.

Books have never been perfect, of course. No artefact made by human hands ever could be. They can be banned or burned or ignored. They can go out of date or out of print. They can pander to an editor’s personal whims or a publisher’s commercial constraints, or be axed entirely on a lawyer’s recommendation. Google Books can be criticised too — ironically, it is accused mostly of the same lax approach to copyright as many AI vendors — yet it is an incredibly useful tool, acting as an omniscient meta-index to many of the world’s books. It would take a brave protagonist, I think, to argue that books have not been, on balance, a positive force.

None of this is to criticise AI itself. Costs will decrease as we figure out how to manage larger and larger datasets. Accuracy will increase as we train AI models more smartly and tweak their architectures. Even explainability will improve as we pry open AI models and inspect their internal workings.

Ultimately, the problem lies with us, as it always does. Can we be relied upon to use AI in a responsible way — to avoid poisoning our body of knowledge with the informational equivalent of malformed proteins? To remember that meaningful, creative work is one of the joys of human existence and not some burden to be handed off to a computer? In short, to spend a moment evaluating where AI should be used rather than where it could be used?

What’s left for me at the end of all this is a creeping, somewhat amorphous unease that we are unprepared for the ways in which generative AI will change books and our ability to access them. I don’t think AI will kill books in any real sense, just as ebooks have not killed physical books and Spotify has not killed radio, but unless we can restrain our worst instincts there is a risk that books — and by extension, knowledge — will emerge cheapened and straitened from the widening bomb pulse of generative AI.

1.
Mirzadeh, Iman, Keivan Alizadeh, Hooman Shahrokhi, Oncel Tuzel, Samy Bengio, and Mehrdad Farajtabar. “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models”. arXiv, October 7, 2024. https://doi.org/10.48550/arXiv.2410.05229.

 

2.
Doshi, Anil R., and Oliver P. Hauser. “Generative AI Enhances Individual Creativity But Reduces the Collective Diversity of Novel Content”. Science Advances 10, no. 28 (July 12, 2024): eadn5290. https://doi.org/10.1126/sciadv.adn5290.

 

3.

 

4.

 

5.
Meta Llama. “Llama 3.2”. Accessed October 25, 2024.

 

6.
Cottier, Ben, Robi Rahman, Loredana Fattorini, Nestor Maslej, and David Owen. “The Rising Costs of Training Frontier AI Models”. arXiv, May 31, 2024. https://doi.org/10.48550/arXiv.2405.21015.

 

7.

 

8.
Robertson, Katie. “8 Daily Newspapers Sue OpenAI and Microsoft Over A.I.” The New York Times, sec. Business.

 

9.
Carlini, Nicholas, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, and Eric Wallace. “Extracting Training Data from Diffusion Models”. arXiv, January 30, 2023. https://doi.org/10.48550/arXiv.2301.13188.

 

10.
Burt, Andrew. “The AI Transparency Paradox”. Harvard Business Review.

 

11.
Maleki, Negar, Balaji Padmanabhan, and Kaushik Dutta. “AI Hallucinations: A Misnomer Worth Clarifying”. arXiv, January 9, 2024. https://doi.org/10.48550/arXiv.2401.06796.

 

12.

 

13.

 

14.
Koenecke, Allison, Anna Seo Gyeong Choi, Katelyn X. Mei, Hilke Schellmann, and Mona Sloane. “Careless Whisper: Speech-to-Text Hallucination Harms”. arXiv, May 3, 2024. https://doi.org/10.48550/arXiv.2402.08021.

 

15.
Conniff, Richard. “What the Luddites Really Fought Against”. Smithsonian Magazine.

 

16.
Graeber, David. “On the Phenomenon of Bullshit Jobs”. STRIKE! Magazine.

 

17.
Manyika, James, and Kevin Sneader. “AI, Automation, and the Future of Work: Ten Things to Solve for”. McKinsey & Company.

 

18.
“Radioactive Fallout”. In Worldwide Effects of Nuclear War.

 

19.
Nuclear Museum. “Marshall Islands”. Accessed October 25, 2024.

 

20.
Kassenova, Togzhan. “The lasting toll of Semipalatinsk’s nuclear testing”. Bulletin of the Atomic Scientists (blog).

 

21.
Hua, Quan. “Radiocarbon Calibration”. Vignette Collection. Accessed October 25, 2024.

 

22.
Hua, Quan. “Radiocarbon: A Chronological Tool for the Recent past”. Quaternary Geochronology, Dating the Recent Past, 4, no. 5 (October 1, 2009): 378-390. https://doi.org/10.1016/j.quageo.2009.03.006.

 

23.
Knibbs, Kate. “AI Slop Is Flooding Medium”. Wired.

 

24.

 

25.
Speer, Robyn. “wordfreq/SUNSET.md”. GitHub.

 

26.
Shapira, Philip. “Delving into ‘delve’”. Philip Shapira (blog).

 

27.

 

28.
Nerlich, Brigitte. “From contamination to collapse: On the trail of a new AI metaphor”. Making Science Public (blog).

 

29.
Maiberg, Emanuel. “Google Books Is Indexing AI-Generated Garbage”. 404 Media, April 2024.

 

*
Although as a new paper demonstrates, generative AI tools are almost certainly not performing any sort of robust logical reasoning. Researchers at Apple found that insignificant changes to input prompts can result in markedly different (and markedly incorrect) responses.1 
Indeed, the research bears this out. Novels written with the help of AI were found to be better, in some senses, than those without — but they were also markedly more similar to one another.2 
That said, there are many smaller models out there. Llama’s own “1B” model can be run on a smartphone.5 
§
I am entirely sure that my writing has been used to train sundry AI models, and if you have ever published anything online or in print, yours likely has been too. 
||
Eric Gill, author of An Essay on Typography, was another leading light. 
An information technology so influential, one might even consider writing a book about it. 