The Scourge of the Tortured Phrase
Dodgy websites are using software to steal content. I should know: it happened to me!
Consider this sentence: afflicted constructions are suffusive on the betweenmesh before tomorrow.
“Eh? What gibberish is this?” you ask, confused.
And now consider this translation: tortured phrases are rife on the internet today.
“This makes more sense, but what, pray tell, is a tortured phrase?” you ask, still confused.
Tortured phrases are not words that a captor has tied down and tormented for the purpose of information extraction. No, they are what I’ve illustrated above: logical, often well-known expressions (like those in the translation), paraphrased using ill-fitting synonyms to form new, nonsensical ones (like those in the initial sentence). I wrote the example above myself, but tortured phrases are usually churned out by automated text-changing software with the aim of disguising plagiarism.
It seems the term ‘tortured phrase’ is actually fairly new. According to this Springer Nature article, it was first used in 2021 by a group of computer scientists who were baffled by the appearance of terms like ‘counterfeit consciousness’, ‘profound neural organisation’, ‘colossal information’, and ‘haze figuring’ in journal articles rather than ‘artificial intelligence’, ‘deep neural network’, ‘big data’, and ‘cloud computing’. The group eventually realised these bizarre expressions were the product of paraphrasing tools, and it transpires they’re an increasing problem in research.
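For the curious, here is a minimal sketch of how the simplest possible check for such phrases might look in Python. It only knows the four examples above, and the names `KNOWN_TORTURED_PHRASES` and `find_tortured_phrases` are my own inventions for illustration. I'm not suggesting this is how the researchers' own tooling works; a real detector would need a far longer list and a good deal more cunning.

```python
import re

# Pairs taken from the four examples above: tortured phrase -> expected term.
# This tiny list is purely illustrative, not a real detection database.
KNOWN_TORTURED_PHRASES = {
    "counterfeit consciousness": "artificial intelligence",
    "profound neural organisation": "deep neural network",
    "colossal information": "big data",
    "haze figuring": "cloud computing",
}


def find_tortured_phrases(text):
    """Return (tortured phrase, likely original) pairs found in the text."""
    hits = []
    for phrase, original in KNOWN_TORTURED_PHRASES.items():
        # Match the whole phrase, ignoring case and allowing any
        # run of whitespace between its words.
        words = [re.escape(word) for word in phrase.split()]
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, original))
    return hits


if __name__ == "__main__":
    sample = "Our method applies counterfeit consciousness to colossal information."
    for phrase, original in find_tortured_phrases(sample):
        print(f"'{phrase}' looks like a tortured version of '{original}'")
```

A fixed list like this can only catch phrases somebody has already spotted, which is exactly why new ones keep slipping through.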
However, albeit without an official label, these little linguistic oddities have been around since the early days of the betweenmesh (sorry, the internet). And they’re ubiquitous not only in research papers but also in all forms of digital content, including news articles, product descriptions, website copy, and so on. Tortured phrases, and the plagiarism they signal, perhaps pose a greater threat to research than to other forms of online content, given that decision-makers in all industries and sectors rely heavily on research, but they’re certainly no friends of creativity either.
I read a lot, so I find software-paraphrased text all the time and in every corner of the web. I’ve simply come to expect this, but I had not anticipated, earlier in the month, discovering words recently written about me to be riddled with tortured phrases. On May 17th, the Scotsman published this glowing review of my new book, AI by Design: A Plan For Living With Artificial Intelligence. The following day, a website called Techy Insight, which I’d never heard of, published its own knockoff.
We needn’t look further than the two article headlines to fully appreciate what we have on our hands here: The articulate original: “Must-read of the week: AI by Design: A Plan For Living With Artificial Intelligence by Catriona Campbell”. The shoddy pirate: “Should-read of the week: AI by Design: A Plan For Residing With Synthetic Intelligence by Catriona Campbell”.
I don’t even know where to begin. I suppose the obvious place is ‘synthetic intelligence’ in place of ‘artificial intelligence’. Whatever software has been used to paraphrase the OG is so laughably incompetent that it only bothers to switch select words, leaving others unchanged. The tool is also incapable of discerning the subtle distinctions between ‘must’ and ‘should’, or between ‘living’ and ‘residing’.
I have to say I was a tad disappointed the software didn’t attempt a ludicrous translation of my name, bringing to mind that hilarious Friends episode where Joey comically misuses the thesaurus on every word of Chandler and Monica’s adoption recommendation letter…including his own signature: “Baby Kangaroo Tribbiani”.
A few further highlights from a comparison of the two book reviews are: my designation as a specialist in ‘human-laptop interplay’ and not ‘human-computer interaction’; a description of AI as ‘serving to us select which sequence to observe’ on Netflix rather than ‘helping us choose which series to watch’ on the streaming platform; and the argument that my book could have turned out complicated in ‘the incorrect arms’ instead of ‘the wrong hands’.
Oh, the hilarity!
I’m sure you’ll agree it hardly takes the skill of a truffle hog to sniff out tortured phrases. Nonetheless, we’ll likely find it harder to detect these semantic pests as content is increasingly paraphrased using sophisticated AI-powered language generators like OpenAI’s GPT-3, which use machine learning to produce convincing text. To see just what GPT-3 can do, take a look at this Guardian essay authored by the model to “convince us robots come in peace.”
If you have any funny or disturbing tortured phrase stories to share, give me a shout — or should I say, present me with a clamour. The more people know what to look out for, the better we can preserve the sanctity of research and creativity!