- cross-posted to:
- [email protected]
This is a list of writing and formatting conventions typical of AI chatbots such as ChatGPT, with real examples taken from Wikipedia articles and drafts. Its purpose is to act as a field guide in helping detect undisclosed AI-generated content.
The problem is that all of these are natural writing patterns that human beings sometimes fall into. Some people use em-dashes; some people like to write structured paragraphs with a proper introduction and conclusion. Some people like constantly bringing up ideas. There is no smoking gun of LLM generation that you can mathematically point to and say definitively whether or not something was LLM-generated. It’s all vibe checks from fallible human beings, particularly those who are high off their own linguistic pattern recognition abilities and think anyone who doesn’t write like they do, or who has weird writing patterns, is a potential LLM.
The other day I happened to write “its not x, but rather y” and someone accused me of writing my comments with AI on that basis. People are going to make themselves crazy suspecting everything is machine generated this way.
Every comment section for my youtubers of choice now has people claiming their voice is AI generated. Don’t like someone’s opinion? Accuse them of using machine generation and their argument is invalid. Think their cadence isn’t right? Voice model generated. It’s now somewhere between an insult and a society-wide insecurity complex.
Like you say, so many people are being driven crazy or trying to position ML use as a crime against humanity and a violation of the ‘sanctity of creativity’. These same people likely have middling conceptual ability, have almost no insight on advanced topics, and have actualized less generative potential than a basic neural network loaded onto a 10-year-old fucking graphics card.
It’s sad to say, but most people I’ve met simply aren’t running on all cylinders cognitively. They spend their time doomscrolling, socializing, and trying to get by while knowing the bare minimum about anything they don’t have to. But they still want the ego that comes from thinking of themselves as higher beings, smarter and more capable than computers in all aspects. We’re gonna have to get used to the idea that human intellect isn’t the super special secret sauce of productivity or creation anymore. It’s something society is just gonna have to cope with and find ways to deal with as ML use becomes more ubiquitous and ML cooks up even more advanced structures for running neural networks.
For now I appreciate the novelty of human-generated content with the “No AI was used to generate this content” label, but people can simply lie. It would be nice to be able to verify human creation somehow, but that leads to the alternative of asking big daddy gubberment to ban AI or “prove human validation”, and you know what that means. More restrictions and fewer freedoms for everyone.
As a long time fan of the em dash, it is truly a tragedy that using it is associated with AI. I was heavily using it before LLMs were a big thing. It allows spacing in contexts where commas and others just won’t let you. Am I to just incorrectly use a hyphen instead? Horrible 0/10
AI chatbots use the em dash (—) more frequently than most editors do, especially in places where human authors are much more likely to use parentheses or commas.
I think that’s a reasonable take. But how often you use this type of dash surely depends on your personal style.
For me, treating every text that contains em dashes as AI-generated is clearly unreasonable, but using it as one of many indicators is perfectly fine.
I was overusing them before too :(
Hypothetically, I have had to be told by someone proofreading that three em dashes is too many in one sentence. It’s just such a good punctuation mark. Alas.
I also use way too many em dashes (usually as double hyphens), but I also overuse parentheses and commas and just overly long sentence structures. I would like to think that my style remains pretty distinct from LLM output style at the moment.
The thing that really worries me is that as they stop using weird identifiable quirks like em dashes and emoji, it could be that the identifiable trait that remains is eerily consistent grammar. It used to be that people unconsciously treated extremely grammatical text as authoritative, regardless of its actual merit; as such, teachers spent literal decades drilling into me the habit of avoiding grammatical errors. Now that could end up instead just making folks think I’m a robot, and thus to be ignored.
I guess the actual robots will probably talk less about their neurotic concerns, though, so I’ve got that going for me, which is nice.
Speaking as a Brit: using a capital letter after a colon or a semi-colon just looks weird to me. I’m continuing a thought, not starting another mid-sentence. Using an em-dash - or even just a hyphen, I think it’s an acceptable alternative when you’ve not got adequate input available - lets me show a slight change of thought mid-sentence in a trans-Atlantic way.
Also, fuck AI.
Funny. I’m German, and in German it’s actually a rule that the word after the “:” must be capitalized. I always have to go back through my English writing and un-capitalize those words because I just can’t get used to not doing it.
Oh, interesting. A couple hundred years ago, it used to be the done thing in written English to capitalise every noun in a sentence, German-style. The Yanks have “in Order to form a more perfect Union, establish Justice, insure domestic Tranquility”, for example. We’ve mostly stopped doing that now. There were a lot of German immigrants to the early US; whether they’ve taken your influence on colons, or whether it’s just pre-standardisation English and it needed to be one way or another…
We’d consider excessive capitalisation, or worse, running all-caps, to be the sign of a diseased mind, now. Not naming any names.
I just use a double hyphen and assume everyone knows I was too lazy to type the actual em dash
Microsoft Word, LibreOffice, etc. will convert a double hyphen into an em dash. That’s how I’ve always typed mine out within papers.
I type the full phrase “em dash” so everyone knows I have ample spare time and am choosing to spend it painstakingly handcrafting my work.
Note that not all text featuring the following indicators is AI-generated; large language models (LLMs), which power AI-chatbots, have been trained on human writing, and some people may share a similar writing style.
We’re all suffering from this, but it’s better to cut off our tails than our heads.
I hate that salutations and valedictions are considered AI writing now. Like, I get it, objectively AI uses them more. But goddamn, I’m just trying to be polite!
This is probably one of the best and most detailed descriptions of how to spot AI writing. Some of these points are well known (the em dash, the frequent use of bold highlights), and others are very specific to Wikipedia (e.g. inconsistent markdown).
You’ll take the em-dash from my cold, dead hands!
Don’t fret, it’s trivial for them to run the output through something that strips the signs out.
Either people posting AI trash will have to do a tiny bit more work to evade detection (your dashes no longer ostracized), or the AI companies will just keep stripping AI telltales out of the output at the source to make life easier for their customers (same effect).
Whatever happens, the slop must flow.
At the end of the day, the reason LLMs write like this is because people write like this. Good luck.
Many of the LLM issues mentioned in this article are also just writing that doesn’t fit Wikipedia’s style, e.g. emotions, claims without citations.
Lmao. You’ve not read many change notes, have you?
Godspeed, Wikipedia editors. That has to be such a pain.
I have this article bookmarked. Very informative.
As someone who absolutely sucks at recognizing AI writing, I am definitely gonna give this a good read.
People generally write in order to transfer information: perhaps a story, something that’s new to you, or an argument to change your mind. AI produces text in order to fill space. Empty paragraphs devoid of information but which continue anyway are the giveaway; the Wiki article has some more telltales. You might have some false positives on people who write like machines, but you’re not losing anything by ignoring their writing.
AI writing also tends to be free from fucks, spelling mistakes and odd grammar, since it’s humans that throw those into their prose.
Ah good to know thank you
Nice, thanks.