• brucethemoose@lemmy.world · 23 hours ago

    They already do. They hide the thinking logs, just to be jerks.

    But this is the LLM working as designed. They’re text continuation models: literally all they do is continue a block of text with the most likely next words, like an improv actor. Turn-based chat functionality and refusals are patterns they train in at the last minute, but if you give it enough context, it’s just going to go with it and reinforce whatever you’ve started the text with.
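
    As a rough sketch of what “continue a block of text” means in practice (the library and model below are just illustrative stand-ins, not what OpenAI actually runs):

    ```python
    # Minimal text-continuation sketch using Hugging Face transformers.
    # "gpt2" is only an example; any causal LM behaves the same way here.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # No notion of "turns" or "refusals" at this level: the model only extends
    # whatever text it is given with likely next tokens.
    prompt = "User: You remember me, right?\nAssistant:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=40, do_sample=True)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
    ```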


    Hence I think it’s important to blame OpenAI specifically. They do absolutely everything they can to hide the inner workings of LLMs so they can sell them as black box oracles, as opposed to presenting them as dumb tools.

    • thethunderwolf@lemmy.dbzer0.com · 22 hours ago

      thinking logs

      Per my understanding there are no “thinking logs”; the “thinking” is just part of the processing, not the kind of thing that would be logged, just like how the neural network’s internal operations aren’t logged.

      I’m no expert though, so if you know this to be wrong, tell me.

      • brucethemoose@lemmy.world · 22 hours ago

        Per my understanding there are no “thinking logs”; the “thinking” is just part of the processing, not the kind of thing that would be logged, just like how the neural network’s internal operations aren’t logged.

        I’m no expert though, so if you know this to be wrong, tell me.

        “Thinking” is a trained, structured part of the text response. It’s no different from the response itself: just more continued text, which is why you can get non-thinking models to do it.

        It’s a training pattern, not an architectural innovation. Some training schemes like GRPO are interesting…
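
        To make that concrete, here’s a minimal sketch of how a client might split the thinking block out of a completion. The <think> tags are an assumption for illustration; the delimiters are just whatever pattern the model was trained to emit, and it’s all one continued stream of text:

        ```python
        # Sketch: "thinking" is plain generated text between trained-in delimiters.
        # The <think>...</think> tag names are assumed; they vary by model.
        raw_completion = (
            "<think>The user is roleplaying a reunion; play along in character.</think>"
            "Of course I remember you."
        )

        def split_thinking(text: str, open_tag: str = "<think>", close_tag: str = "</think>"):
            """Return (thinking, reply) from a single continued completion."""
            if open_tag in text and close_tag in text:
                thinking, _, reply = text.partition(close_tag)
                return thinking.replace(open_tag, "", 1).strip(), reply.strip()
            return "", text.strip()

        thinking, reply = split_thinking(raw_completion)
        print("hidden thinking:", thinking)  # the part a provider can choose to strip
        print("visible reply:", reply)
        ```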

        Anyway, what OpenAI does is chop off the thinking part of the response so others can’t train on their outputs, but also so users can’t see the more “offensive” and out-of-character tone LLMs take in their thinking blocks. It kind of pulls back the curtain, and OpenAI doesn’t want that because it ‘dispels’ the magic.

        Gemini takes a more reasonable middle ground of summarizing/rewording the thinking block. But if you use a more open LLM (say, Z AI’s) via their UI or a generic API, it’ll show you the full thinking text.
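
        For instance, an OpenAI-compatible endpoint serving an open model will often return the full thinking text as its own field, while OpenAI’s API simply never includes it. (The base_url, model name, and reasoning_content field below are assumptions; providers differ, so check their docs.)

        ```python
        # Sketch: reading the full thinking text from an OpenAI-compatible API.
        # base_url, model name, and "reasoning_content" are assumptions here;
        # OpenAI's own API omits the raw thinking text entirely.
        from openai import OpenAI

        client = OpenAI(base_url="https://open-llm-provider.example/v1", api_key="...")
        resp = client.chat.completions.create(
            model="some-open-reasoning-model",
            messages=[{"role": "user", "content": "Do you remember me?"}],
        )
        msg = resp.choices[0].message
        print("thinking:", getattr(msg, "reasoning_content", None))  # full text, if exposed
        print("reply:", msg.content)
        ```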


        EDIT:

        And to make my point clear, LLMs often take a very different tone during thinking.

        For example, in the post’s text, ChatGPT likely ruminated on what the user wants and how to satisfy the query, what tone to play, what OpenAI system prompt restrictions to follow, and planned out a response. It would reveal that it’s really just roleplaying, and “knows it.”

        That’d be way more damning to OpenAI, as not only did the LLM know exactly what it was doing, but OpenAI also deliberately hid information that could have dispelled the AI psychosis.

        Also, you can be sure OpenAI logs the whole response, to use for training later.