

Hah! Well found. I do recall hearing about another simulated vendor experiment (that also failed) but not actual dog-fooding. Looks like the big upgrade the wsj reported on was the secondary “seymour cash” 🙄 chatbot bolted on the side… the main chatbot was still claude v3.7, but maybe they’d prompted it harder and called that an upgrade.
I wonder if anthropic trialled that in house, and none of them were smart enough to break it, and that’s what lead to the external trial.

Sunday afternoon slack period entertainment: image generation prompt “engineers” getting all wound up about people stealing their prompts and styles and passing off hard work as their own. Who would do such a thing?
https://bsky.app/profile/arif.bsky.social/post/3mahhivnmnk23
Ahh, sweet schadenfreude.
I wonder if they’ve considered that it might actually be possible to get a reasonable imitation of their original prompt by using an llm to describe the generated image, and just tack on “more photorealistic, bigger boobies” to win at imagine generation.