(Multimodal) GPT ≠ “pure” LLM. GPT-4o uses an LLM for the language parts, as well as having voice processing and generation built-in, but it uses a technically distinct (though well-integrated) model called “GPT Image 1” for generating images.
You can’t really train or treat image generation with the same approach as natural language, given it isn’t natural language. A binary string doesn’t adhere to the same patterns as human speech.
LLM slop factories are overtly racist because they’re trained on shit lifted straight off the internet.
That’s image generation, not an LLM (language/text generation), but the point stands.
Hate to break it to you, but today’s image generation comes through LLMs.