Meta says lawsuit claiming it pirated porn to train AI makes no sense.
Ethics of training AI models on people’s work and likeness aside, the fact that the current models refuse to engage with NSFW text and images, and were never trained on them, is a massive risk.
We know they are used for content moderation, but even abliterated versions of Qwen3-VL are not able to accurately describe anything involving sexual acts. Instead they go “an intimate photograph with a brick wall background and natural lighting”.
It’s a huge hole in the models, and it’s going to lead to trouble one way or the other.
Interesting, I thought they were just censored? So the AI porn is made with other models? I thought they were all basically based on the big ones (by Google, Meta, "Open"AI etc.) but were somehow “hacked” / “uncensored” to allow adult content generation.
Well, sort of. There is a difference between models that eat text and output images (diffusion models like DALL-E and Stable Diffusion) and models that eat images and text and output text (vision LLMs like Qwen3-VL), but the way both kinds know what things look like comes from contrastive learning, going back to an older model called CLIP and its descendants.
Basically you feed a model both images and descriptions of those images and train it to produce the same output vectors in both cases. Essentially it learns what a car looks like, and what an image of a car is called in whatever languages it’s trained on. If you only train a model on “acceptable” image/description pairs, it literally never learns the words for “unacceptable” things and acts.
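To make that concrete, here is a toy sketch of the contrastive objective. The encoders and input shapes are made-up stand-ins (a real setup uses a vision transformer and a text transformer over real image/caption pairs), but the symmetric loss is the one CLIP actually trains with:

```python
# Toy CLIP-style contrastive step: matching image/caption pairs are pulled
# together in embedding space, everything else in the batch is pushed apart.
import torch
import torch.nn.functional as F

embed_dim = 64

# Stand-in encoders; real CLIP uses a ViT and a text transformer.
image_encoder = torch.nn.Linear(3 * 32 * 32, embed_dim)  # flattened 32x32 RGB "images"
text_encoder = torch.nn.Linear(128, embed_dim)            # toy "caption" features

def clip_loss(images, captions, temperature=0.07):
    # Project both modalities into the same space and L2-normalize.
    img_emb = F.normalize(image_encoder(images), dim=-1)
    txt_emb = F.normalize(text_encoder(captions), dim=-1)

    # Cosine-similarity matrix: entry (i, j) compares image i with caption j.
    logits = img_emb @ txt_emb.T / temperature

    # Matching pairs sit on the diagonal; the rest of the batch are negatives.
    targets = torch.arange(len(images))
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

# One toy step with 8 random image/caption pairs.
images = torch.randn(8, 3 * 32 * 32)
captions = torch.randn(8, 128)
loss = clip_loss(images, captions)
loss.backward()
print(loss.item())
```

The point for this thread: if a concept never appears in those image/caption pairs, there is simply nothing in the embedding space that ties the word to the visual content.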
Diffusion models are often fine-tuned on specific types of porn (either full-parameter or QLoRA), often to great effect. The same is much more work for LLMs though. Even if you remove the censorship (e.g. through abliteration, modifying the weights to suppress outright refusals), the model that’s left still doesn’t know the words it would need to express the concepts in the images.
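For anyone curious what abliteration actually does, here is a minimal sketch of the linear algebra with made-up toy tensors (real implementations hook actual transformer layers and collect activations from real refused vs. answered prompts):

```python
# Rough idea behind abliteration: find the "refusal direction" in activation
# space and remove the model's ability to write along it.
import torch

hidden_dim = 512

# Stand-in residual-stream activations at one layer:
# prompts the model refuses vs. prompts it answers normally.
refused_acts = torch.randn(100, hidden_dim) + 2.0  # toy data, offset to create a direction
normal_acts = torch.randn(100, hidden_dim)

# The refusal direction is the normalized difference of mean activations.
refusal_dir = refused_acts.mean(0) - normal_acts.mean(0)
refusal_dir = refusal_dir / refusal_dir.norm()

# Project that direction out of a weight matrix, so the layer can no longer
# write any output component along the refusal direction.
W = torch.randn(hidden_dim, hidden_dim)  # stand-in for e.g. an MLP output projection
W_abliterated = W - torch.outer(refusal_dir, refusal_dir @ W)

# Sanity check: outputs of the modified weights have ~zero refusal component.
x = torch.randn(hidden_dim)
print((refusal_dir @ (W_abliterated @ x)).abs().item())  # ~0
```

Note this only removes the refusal behavior; it doesn’t add any knowledge the model never learned, which is exactly why abliterated vision models still can’t describe what they’re looking at.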
Ahhhh, ok. Thanks for the detailed explanation, really appreciate it!




