Lemmyshitpost community closed until further notice

lwadmin@lemmy.world · edit-2 2 years ago

TsarVul@lemmy.world · 2 years ago

I guess it’d be a matter of incorporating something that hashes whatever it is that’s being uploaded. One takes that hash and checks it against a database of known CSAM. If match, stop upload, ban user and complain to closest officer of the law. Reddit uses PhotoDNA and CSAI-Match. This is not a simple task.
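As a rough illustration of the hash-and-lookup step described above, here is a minimal sketch. It assumes an exact-match SHA-256 blocklist, which is a simplification: PhotoDNA and CSAI Match are proprietary services that do fuzzy matching, and their APIs are not shown here. The names `KNOWN_BAD_HASHES` and `check_upload` are hypothetical.

```python
import hashlib

# Hypothetical blocklist of SHA-256 hex digests. A real deployment would get
# this from a vetted provider; the single entry here is a placeholder.
KNOWN_BAD_HASHES = {hashlib.sha256(b"example-bad-file").hexdigest()}


def check_upload(data: bytes, blocklist: set[str] = KNOWN_BAD_HASHES) -> bool:
    """Return True if the upload may proceed, False if it matches the blocklist."""
    digest = hashlib.sha256(data).hexdigest()
    return digest not in blocklist


if __name__ == "__main__":
    print(check_upload(b"harmless cat picture"))  # not on the list
    print(check_upload(b"example-bad-file"))      # on the list
```

The ban and the report to authorities would hang off the `False` branch; they are left out because they depend on the instance and the jurisdiction.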

diffuselight@lemmy.world · 2 years ago

None of that really works anymore in the age of AI inpainting. Hashes and perceptual hashes worked well before, but the people doing this are specifically interested in causing destruction and chaos with this content. They don’t need it to be authentic to do that.

It’s a problem that requires AI on the defensive side, but even that is just going to be an eternal arms race. This problem cannot be solved with technology, only mitigated.

The ability to exchange hashes on moderation actions against content may offer a way out, but it will change the decentralized nature of everything, basically bringing us back to the early days of Usenet, the Usenet Death Penalty, etc.

dragontamer@lemmy.world · edit-2 2 years ago

Not true.

A simple CAPTCHA got rid of a huge set of idiotic script-kiddies. Uploading CSAM, being what it is, could (and should) result in an immediate IP ban. So if you’re “dumb” enough to try to upload an image with a well-known CSAM hash, then you absolutely deserve the harshest immediate ban automatically.

You’re pretty much like the story of the economist who refuses to believe that $20 exists on a sidewalk. “Oh, but if that $20 really existed on the sidewalk there, then it would have been arbitraged away already”. Well guess what? Human nature ain’t economic theory. Human nature ain’t cybersecurity.

Idiots will do dumb, easy attacks because they’re dumb and easy. We need to defend against the dumb-and-easy attacks, before spending more time working on the harder, rarer attacks.

AustralianSimon@lemmy.world · edit-2 2 years ago

You don’t get their IP when they post from other instances. I’m surprised this hasn’t resulted in defed.

anlumo@lemmy.world · 2 years ago

Well, my home instance has defederated from lemmy.world due to this, that’s why I had to create a local account here.

AustralianSimon@lemmy.world · edit-2 2 years ago

I mean defedding the instances the CSAM is coming from but also yes.

rolaulten@startrek.website · 2 years ago

I’m sorry, but you don’t want to use permanent IP bans. Most residential connections get their addresses via DHCP, meaning banning via IP only has a short-term positive effect.

That said, automatic scanning for known hashes, and automatic reporting to the relevant authorities with the relevant details, should be doable (provided there is a database somewhere - I honestly have never looked).

Touching_Grass@lemmy.world · 2 years ago

Couldn’t one small change in the picture change the whole hash?

TsarVul@lemmy.world · 2 years ago

Good question. Yes, for an ordinary file hash. Also artefacts from compression can fuck it up. However, comparison of perceptual hashes returns a percentage of match. If the match is good enough, it is CSAM. Go ahead, ban. There is a bigger issue however for developers of Lemmy, I assume. It is a philosophical mess. It is that if we elect to use PhotoDNA and CSAI Match, Lemmy is now at the whims of Microsoft and Google respectively.

Lmaydev@programming.dev · 2 years ago

Honestly I’d rather that than see shit like this any day.

Serinus@lemmy.world · edit-2 2 years ago

The bigger thing is that hash detection tools don’t want to give access to just anyone, and just anyone can run a Lemmy instance. The concern is that you’re effectively giving the CSAM people a way to know if they’ll be detected.

Perhaps they can allow some of the biggest Lemmy instances to use the tech, but I wouldn’t expect it to be available to everyone.

shagie@programming.dev · 2 years ago

Facebook and Reddit don’t have local CSAM detection but rather use Google’s APIs.

This isn’t something that any average user can get access to. Even the largest Lemmy instances are small compared to Reddit and Facebook… and they don’t have local testing either.

Part of this is also that these services aren’t just detecting and blocking; they also do automated reporting.

Furthermore, Lemmy is AGPL, so providing a Lemmy instance with a detection implementation would run the risk that the implementation couldn’t remain closed source (keeping it closed would be an AGPL license violation).

what_is_a_name@lemmy.world · 2 years ago

Mod tools are not Lemmy. Give admins and mods an option. Even a paid one. Hell, admins of Lemmy.world could have us donate extra to cover the costs of API services.

TsarVul@lemmy.world · 2 years ago

I agree. Perhaps what Lemmy developers can do is they can put slot for generic middleware before whatever the POST request is in Lemmy API for uploading content? This way, owner of instance can choose to put whatever middleware for CSAM they want. This way, we are not dependent on developers of Lemmy for solution to pedo problem.
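A sketch of what such a slot could look like, under loud assumptions: this is not Lemmy's actual API (Lemmy itself is written in Rust), and `UploadPipeline`, `register`, `run`, and `reject_empty` are hypothetical names. The idea is only that the instance owner registers whatever checks they want and the upload handler runs them before accepting content.

```python
from typing import Callable, List, Optional

# A check takes the raw upload bytes and returns a rejection reason,
# or None to let the upload continue to the next check.
UploadCheck = Callable[[bytes], Optional[str]]


class UploadPipeline:
    """Hypothetical pre-upload hook chain configured by the instance owner."""

    def __init__(self) -> None:
        self._checks: List[UploadCheck] = []

    def register(self, check: UploadCheck) -> None:
        self._checks.append(check)

    def run(self, data: bytes) -> Optional[str]:
        # Stop at the first check that rejects the upload.
        for check in self._checks:
            reason = check(data)
            if reason is not None:
                return reason
        return None


def reject_empty(data: bytes) -> Optional[str]:
    """Trivial placeholder check; a real one would call a scanning service."""
    return "empty upload" if not data else None


if __name__ == "__main__":
    pipeline = UploadPipeline()
    pipeline.register(reject_empty)
    print(pipeline.run(b""))       # rejected with a reason
    print(pipeline.run(b"bytes"))  # None, so the upload proceeds
```

Keeping the checks outside the core means a closed-source or paid scanner could be plugged in by one instance without every other instance depending on it.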

Nollij@sopuli.xyz · 2 years ago

If they hash the file binary data, like CRC32 or SHA, yes. But there are other hash types out there, which are more like “fingerprints” of an image. Think of how Shazam or Sound Hound can recognize a song playing, despite the extra wind, static, etc that’s present. There are similar algorithms for images/videos.
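To make the "fingerprint" idea concrete, here is a minimal sketch of an average hash, one of the simplest perceptual hashes. It is not the algorithm used by PhotoDNA or any other named service. It assumes the image has already been shrunk to a tiny grayscale grid (real tools do that downscaling first), and `average_hash` and `similarity` are illustrative names.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def similarity(a: int, b: int, nbits: int) -> float:
    """Percentage of matching bits between two hashes that are nbits long."""
    differing = bin(a ^ b).count("1")  # Hamming distance
    return 100.0 * (nbits - differing) / nbits


# A 4x4 grayscale "image" (checkerboard) standing in for a downscaled picture.
original = [[10, 200, 10, 200],
            [200, 10, 200, 10],
            [10, 200, 10, 200],
            [200, 10, 200, 10]]
# The same picture with mild noise, like compression artefacts might add.
noisy = [[p + 5 if p < 128 else p - 7 for p in row] for row in original]

if __name__ == "__main__":
    print(similarity(average_hash(original), average_hash(noisy), 16))
```

Because each bit only records "brighter or darker than average", small pixel changes leave most bits alone, which is what lets a match be expressed as a percentage instead of yes/no.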

No idea how difficult those are to implement, though.

Railcar8095@lemm.ee · 2 years ago

There are FOSS applications that can do that (czkawka for example). What I’m not sure of is whether the specific algorithm used is available and, more importantly, whether the CSAM hashes are available to general audiences. I would assume that if they are, any attacker could check first and make just enough changes to avoid a match.

Alien Nathan Edward@lemm.ee · 2 years ago

One bit, in fact. Luckily there are other ways of comparing images without actually showing them to human eyes that allow you to calculate a percentage of similarity.
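A quick way to see the one-bit effect for yourself, using SHA-256 from Python's standard library (the input bytes are an arbitrary placeholder, not a real image):

```python
import hashlib

data = bytearray(b"the same picture bytes")
flipped = bytearray(data)
flipped[0] ^= 0x01  # flip the lowest bit of the first byte

h1 = hashlib.sha256(data).hexdigest()
h2 = hashlib.sha256(flipped).hexdigest()

# Count how many of the 64 hex characters changed between the two digests.
differing_hex_chars = sum(c1 != c2 for c1, c2 in zip(h1, h2))

if __name__ == "__main__":
    print(h1)
    print(h2)
    print(differing_hex_chars, "of", len(h1), "hex characters differ")
```

The two digests share no useful structure, so an exact-match lookup misses the altered file entirely; that is the gap similarity-based comparison is meant to close.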