A High Priority for Moving Away from Lemmy

Chris Remington@beehaw.org · 2 years ago

Gaywallet (they/it)@beehaw.org · 2 years ago

A few observations/thoughts.

There’s an awful lot of posts basically saying “this is a part of the job of moderation” and I don’t think that’s a particularly empathetic or useful observation. I’ve been on the internet and moderating for long enough to have been exposed to a lot of this, but this is not an inevitability. It’s an outcome of the system we’ve designed, of regulation and law that we have, and of not prioritizing this as a problem strongly enough. Being dismissive of an emotional experience and trauma isn’t particularly helpful.
I’m not technical enough to explain this, but there are technical and legal issues with CSAM and the lemmy platform that we’ve run into. For one, there are no automated scanning tools for this kind of content. My understanding is that even implementing or creating said tools would be difficult because of the way pict-rs and rust are storing images in the first place. You cannot turn off image federation, at all. At best, you can clear the content, but doing so may violate CSAM laws depending on the country and reporting requirements. Someone on the technical side can explain better than I can.
This isn’t a thread to discuss who’s to blame for CSAM. Please cease all discussions fighting about religion in the comments. I will be removing these comments.

Intelligence_Gap@beehaw.org · 2 years ago

I’m not sure that’s possible with images being allowed. If Google, Facebook, Instagram, and YouTube all struggle with it I think it will be an issue anywhere images are allowed. Maybe there’s an opening for an AI to handle the task these days but any dataset for something like that could obviously be incredibly problematic

thanevim@kbin.social · 2 years ago

Yeah, the key problem here is that any open forum, of any considerable popularity, since the dawn of the Internet has had to deal with shit like CSAM. You don’t see it elsewhere because of moderators. Doing the very job Op does. It’s just now, Op, you’re in the position. Some people can, and have decided to, deal with moderating the horrors. It may very well not be something you, Op, can do.

d3Xt3r@beehaw.org · edit-2 2 years ago

The thing is though, with traditional forums you get a LOT of controls for filtering out the kind of users who post such content. For instance, most forums won’t even let you post until you complete an interactive tutorial first (reading the rules and replying to a bot indicating you’ve understood them etc).

And then, you can have various levels of restrictions, eg, someone with less than 100 posts, or an account less than a month old may not be able to post any links or images etc. Also, you can have a trust system on some forums, where a mod can mark your account as trusted or verified, granting you further rights. You can even make it so that a manual moderator approval is required, before image posting rights are granted. In this instance, a mod would review your posting history and ensure that your posts genuinely contributed to the community and you’re unlikely to be a troll/karma farmer account etc.
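A minimal sketch of the kind of gating described above, assuming the example thresholds from this comment (100 posts, one month) and a moderator-set trust flag. The `Account` class, the constants, and the rule that a trusted account bypasses the thresholds are all hypothetical choices for illustration, not how any particular forum software works.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical thresholds, taken from the examples in the comment above.
MIN_POSTS = 100
MIN_ACCOUNT_AGE = timedelta(days=30)


@dataclass
class Account:
    created_at: datetime
    post_count: int
    trusted: bool = False  # assumed to be set manually by a moderator


def can_post_images(account: Account, now: datetime) -> bool:
    """Return True if the account may post links or images."""
    # Assumption: a moderator-granted trust flag overrides the thresholds.
    if account.trusted:
        return True
    old_enough = now - account.created_at >= MIN_ACCOUNT_AGE
    active_enough = account.post_count >= MIN_POSTS
    # Both conditions must hold; failing either one blocks image posting.
    return old_enough and active_enough
```

A stricter variant would drop the thresholds as a sufficient condition and require the `trusted` flag in every case, which matches the manual-approval setup mentioned in the same comment.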

So, short of accounts getting compromised/hacked, it’s very difficult to have this sort of stuff happen on a traditional forum.

I used to be a mod on a couple of popular forums back in the day, and I even ran my own community for a few years (using Invision Power Board), and never once have I had to deal with such content.

The fact is Lemmy is woefully inadequate in its current state to deal with such content, and there are definitely better options out there. My heart goes out to @Chris and the staff for having to deal with this stuff, and I really hope that this drives the Beehaw team to move away from Lemmy ASAP.

In the meantime, I reckon some drastic actions would need to be taken, such as disabling new user registrations and stopping all federation completely, until the new community is ready.

Thevenin@beehaw.org · 2 years ago

So this just got posted on lemmy.dbzer0. They’ve got an AI-based CSAM screen up and running with promising initial results. The model was trained using CLIP, which as far as I understand it means they used written descriptions of what CSAM is or is not.

Could something like this work for Beehaw?

Intelligence_Gap@beehaw.org · 2 years ago

I’m sure the mods saw that, and it’s really more of a question for them tbh, but if it works for other Lemmy instances I’m not sure why it wouldn’t work here.

apis@beehaw.org · 2 years ago

Wonder whether in theory one could use a dataset of… everything else, have the AI exclude what it does not recognise, then run the exclusions against a dataset to see whether or not they contain children. There could be an additional layer of running the exclusions against a dataset of regular sexual content.

One issue is that admin of any site would still want to report any CSAM to authorities. That could be automated by an AI checker, but one would have to have a lot of faith that the AI was decently accurate and not generating many false reports. The workaround I described to avoid using datasets of abuse is unlikely to be particularly accurate - ok for the purposes of protecting admin, but leaves them in an odd spot when it comes to banning a user, especially where a user’s livelihood could be impacted, or things like paid online courses. I guess specialist police departments probably would have to use highly relevant datasets, along with review by humans, but still - nobody wants to inadvertently clog up that system with false reports.

bermuda@beehaw.org · 2 years ago

I’d be fine with not hosting images entirely. I don’t think people come to beehaw primarily to look at pictures

Chobbes@beehaw.org · 2 years ago

I’ve been thinking lately that I kind of miss things like IRC where you couldn’t really post pictures in chat. With things like Discord and Slack the off topic channels often devolve into people just sharing random memes they found funny at the time, and not really talking to each other. I’m sure there’s value in that too, but I think it can take up a lot of oxygen in the social space, so I’m not sure it’s always a win. Different formats encourage different ways of interacting with each other, I guess, and it’s interesting!

liv@beehaw.org · edit-2 2 years ago

I just want to say, I am so so so sorry you had to see that.

I accidentally saw some CSAM in the 1990s and you are right, it is burnt into your mind. It’s the real limit case of “what has been seen cannot be unseen” - all I could do was learn to avoid accessing those memories.

If you can access counselling for this, that might be a good option. Vicarious trauma is a real phenomenon.

Chris Remington@beehaw.org · 2 years ago

If you can access counselling for this, that might be a good option. Vicarious trauma is a real phenomenon.

Thank you for the advice. I’m not sure that I’ll need counseling but I’m open to it if need be. Time will tell.

loops@beehaw.org · 2 years ago

Be sure to keep tabs on yourself, sometimes these things can really sneak up on you.

🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 🏆@yiffit.net · edit-2 2 years ago

Sadly, the only 100% way to never have that kind of material ever touch your servers is to not allow image uploads from the public. Whether it’s on Lemmy or another social site, or something you control entirely on your own. Maybe sooner than we think, AI could deal with the moderation of it so a human never has to witness that filth, but it’s not quite there yet.

PreparaTusNalgasPorque@kbin.social · 2 years ago

I’m sure those repugnant assholes do it “for the lulz” and if they want to mess with you they’ll do it anywhere.

There’s this study that says playing Tetris helps ease recently acquired trauma https://www.ox.ac.uk/news/2017-03-28-tetris-used-prevent-post-traumatic-stress-symptoms

And the admin from his eponymous instance dbzer0 created an interesting script to get rid of CSAM without having to review it manually, take a look -> https://github.com/db0/lemmy-safety

leopardpuncher@beehaw.org · edit-2 2 years ago

Just tagging @admin in case they don’t see this ❤️

Edit: aaand I did it wrong 🙄 @[email protected] 👈 Better?

AndreTelevise@beehaw.org · edit-2 2 years ago

Lemm.ee, another instance I am in, isn’t hosting images anymore or letting people upload images directly due to this issue. When your platform is supposed to be 100% open source and decentralized, there are bound to be issues like this, and they should be dealt with, even if proprietary tech is necessary for it. I’m sorry to hear about this.

apis@beehaw.org · 2 years ago

So, so sorry you had to see that, and thank you for protecting the rest of us from seeing it.

On traditional forums, you’d have a lot of control over the posting of images.

If you don’t wish to block images entirely, you could block new members from uploading images, or even from sharing links. You could set things up so they’d have to earn the right to post by being active for a randomised amount of time, and have made a randomised number of posts/comments. You could add manual review to that, so that once a member has ostensibly been around long enough and participated enough, admin look at their activity pattern as well as their words to assess if they should be taken off probation or not… Members who have been inactive for a while could have image posting abilities revoked and be put through a similar probation if they return. You could totally block all members from sharing images & links via DM, and admin email accounts could be set to reject images.

It is probably possible to obtain the means to reject images which could contain any sexual content (checked against a database of sexual material which does not involve minors), and you could probably also reject images which could contain children and which might not be wholesome (checked against a database of normal images of children).

Aside from the topic in hand, a forum might decide to block all images of children, because children aren’t really in a position to consent to their images being shared online. That gets tricky when it comes to late teens & early 20s, but if you’ve successfully filtered out infants, young children, pre-teens & early teens as well as all sexual content, it is very unlikely that images of teenagers being abused would get through.

Insisting that images are not uploaded directly, but via links to image hosting sites, might give admin an extra layer of protection, as the hosting sites have their own anti-CSAM mechanisms. You’d probably want to whitelist permitted sites. You might also want a slight delay between the posting of an image link and the image appearing on Beehaw - this would allow time for the image hosting site to find & remove any problem images before they could appear on Beehaw (though I’d imagine these things are pretty damn fast by now).
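The allowlist-plus-delay idea above could look roughly like this. It is only a sketch: the hostnames are placeholders, the ten-minute hold is an arbitrary assumption, and the function names are made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; these hostnames are placeholders, not recommendations.
ALLOWED_IMAGE_HOSTS = {"images.example.org", "img.example.net"}
# Assumed hold period (in seconds) before a linked image is displayed.
HOLD_SECONDS = 600


def is_allowed_image_link(url: str) -> bool:
    """Accept only HTTPS links whose host is on the allowlist."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "https" and host in ALLOWED_IMAGE_HOSTS


def is_visible(url: str, posted_at: float, now: float) -> bool:
    """Show the image only once the hold period has elapsed."""
    return is_allowed_image_link(url) and now - posted_at >= HOLD_SECONDS
```

The hold period only helps if the hosting site actually removes problem images within that window, so the value would need tuning against how fast the allowed hosts react.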

You could also insist that members who wish to post images or links to images can only do so if they have their VPN and other privacy preserving methods disabled. Most members wouldn’t be super-enthused about this, until they’ve developed trust in the admin of the site, but anyone hoping to share images of children being abused or other illegal content will just go elsewhere.

Admin would probably need to be able to receive images of screenshots from members trying to report technical issues, but those should be relatively easy to whitelist with a bot of some sort? Or maybe there’s some nifty plugin for this?

Really though, blocking all images is going to be your best bet. I like the idea of just having the Beehaw bee drawings. You could possibly let us have access to a selection of avatars to pick, or have a little draw plugin so members can draw their own. On that note, those collaborative drawing plugin things can be a fun addition to a site… If someone is very keen for others to see a particular image, they can explain how to find it, or they can organise to connect with each other off Beehaw.

Storksforlegs@beehaw.org · edit-2 2 years ago

I second everything you said here

jarfil@beehaw.org · 2 years ago

block new members from uploading images

I’ve tried those methods something like 10 years ago. It didn’t work; people would pose as decent users, then suddenly switch to posting shit when allowed. I’m thinking nowadays, with the use of ChatGPT and similar, those methods would fail even more.

Modern filtering methods for images may be fine(-ish), but won’t stop NSFL and text based stuff.

Blocking VPN access, to a site intended as a safe space, seems contradictory.

anyone hoping to share […] illegal content will just go elsewhere

Like someone else’s free WiFi. Wardriving is still a thing.

draw plugin so members can draw their own

That can be easily abused, either manually or through a bot. Reddit has the right idea there, where they have an avatar generator with pre-approved elements. Too bad they’re pretty stifling (and sell the interesting ones as NFTs).

pemmykins@beehaw.org · 2 years ago

I mentioned this in discord a while back, but there are image-matching databases for known instances of CSAM that you can apply for access to, as an admin of a forum or social media site. If you had access, you could scan each image uploaded or linked to in a post or comment, and compare to the database for matches. I think that mastodon is adding some hooks for this kind of checking during the upload phase, but I’m not sure what the status is with Lemmy.
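As a rough illustration of the scan-and-compare step described above, the sketch below checks an uploaded file against a set of known hashes. It assumes the hash list has already been obtained and loaded as hex strings, and it uses an exact SHA-256 match as a stand-in. Real matching services typically rely on perceptual hashes that survive resizing and re-encoding, which a cryptographic hash does not. The function names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def is_known_match(image_bytes: bytes, known_hashes: set[str]) -> bool:
    """Return True if the upload's hash appears in the known-hash set.

    Stand-in only: an exact hash catches byte-identical files and
    nothing else.
    """
    return sha256_hex(image_bytes) in known_hashes
```

A check like this would run during the upload phase, before the file is stored or federated, so that a match can be rejected and reported without a human having to view it.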

I’m happy to help facilitate a solution like this, as it’s something I also care about. Feel free to find me on discord if you want to talk.

Also, as others have said - I’m sorry you had to go through that. The same thing happened to me many years ago and it definitely affected me for a long time.

Monkey With A Shell@lemmy.socdojo.com · 2 years ago

There are some automated options out there at the frontend already. One that looks simple if not absolute is putting the site through cloudflare with their csam engine. It’ll even do some of the dirty work reporting to the appropriate agency and putting a legal bar up on the link until someone can delete it.

newtraditionalists@beehaw.org · 2 years ago

Beehaw is such a special effort. I am so regretful that people have to be subjected to the darkest parts of humanity in order to protect the beehaw project. I don’t need images. If that is the necessary course of action, then so be it.

Most importantly, I am so sorry to you as one human to another. I’m sorry you saw that. I’m sorry humans are hurting each other like that. And I’m sorry that your good faith efforts have been taken advantage of.

PenguinCoder@beehaw.org · 2 years ago

Does that mean a platform that does not allow any images to be uploaded? Or a platform that has better access control and remediation controls?

Chris Remington@beehaw.org · 2 years ago

I’d be willing to consider either and would love your, particular, feedback on this as well.

flatbield@beehaw.org · edit-2 2 years ago

By the way. I have always been surprised that Beehaw did host images. The extra cost (they are large and costly in both storage and bandwidth), added security and attack vector possibilities, IP issues, CSAM issues, etc.

flatbield@beehaw.org · 2 years ago

Also, I do not think this is a Lemmy specific issue. It is an image availability, and scale issue. Federation of course increases the scale a lot too.

Scary le Poo@beehaw.org · 2 years ago

Did you forget to log into your alts or are you unaware of how the edit button functions?

Storage is super cheap, fwiw.

flatbield@beehaw.org · 2 years ago

Now be nice. Of course I know about the edit button. The comments were not posted at the same time and generally later editing is discouraged. Nor are long comments or one comment on different topics great.

Why on earth would I have multiple accounts? I am sure people do, but that too is kind of strange behavior and perhaps abusive depending on how they are used.

flatbield@beehaw.org · 2 years ago

Not as cheap as you think at scale, and you’re renting the bandwidth and space from a hosting company while most of the users are probably freeloading. The whole challenge of FOSS and services is that there is no one to pay operating costs.

flatbield@beehaw.org · edit-2 2 years ago

I think if a platform has image capabilities this is to be expected. I guess the only exception is if there are filters that can be used, but this seems unlikely. So I think it is an image vs. no image decision. The other problem with images is they can be attack vectors from a security point of view. Any complex file format can be an attack vector as interpreters of complex file formats often have bugs.

Can you imagine that the large platforms have whole teams of people that have to look at this stuff all day and filter it out. Not sure how that works, but it is probably the reality. Notice R$ never hosted images.

forestG@beehaw.org · edit-2 2 years ago

I don’t think there is a way to have both the option to host images and have zero risk of getting such image uploads. You either completely disable image hosting, or you mitigate the risk by the way image uploads are handled. Even if you completely disable the image uploads, someone might still link to such content. The way I see this there are two different aspects. One is the legal danger you place yourself in when you open your instance to host images uploaded by users. The other is the obvious (and not so obvious) and undeniable harmful effects contact with such material has for most of us. The second is pretty much impossible to guarantee 100% on the internet. The first you can achieve by simply not allowing image uploads (and I guess de-federating with other instances to avoid content replication).

The thing is, hosting an instance of a technology that allows for better moderation (i.e. allowing certain kinds of content, such as images, only after a user reaches a certain threshold of activity) actually helps in a less obvious manner. CSAM is not only illegal to exist on the server-side. It’s also illegal and has serious consequences for the people who actually upload it. The more activity history you have on a potential uploader, the easier it becomes to actually track him. Requiring more time for an account before allowing it to post images makes concealing the identity harder and raises the potential risk for the uploader to the extent that it will be very difficult to go through the process only to cause problems to the community.

Let me also state this clearly: I don’t have an issue with disabling image uploads here, or changing the default setting of instance federation to a more limiting one. Or both. I don’t mind linked images to external sites.

I am sorry you had to see such content. No, it doesn’t seem to go away. At least it hasn’t for me, after almost 2 decades :-/

Kajo [he/him] 🌈@beehaw.org · edit-2 2 years ago

First of all, I’m so sorry that you have been exposed to such horrors. I hope you can handle that, or find help to.

I don’t have a solution, I’d just like to share some thoughts.

Some people suggested that AIs could detect this kind of content. I would be reluctant to use such tools, because lots of AI projects exploit unprotected workers in poor countries for data labeling.
A zero-image policy could be an effective solution, but it would badly impact @[email protected], @[email protected] and @[email protected].
Correct me if I’m wrong, but on the fediverse, when a picture is posted on an instance, it is duplicated on all federated instances? If I’m right, it means that even if beehaw found a way to totally avoid CSAM posting, you could still end up with duplicated CSAM on your server? (with consequences on your mental health, and possibly legal risks for owning such pictures)

jarfil@beehaw.org · 2 years ago

correct me if I’m wrong, but on the fediverse, when a picture is posted on an instance, it is duplicated on all federated instances?

Kind of. It duplicates on all instances that subscribe to the community where it was posted. Behind the scenes, Lemmy makes each community a “user” that boosts everything posted to that community. That content only gets pushed to instances where at least one user has subscribed to that community/“user”, then any included images get cached. So if nobody subscribes to a federated instance’s community, none of the content gets duplicated.
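The push rule described above can be modelled in a few lines. This is a toy model of the behaviour as explained in this comment, not Lemmy's actual code: the `subscriptions` mapping and the instance and community names are invented for illustration.

```python
def delivery_targets(community: str, subscriptions: dict[str, set[str]]) -> set[str]:
    """Return the instances that would receive posts from `community`.

    `subscriptions` maps an instance name to the set of remote
    communities that at least one of its users subscribes to.
    """
    return {
        instance
        for instance, communities in subscriptions.items()
        if community in communities
    }
```

In this model a single subscriber on an instance is enough to put that instance in the target set, which is why one lurker account can cause content (and its cached images) to be copied there.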

The biggest problem right now is users with “burner accounts” who exploit instances with free-for-all registrations, to push content to communities that have subscribers from as many different instances as possible… possibly “lurker” accounts created by the same attacker just to subscribe to the remote community they’re attacking and have the content show in the default “All” feed of all instances.

There are some possible countermeasures for that:

Defederate from any instance with “free for all” registrations
Remove “lurker” accounts who only subscribe to non-local communities, particularly if they’re the only subscriber for those communities
Limit the “All” feed, definitely DO NOT show it as the default for anonymous users (like on the web). Ideally, admins should be able to choose what to show in there, even from their own instance.
Run some image ID, AI, or other filtering on the content
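The second countermeasure in the list above could be approximated with a query like the one sketched here. It assumes subscriptions are available as a mapping from local username to community identifiers of the form `name@domain`; that data shape and the function name are hypothetical, and the output is meant as a list of candidates for a human admin to review, not an automatic removal list.

```python
from collections import Counter


def lurker_candidates(subscriptions: dict[str, set[str]], local_domain: str) -> list[str]:
    """Flag local accounts that subscribe only to remote communities and
    are the sole local subscriber of at least one of them."""
    # How many local accounts subscribe to each community.
    counts = Counter(c for comms in subscriptions.values() for c in comms)
    suffix = "@" + local_domain
    flagged = []
    for user, comms in subscriptions.items():
        if not comms:
            continue  # no subscriptions at all: nothing is being pulled in
        only_remote = all(not c.endswith(suffix) for c in comms)
        sole_subscriber = any(counts[c] == 1 for c in comms)
        if only_remote and sole_subscriber:
            flagged.append(user)
    return sorted(flagged)
```

Plenty of legitimate users follow only remote communities, so a real check would probably also look at account age and posting activity before anything is removed.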

Storksforlegs@beehaw.org · 2 years ago

As others have suggested, I think temporarily suspending images until you guys can settle on a safe alternative to lemmy is a good idea.

I’m sorry you had to see something like this, I hope you are able to seek out some counseling asap, talk to someone about it. Even something like https://www.7cups.com/ might be helpful.

nlm@beehaw.org · 2 years ago

Sorry to hear that mate! That’s one of the biggest reasons I’ve never wanted to move towards IT forensics even though I think I’d enjoy the actual work. But having to regularly sift through the absolute worst humanity has to offer sounds awful.

Hope the immediate pain of it settles as soon as possible!

This might not be what people want but since beehaw is going to leave Lemmy anyway, couldn’t you just completely defederate and run as an isolated instance? Then you’d have control of what gets published without having to deal with federated nastiness?