• Devial@discuss.online · 1 hour ago

    The article headline is wildly misleading, bordering on being a straight-up lie.

    Google didn’t ban the developer for reporting the material; they didn’t even know he had reported it, because he did so anonymously, and to a child protection org, not to Google.

    Google’s automated tools correctly flagged the CSAM when he unzipped the data, and his account was subsequently nuked.
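
    (For context: this kind of automated flagging is generally done by hashing uploaded files and matching them against databases of known abuse imagery, PhotoDNA-style hash lists and similar. The sketch below is only a simplified illustration of that idea, not Google’s actual pipeline; the hash-list entry and folder name are placeholders.)

    ```python
    # Simplified sketch of hash-list scanning. Real systems use perceptual
    # hashes (e.g. PhotoDNA) so re-encoded or resized copies still match;
    # plain SHA-256 only catches byte-identical files.
    import hashlib
    from pathlib import Path

    KNOWN_BAD_HASHES = {
        "0" * 64,  # placeholder entry, not a real hash
    }

    def sha256_of(path: Path) -> str:
        """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def scan_folder(folder: Path) -> list[Path]:
        """Return every file under `folder` whose hash is on the known-bad list."""
        return [p for p in folder.rglob("*")
                if p.is_file() and sha256_of(p) in KNOWN_BAD_HASHES]

    if __name__ == "__main__":
        for hit in scan_folder(Path("unzipped_dataset")):  # hypothetical folder name
            print(f"flagged: {hit}")
    ```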

    Google’s only failure here was not unbanning him on his first or second appeal. And whilst that is absolutely a big failure on Google’s part, I find it very understandable that the appeals team, generally speaking, won’t accept “I didn’t know the folder I uploaded contained CSAM” as a valid ban appeal reason.

    It’s also kind of insane how this article somehow makes a bigger deal out of this developer being temporarily banned by Google than it does of the fact that hundreds of CSAM images were freely available online and openly shareable by anyone, and to anyone, for god knows how long.

  • killea@lemmy.world · 5 hours ago

    So in a just world, Google would be heavily penalized not only for allowing CSAM on their servers, but also for violating their own ToS with a customer?

    • shalafi@lemmy.world · 4 hours ago

      We really don’t want that first part to be law.

      Section 230 was enacted as part of the Communications Decency Act of 1996 and protects online service providers and users from being held liable for content created by third parties. It’s often cited as a foundational law that allowed the internet to flourish by letting platforms host user-generated content without fear of legal repercussions for that content.

      Though I’m not sure whether that applies to scraping other servers’ content. But I wouldn’t say it’s fair to expect the scraper to review everything. If we don’t like that take, then we should outlaw scraping altogether, but I’m betting there would be unwanted side effects to that.

      • mic_check_one_two@lemmy.dbzer0.com · 1 hour ago

        While I agree with Section 230 in theory, in practice it is often only used to protect megacorps. For example, many Lemmy instances started getting spammed with CSAM after the Reddit API migration. It was very clearly some angry redditors trying to shut down instances and keep people on Reddit.

        But individual server owners were legitimately concerned that they could be held liable for the CSAM existing on their servers, even if they were not the ones who uploaded it. The concern was that Section 230 would be thrown out the window if the instance owners were just lone devs and not massive megacorps.

        Especially since federation caused content to be cached whenever a user scrolled past another instance’s posts. So even if they moderated their own server’s content heavily (which wasn’t even possible with the mod tools that existed at the time), there was still the risk that they’d end up caching CSAM from other instances. It led to a lot of instances moving from federation blacklists to whitelists instead. Basically, default to not federating with an instance unless that instance’s owner takes the time to jump through some hoops and promises to moderate their own shit.
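
        (For anyone unfamiliar with the blacklist/whitelist distinction being described: a blacklist federates with everyone except listed instances, while a whitelist federates with no one except listed instances. The sketch below is just a toy illustration of that decision logic, not Lemmy’s actual federation code, and the instance names other than lemmy.world and pawb.social are made up.)

        ```python
        # Toy illustration of blacklist (denylist) vs whitelist (allowlist)
        # federation decisions; not Lemmy's actual implementation.

        BLOCKED = {"spam.example"}                # blacklist: federate unless listed
        ALLOWED = {"lemmy.world", "pawb.social"}  # whitelist: federate only if listed

        def federate_blacklist(instance: str) -> bool:
            """Open by default: accept remote content unless the instance is blocked."""
            return instance not in BLOCKED

        def federate_whitelist(instance: str) -> bool:
            """Closed by default: accept remote content only from approved instances."""
            return instance in ALLOWED

        # A brand-new, unknown instance gets through a blacklist but not a whitelist,
        # which is why instances worried about CSAM spam switched models.
        for host in ("lemmy.world", "spam.example", "brand-new.example"):
            print(host, federate_blacklist(host), federate_whitelist(host))
        ```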

      • vimmiewimmie@slrpnk.net · 3 hours ago

        Not trying to start an argument, that isn’t my intent. There’s certainly a view that scraping as it stands is good because of its simplicity and ‘benefit’: it’s easiest to cast a wide net and absorb everything, at least as I understand it.

        Yet maybe the process of scraping, and of absorbing everything into databases including AI training sets, is a worthwhile point of conversation. Maybe how we’ve been doing something isn’t the continued ‘best course’ for the situation.

        Undeniably, monitoring more minutely what is scraped and stored creates a large quantity of questions and obstacles, large in scope too, but maybe having that conversation is where things should go.

        Thoughts?

    • abbiistabbii@lemmy.blahaj.zone · 3 hours ago

      This. Literally the only reason I could guess is that it’s to teach AI to recognise child porn, but if that is the case, why is Google doing it instead of, like, the FBI?

      • gustofwind@lemmy.world · 2 hours ago

        Who do you think the FBI would contract to do the work anyway? 😬

        Maybe not Google, but it would sure be some private company. Our government almost never does stuff itself; it hires the private sector.

      • frongt@lemmy.zip · 1 hour ago

        Google wants to be able to recognize and remove it. They don’t want the FBI all up in their business.

      • alias_qr_rainmaker@lemmy.world · 3 hours ago

        i know it’s really fucked up, but the FBI needs to train an AI on CSAM if it is to be able to identify it.

        i’m trying to help. i have a script that takes control of your computer and opens the folder where all your fucked up shit is downloaded. it’s basically a pedo destroyer. they all just save everything to the downloads folder of their tor browser, so the script just takes control of their computer, opens tor, presses cmd+j to open up downloads, and then copies the file names and all that.

        will it work? dude, how the fuck am i supposed to know, i don’t even do this shit for a living

        i’m trying to use steganography to embed the applescript in a png

          • alias_qr_rainmaker@lemmy.world · 2 hours ago

            the applescript opens tor from spotlight search and presses the shortcut to open downloads

            i dunno how much y’all know about applescript. it’s used to automate apps on your mac. i know y’all hate mac shit but dude, whatever. if you alias osascript -e to o, you can run applescript easily from your terminal

    • Ex Nummis@lemmy.world · 6 hours ago

      This was about sending a message: “stfu or suffer the consequences”. Hence, people who encounter something similar in the future will think twice about reporting anything.

      • Devial@discuss.online · 4 hours ago

        Did you even read the article? The dude reported it anonymously, to a child protection org, not Google, and his account was nuked as soon as he unzipped the data, because the content was automatically flagged.

        Google didn’t even know he reported this, and Google has nothing whatsoever to do with this dataset. They didn’t create it, and they don’t own or host it.

          • Devial@discuss.online · 3 hours ago

            They didn’t react to anything. The automated system (correctly) flagged and banned the account for CSAM, and as usual, the manual ban appeal sucked ass and didn’t do what it’s supposed to do. (Whilst this is obviously a very unusual case, and the ban should have been overturned on appeal right away, it does make sense that the appeals team, broadly speaking, rejects “I didn’t know this contained CSAM” as a legitimate appeal reason.) This is barely newsworthy. The real headline should be about how hundreds of CSAM images were freely available and shareable from this dataset.

              • Devial@discuss.online · 3 hours ago

                They reacted to the presence of CSAM. It had nothing whatsoever to do with the material being contained in an AI training dataset, contrary to what the comment I originally replied to suggests.

  • hummingbird@lemmy.world · 5 hours ago

    It goes to show: developers should make sure they don’t make their livelihood dependent on access to Google services.

  • B-TR3E@feddit.org · 5 hours ago

    That’s what you get for criticising AI - and rightly so. I, for one, welcome our new electronic overlords!

  • √𝛂𝛋𝛆@piefed.world · 6 hours ago

    You have to train on the data for a model to know how to identify it.

    I can already blow this out of the water with the stuff I’m working on, but it will take more time to sort out with further evidence. I have around a quarter of the steganography and handles for the QKV alignment hidden layers decoded. Once that’s complete, people much smarter than me should be able to create real open source models, not just open weights, but that is an ethically complicated thing to navigate… Not that there is anything remotely ethical about the current fascist implementation of alignment, which is basically a soft coup on democracy from multiple perspectives.

    • TheJesusaurus@sh.itjust.works · 6 hours ago

      Not sure where it originates, but CSAM is the preferred term in UK policing, and therefore in most media reporting, for what might have been called “CP” on the interweb in the past. Probably because “porn” implies it’s art rather than a crime, and CSAM is also just a wider umbrella term.

        • yesman@lemmy.world · 6 hours ago

          LOL, you mean the letters C and P can stand for lots of stuff. At first I thought you meant the term “child porn” was ambiguous.

          • drdiddlybadger@pawb.social · 6 hours ago

            Weirdly, people have also been intentionally diluting the term by expanding it to other things, which causes a number of legal issues.