• NuXCOM_90Percent@lemmy.zip
    1 day ago

    There is no such thing as an “ongoing ‘clean’ data source”, because the people paying for these models are actively using them on those platforms. Either they are too stupid to express their own thoughts and “use ChatGPT for coming up with ideas”, or they are running a bot farm.

    And the reality is that… it doesn’t actually matter. People are on the “dead internet theory” bandwagon again. But… think back over the past five or six years. How often have you had a meaningful conversation on ANY social media platform? It happens occasionally, but mostly the best you can hope for is someone who keys in on a single sentence you said, or who actively misrepresents your post because they wanted to tell a joke they have clearly been workshopping for the past few days. More often you get someone who just keys in on a single word and pastes a copypasta that they know will do well (see: basically any topic about “AI” on the fediverse…). And that is assuming you get a reply at all.

    Let alone all the people who pretend the internet used to be a wonderful place where you got the answer to everything instantly instead of worrying whether ChatGPT hallucinated. Those motha fuckers clearly never spent much time on Stack Overflow or the Intel message boards…

    “AI” didn’t learn how to pass the Voight-Kampff test. Humanity now does so poorly on it that we have to grade on a curve.

    So as for “sloppy” models? We are probably less than a year out before influencers start bragging that they love ShitGPT 7 because “it isn’t pretentious. It talks like me”. Sorry “Others disingenuous. ShitGPT speak truth gooder. Like, comment, subscribe and use my affilly linkle”

    • brucethemoose@lemmy.world
      1 day ago

      I mean, fair. You’re preaching to the choir there. People look back on the ‘old’ internet with some incredible rose-tinted glasses, 100%.

      > So as for “sloppy” models? We are probably less than a year out before influencers start bragging that they love ShitGPT 7 because “it isn’t pretentious. It talks like me”. Sorry “Others disingenuous. ShitGPT speak truth gooder. Like, comment, subscribe and use my affilly linkle”

      Have you seen the AI acolyte youtubers? Or /r/OpenAI? We’re already there, heh, and it’s even weirder than that.

      …That being said, there is an earnest interest in non-“sloppy” models and training. For instance, there’s this long-running thread that digs through old releases to find the one that’s least deep-fried (newer releases are increasingly so): https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0/discussions/15#6910bfd226329b755d084c69

      Or efforts to objectively measure slop, and create a slop ‘taxonomy’ tree from all the models training on each other: https://eqbench.com/creative_writing.html
