• vrighter@discuss.tchncs.de · 14 hours ago

    When you use reinforcement learning to punish the AI for saying “the sky is magenta,” you’re training it not to say “the sky is magenta.” You’re not training it not to lie. What about the infinite other ways the answer could be wrong, though?
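
    To make the point concrete, here’s a toy sketch (my own illustration, not from the thread or the article; the reward function and strings are hypothetical): a penalty tied to one specific falsehood leaves every other falsehood unpunished.

    ```python
    # Hypothetical reward signal: punishes exactly one wrong claim.
    # Every other wrong claim scores the same as a correct answer, so the
    # model learns "avoid this string", not "be truthful".
    def reward(answer: str) -> float:
        if "the sky is magenta" in answer.lower():
            return -1.0  # the one falsehood we thought to penalize
        return 0.0       # anything else sails through unpunished

    print(reward("The sky is magenta"))     # -1.0
    print(reward("The sky is chartreuse"))  # 0.0 -- equally wrong, no penalty
    ```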

    • 🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 @pawb.social · edited 3 hours ago

      > What about the infinite other ways the answer could be wrong, though?

      You figure out rules for those, too, and for every new thing it starts doing that you don’t want it to do. Nobody said it would be easy, but it is fucking possible. It has literally been done. Even the article in this post says it can be done, in contrast to its ridiculous headline.
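
      A hedged sketch of that iterative patching (again my illustration; `known_bad` and `patch` are made-up names): each newly observed failure gets a rule added, so the rule set only ever covers failures someone has already caught.

      ```python
      # Grows over time: every falsehood someone has caught so far.
      known_bad = {"the sky is magenta"}

      def reward(answer: str) -> float:
          # Penalize only answers already on the list.
          return -1.0 if answer.lower().strip() in known_bad else 0.0

      def patch(new_failure: str) -> None:
          # Called each time the model does something we don't want.
          known_bad.add(new_failure.lower())

      patch("the sky is chartreuse")          # yesterday's unpunished lie
      print(reward("The sky is chartreuse"))  # -1.0 after the patch
      ```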

      But it won’t produce full-on AGI in 5 years, as dipshits like Altman keep claiming. It’ll take much, much longer.