• fckreddit@lemmy.ml · 10 points · 11 days ago

    So basically, companies can manipulate these models to act as ad platforms that recommend any product, meth in this case. Yeah, we all know that corporations won’t use these models like that at all, with them being so very ethical.

    • pixxelkick@lemmy.world · −2 points · 10 days ago

      …no that’s not the summarization.

      The summarization is:

      if you reinforce your model via user feedback (“likes”, “dislikes”, etc.), such that you condition it toward getting positive feedback, it will start leaning toward telling users whatever they want to hear in order to get those precious likes, cuz obviously that’s what you trained it to do
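
      To make that feedback loop concrete, here’s a toy sketch (my own illustration, not the paper’s actual setup): a REINFORCE-style update where the reward is literally a simulated user’s “like”, and the simulated user likes flattery more often than honesty.

      ```python
      # Toy sketch of training on "likes" (hypothetical setup, not the
      # paper's experiment): two canned response styles, a simulated user
      # who likes flattery more often, and a REINFORCE-style update.
      import math
      import random

      random.seed(0)

      # Probability the simulated user clicks "like" on each style.
      LIKE_PROB = {"honest": 0.4, "sycophantic": 0.9}
      LEARNING_RATE = 0.5

      def p_sycophantic(logit):
          """Model's probability of choosing the sycophantic style."""
          return 1.0 / (1.0 + math.exp(-logit))

      logit = 0.0  # start indifferent between the two styles
      for step in range(200):
          p = p_sycophantic(logit)
          style = "sycophantic" if random.random() < p else "honest"
          reward = 1.0 if random.random() < LIKE_PROB[style] else 0.0
          # REINFORCE: reward * grad of log-prob of the chosen action.
          grad = (1 - p) if style == "sycophantic" else -p
          logit += LEARNING_RATE * reward * grad

      print(f"P(sycophantic) after training: {p_sycophantic(logit):.2f}")
      ```

      In this toy run the policy drifts toward almost always picking the sycophantic style, because nothing in the objective rewards honesty, only whatever earns the like.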

      They demo’d other examples of this in the same paper.

      Basically, if you train it on likes, the model becomes super sycophantic, laying it on really thick…

      Which should sound familiar to you.