Mine tries to lie whenever it doesn’t know something. I’ll call it out and say that’s a lie, and it will say “you are absolutely correct”. tf.

I was reading up on sleeper agents placed inside local LLMs, and it’s increasing the chance I’ll delete mine forever. Which is a shame, because it’s the new search engine, seeing how they ruined search engines.

  • HumanPerson@sh.itjust.works · English · 4 upvotes · 2 days ago

    Always. That is a known issue with AI that has to do with explainability. Basically, if you’re familiar with the general idea of neural networks, we don’t really understand the hidden layers, so we can’t tell whether they “know” something. That means we can’t train them to give different answers depending on whether they do or don’t. They are still statistical models that are functionally always guessing.
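
    To make the “always guessing” point concrete, here’s a toy sketch. It assumes a made-up four-word vocabulary and hand-picked scores (nothing here comes from a real model): the output step turns scores into probabilities and picks a token either way, so a near-coin-flip still comes out looking like an answer.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(vocab, logits):
    """Greedy decoding: return the most likely token and its probability."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]

# Hypothetical vocabulary and scores, chosen only for illustration.
vocab = ["Paris", "Lyon", "Berlin", "Madrid"]
confident_logits = [9.0, 2.0, 1.0, 0.5]  # one option clearly dominates
unsure_logits = [1.1, 1.0, 1.0, 0.9]     # nearly a four-way tie

confident = pick_token(vocab, confident_logits)
unsure = pick_token(vocab, unsure_logits)

print(confident)  # same token, very high probability
print(unsure)     # same token, barely above a 1-in-4 guess
```

    Both calls print the same word. The probability differs a lot, but nothing in this step says “I don’t know”; it just emits the top guess.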

    Could you post the link to the sleeper agent thing?

      • 𝕛𝕨𝕞-𝕕𝕖𝕧@lemmy.dbzer0.com · English · 3 upvotes · 17 hours ago

        robert miles is an alignment and safety researcher and a pretty big name in that field.

        he has a tendency to make things sound scary but i don’t think he’s trying to put you off of machine learning. he just wants people to understand that this technology is similar to nuclear technology in the sense that we must avert disaster with it before it happens because the costs of failure are simply too great and irreversible. we can’t take the planet back from a runaway skynet, there isn’t a do-over button.

        you’re kind of misunderstanding him and the point he’s trying to get across, i think. the issues he’s talking about here with sleeper agents and model alignment are of virtually no concern to you as an end user of LLMs. these are more concerns for people researching, developing, and training models to be cognizant of… if everyone does their job properly you shouldn’t need to worry about any of this at all unless it actually interests you. if that’s the case, let me know, i can share good sources with you for expanding your knowledge!