How often does your LLM lie to you?

Crescent Baddie@sh.itjust.works · 2 days ago

How often does your LLM lie to you?

HumanPerson@sh.itjust.works · 2 days ago

Always. That is a known issue with AI that has to do with explainability. Basically, if you’re familiar with the general idea of neural networks, we don’t really understand the hidden layers, so we can’t tell whether they “know” something, and that means we can’t train them to answer differently depending on whether they do or don’t. They are still statistical models that are functionally always guessing.

Could you post the link to the sleeper agent thing?

Crescent Baddie@sh.itjust.works · 2 days ago

Here’s the video I actually watched about the sleeper agents:

https://www.youtube.com/watch?v=wL22URoMZjo

𝕛𝕨𝕞-𝕕𝕖𝕧@lemmy.dbzer0.com · 17 hours ago

robert miles is an alignment and safety researcher and a pretty big name in that field.

he has a tendency to make things sound scary but i don’t think he’s trying to put you off of machine learning. he just wants people to understand that this technology is similar to nuclear technology in the sense that we must avert disaster with it before it happens because the costs of failure are simply too great and irreversible. we can’t take the planet back from a runaway skynet, there isn’t a do-over button.

you’re kind of misunderstanding him and the point he’s trying to get across, i think. the issues he’s talking about here with sleeper agents and model alignment are of virtually no concern to you as an end user of LLMs. these are more concerns for people researching, developing, and training models to be cognizant of… if everyone does their job properly you shouldn’t need to worry about any of this at all unless it actually interests you. if that’s the case, let me know, i can share good sources with you for expanding your knowledge!

HumanPerson@sh.itjust.works · 2 days ago

I wouldn’t stop using AI completely over that. I generally don’t trust it with anything that important anyway.

DrDystopia@lemy.lol · 2 days ago

Could you post the link to the sleeper agent thing?

https://www.youtube.com/watch?v=Z3WMt_ncgUI

https://arxiv.org/abs/2401.05566