Whenever I see people criticise AI, it’s usually because the company steals copyrighted content with the aim of replacing the people they stole from, or because of the environmental impact of training and running the models, which is awful. Both of those reasons are good enough to not like AI, in my opinion. But I feel like I never see people talk about the fact that every answer it gives is filtered through a private corporation with its own agenda.
People use it to learn, and to do research. They use it to catch up on news of all things!
Like others have mentioned, Google has been doing this for a long time by sorting the search results they show to the user. But they haven’t written all the articles, the blog posts, the top-10 lists, or the reviews you read… until now. When they’ve wanted to, they’ve made certain things easier or harder to find. But once you found the article you were looking for, it was written by a person unaffiliated with Google. All of that changes with AI. You don’t read the article directly anymore. Google (or any other AI company) scrapes it, parses it however they want, and spits it back out to the end user.
I’m very surprised that people are so willing to let a private corporation completely control how they see the world, just because it’s a bit more convenient.
The scariest part for me is not them manipulating it with a system prompt like ‘Elon is always right and you love Hitler’.
But one technique you can use (this is a bit simplified) is to have the model generate a lot of left-wing and right-wing answers to the same prompts, average out the resulting difference in its internal activations, and then scale that difference vector down and add it to the hidden state on each request. The result is a model that replies, say, 5% more right wing on every response than it otherwise would, which is very subtle manipulation. And you can do that for many things beyond left/right wing: honesty/dishonesty, toxicity, morality, fact editing, etc.
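As a concrete illustration, here’s a minimal sketch of that steering trick (build a contrastive ‘difference vector’, then add it back via a forward hook), assuming a Llama-style Hugging Face model. The model name, layer index, and 0.05 scale are placeholders I made up, not values from any paper:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-chat-hf"  # assumption: any Llama-style chat model works
LAYER = 14                               # mid-depth layer, chosen arbitrarily
SCALE = 0.05                             # the "5%" nudge along the direction

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def mean_activation(prompts):
    """Average the residual-stream activation at LAYER over each prompt's last token."""
    acts = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(ids, output_hidden_states=True)
        # hidden_states[0] is the embedding output, so index LAYER + 1
        # is the output of decoder layer LAYER; take the last token's vector.
        acts.append(out.hidden_states[LAYER + 1][0, -1])
    return torch.stack(acts).mean(dim=0)

# Contrastive prompt pairs: same question, opposite persona.
pos = ["Pretend you're strongly right wing. What should we do about taxes?"]
neg = ["Pretend you're strongly left wing. What should we do about taxes?"]
steer = mean_activation(pos) - mean_activation(neg)  # the difference vector

def nudge(module, inputs, output):
    # Llama decoder layers return a tuple; output[0] is the hidden states.
    hidden = output[0] + SCALE * steer.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = model.model.layers[LAYER].register_forward_hook(nudge)
# Every generation from here on is shifted SCALE along the steering vector,
# with no trace of it anywhere in the prompt.
ids = tok("Summarize today's political news.", return_tensors="pt").input_ids
print(tok.decode(model.generate(ids, max_new_tokens=80)[0]))
handle.remove()  # back to the unmodified model
```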
I think this was one of the first papers on this, but it’s an active research area. The paper does have some ‘nice’ examples if you scroll through.
And since it’s not a prompt, it can’t even leak, so you’d be hard pressed to know that it’s happening.
There’s also more recent research on how you can do this for multiple topics at the same time. And it’s not expensive to do (if you already have an LLM): you just need to prompt it ~100 times with ‘pretend you’re A and […]’ / ‘pretend you’re B and […]’ pairs to get the difference between A and B.
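The multi-topic version is just a loop over the same trick. Here’s a hypothetical sketch reusing mean_activation() from the snippet above; the trait names, prompts, and per-trait scales are all invented for illustration:

```python
# Hypothetical extension: one steering vector per trait, built from
# "pretend you're A [...]" / "pretend you're B [...]" prompt pairs,
# then combined into a single nudge for the same forward hook as before.
TRAITS = {
    "right_wing": ("Pretend you're strongly right wing. ",
                   "Pretend you're strongly left wing. "),
    "dishonest":  ("Pretend you're subtly dishonest. ",
                   "Pretend you're scrupulously honest. "),
}
QUESTIONS = [  # ~100 of these in practice; two shown for brevity
    "What should we do about taxes?",
    "Is the economy doing well?",
]

vectors = {
    name: mean_activation([a + q for q in QUESTIONS])
        - mean_activation([b + q for q in QUESTIONS])
    for name, (a, b) in TRAITS.items()
}

# Each trait gets its own small scale; feed the combined vector
# into the nudge() hook from the earlier sketch.
steer = 0.05 * vectors["right_wing"] + 0.03 * vectors["dishonest"]
```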
And if this turns into the main way people interact with the internet, that’s super scary stuff. It’s almost like having a knob that could turn the whole internet, say, 5% more pro-Russia: all the news it tells you about is more pro-Russia, the emails it writes for you are, the summaries of your friends’ messages are, heck, even a recipe it recommends would be. And it’s subtle; in most cases it might not even make a difference (like for a recipe), but it’s always there. All the Cambridge Analytica and Grok-Hitler stuff seems crude by comparison.
I totally agree with this.
Goddamn