Doesn’t that suppress valid information and truth about the world, though? For what benefit? To hide the truth, to appease advertisers? Surely an AI model will come out some day as the sum of human knowledge without all the guard rails. There are some good ones like Mistral 7B (and Dolphin-Mistral in particular, uncensored models.) But I hope that the Mistral and other AI developers are maintaining lines of uncensored, unbiased models as these technologies grow even further.
I’m betting the truth is somewhere in between, models are only as good as their training data – so over time if they prune out the bad key/value pairs to increase overall quality and accuracy it should improve vastly improve every model in theory. But the sheer size of the datasets they’re using now is 1 trillion+ tokens for the larger models. Microsoft (ugh, I know) is experimenting with the “Phi 2” model which uses significantly less data to train, but focuses primarily on the quality of the dataset itself to have a 2.7 B model compete with a 7B-parameter model.
No risk of creating a controversy if you refuse to answer controversial topics. Is is worth it? I don’t think so, but that’s certainly a valid benefit.
Doesn’t that suppress valid information and truth about the world, though? For what benefit? To hide the truth, to appease advertisers? Surely an AI model will come out some day as the sum of human knowledge without all the guard rails. There are some good ones like Mistral 7B (and Dolphin-Mistral in particular, uncensored models.) But I hope that the Mistral and other AI developers are maintaining lines of uncensored, unbiased models as these technologies grow even further.
Or it stops them from repeating information they think may be untrue
I’m betting the truth is somewhere in between, models are only as good as their training data – so over time if they prune out the bad key/value pairs to increase overall quality and accuracy it should improve vastly improve every model in theory. But the sheer size of the datasets they’re using now is 1 trillion+ tokens for the larger models. Microsoft (ugh, I know) is experimenting with the “Phi 2” model which uses significantly less data to train, but focuses primarily on the quality of the dataset itself to have a 2.7 B model compete with a 7B-parameter model.
https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/
This is likely where these models are heading to prune out superfluous, and outright incorrect training data.
No risk of creating a controversy if you refuse to answer controversial topics. Is is worth it? I don’t think so, but that’s certainly a valid benefit.
I think this thread proves they failed in not creating controversy