• keepthepace@slrpnk.net · 4 days ago

    This is only true for the basic pre-training of the base model. The later fine-tuning stage (I used to call it RLHF, but many different techniques exist now) is what teaches the model the baseline expectations. Despite having 4chan in their training set, you will never see modern LLMs spontaneously generate edgy racist shit, not because they can’t but because they learnt that this is not the expected output.

    Similarly with code: base models would by default produce average code, but fine-tuning teaches them that only output of the highest standard is expected. I can guarantee you that the code LLMs produce is much higher quality (at the superficial level) than the average code on GitHub: documentation on all functions, error codes and exceptions handled correctly, special cases dealt with whenever they are identified…
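
    To make that concrete, here is a minimal Python sketch of the style being described — a hypothetical `parse_port` helper of my own invention, not taken from any model's actual output, showing the docstring, explicit error handling, and special-case checks that fine-tuned models tend to emit by default:

    ```python
    def parse_port(value: str) -> int:
        """Parse a TCP/UDP port number from a string.

        Args:
            value: The text to parse, e.g. "8080".

        Returns:
            The port as an int in the range 1-65535.

        Raises:
            ValueError: If the input is not a valid port number.
        """
        # Special case: reject empty or whitespace-only input early.
        if not value.strip():
            raise ValueError("port must be a non-empty string")

        try:
            port = int(value.strip())
        except ValueError as exc:
            # Re-raise with a clearer message rather than a bare traceback.
            raise ValueError(f"invalid port {value!r}: not an integer") from exc

        # Special case: 0 is reserved and anything above 65535 does not exist.
        if not 1 <= port <= 65535:
            raise ValueError(f"port {port} out of range 1-65535")

        return port
    ```

    A lot of average human-written GitHub code would be the one-liner `int(value)` with none of the documentation or range checks.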