I assume they all crib from the same training sets, but surely one of the billion-dollar companies behind them could build its own?

  • Mugita Sokio@lemmy.today
    3 days ago

    This is due to the training sets, one of them being Common Crawl, which is disgusting. Chinese LLMs like DeepSeek R1 and Qwen 3 use a different set of training materials that is actually good, even though it is censored too.