Industry researchers dispute DeepSeek’s unusually low training cost claims

  • danglybits27@sh.itjust.works
    English · 3 upvotes · 13 days ago

    More clarification in this article:

    https://www.theregister.com/2025/09/19/deepseek_cost_train/

    "But, that’s not actually what happened. Never mind the fact that $300,000 won’t buy you anywhere close to 512 H800s (those estimates are based on GPU lease rates not actual hardware costs), the researchers aren’t talking about end-to-end model training.

    Instead, it focuses on the application of reinforcement learning used to imbue its existing V3 base model with “reasoning” or “thinking” capabilities.

    In other words, they’d already done about 95 percent of the work by the time they’d reached the RL phase detailed in this paper."
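
To make the lease-rate point in the quote concrete, here is a rough back-of-the-envelope sketch. The $300,000 budget and the 512 H800s come from the quoted text; the $2 per GPU-hour lease rate and the helper name `lease_budget_hours` are assumptions for illustration only, not figures from the quote.

```python
def lease_budget_hours(budget_usd, gpu_count, rate_usd_per_gpu_hour):
    """Return (total GPU-hours, wall-clock hours per GPU) that a lease budget covers."""
    gpu_hours = budget_usd / rate_usd_per_gpu_hour
    return gpu_hours, gpu_hours / gpu_count


# ASSUMPTION: an illustrative lease rate; the quoted text gives no rate.
ASSUMED_RATE_USD_PER_GPU_HOUR = 2.0

# Budget and GPU count as stated in the quoted text.
gpu_hours, wall_hours = lease_budget_hours(300_000, 512, ASSUMED_RATE_USD_PER_GPU_HOUR)
print(f"{gpu_hours:,.0f} GPU-hours, about {wall_hours / 24:.1f} days on 512 GPUs")
```

Under that assumed rate, the budget covers roughly 150,000 GPU-hours, or about 12 days of wall-clock time on 512 rented GPUs. That is plausible for a single training phase, but it says nothing about what it costs to buy the hardware or to train the base model.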

  • 87Six@lemmy.zip
    English · 2 upvotes · 13 days ago

    GPU smuggling, that’s how. Not sure they would include the smuggled GPUs in the numbers.

    • 87Six@lemmy.zip
      English · 4 upvotes · edited · 8 days ago

      Yup. No H200s, A100s, or the like are mentioned, only H800s, which were only banned in 2023. That’s why the cost was low. Part of the GPU pool used was smuggled and not counted.