• humanspiral@lemmy.ca · 2 days ago

    I wouldn’t bet too hard against NVIDIA. Sure, their margins/extortion pricing can come down if there are fewer customers, but LLMs are here to stay. What is nearly impossible is datacenters and mediocre models (OpenAI) getting a good ROI on the hardware they buy from NVIDIA. US models tend to concentrate on megalithic US military Skynet ambitions, and every release is a step towards Skynet. Open models, mostly from China, tend to be smaller (in GPU/memory requirements) but have better quality/cost ratios, including running on accessible (non-datacenter) hardware.

    It’s the datacenter GPU customers, and the mediocre software/LLM companies renting or owning them, that carry the huge risk. At the same time, US empire bankster allies will keep investing for Skynet.

    • Norah (pup/it/she)@lemmy.blahaj.zone · 2 days ago

      LLMs don’t benefit from economies of scale. Usually, each successive generation of a technology is cheaper to produce, or stays the same but with much greater efficiency/power/efficacy/etc. For LLMs, each successive generation costs much more to produce for lesser and lesser benefits.

      • humanspiral@lemmy.ca · 1 day ago

        > LLMs don’t benefit from economies of scale.

        For training, compute and memory scale do matter, including large networked clusters of GPUs. But no money is made in training. For inference (where money is made/charged or benefits are obtained), memory matters more, though compute is still extremely important. At the Skynet level, models over 512 GB are used; at the consumer level, and every level below it, smaller models are much faster. 16 GB, 24 GB, 32 GB, 96 GB, 128 GB and 512 GB are each somewhat approachable thresholds, and each of them is some version of scale (rough sizing sketch below).
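        A minimal weights-only sizing sketch of those tiers, assuming typical 4-bit or 16-bit quantization; the parameter counts and bit-widths are illustrative assumptions, not specs for any particular model:

        ```python
        # Rough weights-only sizing sketch for the memory tiers mentioned above.
        # Parameter counts and bit-widths are illustrative assumptions, not specs.

        TIERS_GB = [16, 24, 32, 96, 128, 512]

        def weights_gb(params_billion: float, bits_per_weight: float) -> float:
            """Memory for the weights alone (ignores KV cache and activations)."""
            return params_billion * 1e9 * bits_per_weight / 8 / 1e9

        def smallest_tier(params_billion: float, bits_per_weight: float):
            need = weights_gb(params_billion, bits_per_weight)
            return next((t for t in TIERS_GB if t >= need), None)

        for params, bits in [(8, 4), (32, 4), (70, 4), (70, 16), (671, 4), (1000, 16)]:
            need = weights_gb(params, bits)
            tier = smallest_tier(params, bits)
            where = f"fits the {tier} GB tier" if tier else "needs datacenter-scale memory"
            print(f"{params}B @ {bits}-bit ≈ {need:.0f} GB of weights -> {where}")
        ```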

        > each successive generation of a technology is cheaper to produce, or stays the same but with much greater efficiency/power/efficacy/etc.

        Look at the GPU makers’ roadmaps, taking NVIDIA only for simplicity: Rubin will have 5 times the bandwidth, double the memory, and at least double the compute, for what is likely 2x the cost and less than 2x the power. A big issue with bubble status is the fairly sharp depreciation this causes in existing leading-edge devices. And bigger memory alone is always a faster overall solution than networking/connections (see the sketch below).
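        To illustrate that last point, here is a back-of-envelope, bandwidth-bound estimate of decode speed when part of a model has to be streamed over a slower link (offloading) instead of sitting entirely in fast local memory. All figures are assumed round numbers, not measurements of any particular GPU or interconnect:

        ```python
        # Bandwidth-bound decode sketch: weights that fit in fast local memory are
        # read at HBM speed; anything offloaded must be streamed over a much slower
        # link for every token. All numbers are assumed round figures.

        MODEL_GB = 200      # assumed weight footprint read once per decoded token
        HBM_GBPS = 4000     # assumed local memory bandwidth
        LINK_GBPS = 100     # assumed effective offload/interconnect bandwidth

        def tokens_per_sec(local_gb: float) -> float:
            local_time = min(MODEL_GB, local_gb) / HBM_GBPS
            remote_time = max(0, MODEL_GB - local_gb) / LINK_GBPS
            return 1 / (local_time + remote_time)

        print("all 200 GB in local memory:", round(tokens_per_sec(512)), "tok/s")    # ~20
        print("half streamed over a link :", round(tokens_per_sec(100), 1), "tok/s")  # ~1
        ```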

        > For LLMs, each successive generation costs much more to produce for lesser and lesser benefits.

        Bigger-parameter models are slower to train on the same data sets than smaller-parameter models (rough arithmetic below). Skynet ambitions do involve ever larger parameter counts, and sure, more training data keeps being added rather than any removed. There is generation-to-generation innovation on the smaller/efficiency side too, though the Skynet funding goes to the former.
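        A minimal sketch of that cost arithmetic, using the common rule of thumb that training compute is roughly 6 × parameters × tokens; the token count and cluster throughput below are assumed purely for illustration:

        ```python
        # Why bigger-parameter models are slower to train on the same data, using
        # the common rule of thumb: training compute ≈ 6 * parameters * tokens.
        # Token count and cluster throughput are assumed for illustration only.

        TOKENS = 15e12            # fixed training set, ~15T tokens (assumed)
        CLUSTER_FLOPS = 1e19      # sustained cluster throughput in FLOP/s (assumed)

        def training_days(params: float) -> float:
            return 6 * params * TOKENS / CLUSTER_FLOPS / 86400

        for params in (8e9, 70e9, 400e9):
            print(f"{params/1e9:.0f}B params -> ~{training_days(params):.0f} days on the same data")
        ```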

    • bobalot@lemmy.world (OP) · 2 days ago

      None of the AI providers actually makes a profit on searches, and the marginal cost per user and per search isn’t dropping.