AMD prepares Ryzen 7 9850X3D and Ryzen 9 9950X3D2 CPUs with higher clocks and full 3D V-Cache on all cores. See what improvements are coming.

  • Glitchvid@lemmy.world
    23 hours ago

    Honestly, these don’t make much sense as a consumer product. If your game is doing latency-sensitive work across the CCDs, then you’re wasting the V-Cache; the only real way to see a benefit would be running two separate games…

    Which is the kind of thing you would do on a game server, so this seems ideal as an EPYC 4005 product instead.

    • fonix232@fedia.io
      20 hours ago

      It actually makes some sense.

      On my 7950X3D setup the main issue was always making sure to pin games to a specific CCD, and AMD’s tooling is… quite crap at that aspect. Identifying the right CCD was always problematic for me.

      Eliminating this by adding V-Cache to both CCDs, so it doesn’t matter which one you pin it to, is a good workaround. And IIRC V-Cache also helps certain (local) AI workflows as well, meaning running a game next to such a model won’t cause issues, as each gets its own CCD to run on.
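For anyone who wants to do the CCD identification by hand on Linux, here is a minimal sketch, not AMD's tooling. It assumes the kernel exposes cache topology under `/sys/devices/system/cpu/cpu*/cache/index*/` (files `level`, `shared_cpu_list`, `size`) and that the V-Cache CCD is simply the one reporting the larger L3, as on a 7950X3D. The helper names (`parse_cpu_list`, `size_to_kib`, `l3_domains`, `largest_l3_cpus`) are made up for illustration.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Turn a kernel CPU list such as "0-7,16-23" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus


def size_to_kib(text):
    """Convert a cache size string such as "98304K" or "96M" to KiB."""
    text = text.strip().upper()
    units = {"K": 1, "M": 1024}  # assumption: only these suffixes appear
    return int(text[:-1]) * units[text[-1]]


def l3_domains(sysfs_root="/sys/devices/system/cpu"):
    """Map each set of CPUs sharing an L3 to that L3's size in KiB."""
    domains = {}
    for cache in Path(sysfs_root).glob("cpu[0-9]*/cache/index*"):
        try:
            if (cache / "level").read_text().strip() != "3":
                continue
            cpus = parse_cpu_list((cache / "shared_cpu_list").read_text())
            domains[frozenset(cpus)] = size_to_kib((cache / "size").read_text())
        except OSError:
            continue  # offline CPU or cache info not exposed
    return domains


def largest_l3_cpus(domains):
    """Return the CPUs behind the biggest L3 (assumed to be the V-Cache CCD)."""
    return set(max(domains, key=domains.get))


if __name__ == "__main__":
    domains = l3_domains()
    if domains and hasattr(os, "sched_setaffinity"):
        target = largest_l3_cpus(domains)
        os.sched_setaffinity(0, target)  # 0 = the calling process
        print("pinned to CPUs", sorted(target))
```

Child processes inherit the affinity mask, so a launcher script could call this and then start the game. With V-Cache on both CCDs the two L3 sizes would presumably be equal, and the "which one is it" step stops mattering.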

      • brucethemoose@lemmy.world
        14 hours ago

        > V-Cache also helps certain (local) AI workflows as well

        I’m not aware of any, other than absolutely tiny embedding models, maybe. Local ML stuff is usually limited by compute or RAM bandwidth, and doesn’t really fit in expanded L3.

        AV1 encoding does love V-Cache, last I checked. And like you said, it’s potentially good for ‘conserving’ RAM bandwidth in mixed scenarios, though keep in mind that the CCDs can access each other’s L3.

        • fonix232@fedia.io
          9 hours ago

          AI workflows aren’t limited to LLMs, you know.

          For example, TTS and STT models are usually small enough (15-30MB) to be loaded directly into V-Cache. I was thinking of such small-scale local models, especially when you consider AMD’s recent forays into providing a mixed-environment runtime for their hardware (the GAIA framework, which can dynamically run your ML models on CPU, NPU and GPU, all automagically).

            • fonix232@fedia.io
              8 hours ago

              No worries mate, we can’t all be experts in every field and every topic!

              Besides, there are other AI models that are relatively small and depend on processing power more than RAM. For example, there’s a bunch of audio analysis tools that don’t just transcribe information but also diarise it (split it up by speaker), extract emotional metadata (e.g. certain models can detect sarcasm quite well, others spot general emotions like happiness, sadness or anger), and so on. Image categorisation models are also super tiny, though usually you’d want to load them into the DSP-connected NPU of appropriate hardware (e.g. a newer “smart” CCTV camera would use an SoC with an NPU to load detection models into, and do the processing for detecting people, cars, animals, etc. onboard instead of on your NVR).

              Also, by my count, even somewhat larger training systems, such as micro wakeword training, would fit into the 196MB V-Cache.

              • brucethemoose@lemmy.world
                8 hours ago

                Exactly! Not my area of expertise, heh.

                There might even be niches in LLM land, like Mamba SSM states, really tiny draft models, or other “cache”-type things that fit into so much L3. This might already be the case with the EPYC/TR stuff some homelab folks use.

                It makes me wonder if the old AMD 6800 XT (with its 128MB of cache) would be good at this sort of “small model” thing.

    • UnfortunateShort@lemmy.world
      22 hours ago

      I wonder if we will ever figure out something to do with computers other than gaming. Then we would finally have a use for the countless >8-core CPUs released for some weird reason.

      • brucethemoose@lemmy.world
        14 hours ago

        I mean, big V-Cache CPUs are already great for max-quality encoding, media processing, niche rendering and some other workloads. Software dev loves lots of cores.

        I’d certainly like a bigger CPU for hybrid inference too.

        Folks run all sorts of weird stuff on CPUs.

      • Glitchvid@lemmy.world
        21 hours ago

        The point here is that two V-Cache CCDs just don’t make sense for consumer use cases.

        Because of the latency topology, you’d only see a benefit if both CCDs were running independent programs that are more sensitive to latency than to compute throughput. That’s a very niche subset of uses and, like I said, ideal for a GSP (game server provider) deployment. This should be a product, but in the EPYC 4005 family.

        Otherwise, you’re better off with some sort of heterogeneous topology: one can imagine an 8-core V-Cache CCD paired with a 16 c-core CCD (this is very roughly what Intel is pursuing with upcoming products), which offers compelling utility even in the consumer space.

        • UnfortunateShort@lemmy.world
          2 hours ago

          That or you run properly designed software. Pinning threads to CPUs and separating data isn’t magic. Claiming “this is useless for [extremely broad group of people]” is just a weird and ignorant statement. There are literally people running servers at home.
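To make "pinning threads and separating data" concrete, a rough sketch of the idea follows. It assumes Linux and Python 3.8+. Splitting the available CPUs into two halves is only a stand-in for real CCD boundaries, and `halves`, `worker` and `run_partitioned` are hypothetical names. CPython's GIL means these threads won't actually run in parallel, so treat it as an illustration of the affinity calls rather than a benchmark.

```python
import os
import threading


def halves(cpus):
    """Split a CPU set into two groups (a stand-in for two CCDs)."""
    ordered = sorted(cpus)
    mid = max(1, len(ordered) // 2)
    first, second = set(ordered[:mid]), set(ordered[mid:])
    return first, (second or first)  # single-CPU machines reuse the group


def worker(cpus, data, results, slot):
    """Pin the calling thread to `cpus`, then work only on its own data."""
    if hasattr(os, "sched_setaffinity"):
        # On Linux a thread ID can be passed where a PID is expected.
        os.sched_setaffinity(threading.get_native_id(), cpus)
    results[slot] = sum(x * x for x in data)


def run_partitioned(data_a, data_b):
    """Run two workers, each with its own CPU group and its own data."""
    if hasattr(os, "sched_getaffinity"):
        available = os.sched_getaffinity(0)
    else:
        available = {0}  # non-Linux fallback: no pinning happens anyway
    group_a, group_b = halves(available)
    results = [None, None]
    threads = [
        threading.Thread(target=worker, args=(group_a, data_a, results, 0)),
        threading.Thread(target=worker, args=(group_b, data_b, results, 1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_partitioned(range(1000), range(1000, 2000)))
```

Neither worker touches the other's data, so each working set can stay in the L3 of whichever CCD its thread is pinned to.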

          • Glitchvid@lemmy.world
            47 minutes ago

            > There are literally people running servers at home.

            Which is why I’ve said three times now it should be an EPYC 4005 product.