Total noob to this space, correct me if I’m wrong. I’m looking at getting new hardware for inference and I’m open to AMD, NVIDIA or even Apple Silicon.

It feels like consumer hardware gives you comparatively more value for generating images than for running chatbots. Like, the models you can run at home are just dumb to talk to. But they can generate images of comparable quality to the online services if you're willing to wait a bit longer.

Like, GPT-OSS 120B, assuming you can spare 80GB of memory, is still not GPT-5. But Flux Schnell is still Flux Schnell, right? So if diffusion is the thing, NVIDIA wins right now.
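
For reference, local image generation really is only a few lines these days. A rough sketch using the Hugging Face diffusers library with FLUX.1-schnell (assuming a CUDA-capable card with enough VRAM; the 4-step / no-CFG settings follow the public model card, and CPU offload is just one way to trade speed for memory):

```python
import torch
from diffusers import FluxPipeline

# Load FLUX.1-schnell in bfloat16 and offload idle submodules to CPU
# so it fits on a consumer card at the cost of some speed.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

image = pipe(
    "a photo of a red fox in the snow",
    guidance_scale=0.0,       # schnell is distilled, so no classifier-free guidance
    num_inference_steps=4,    # schnell is tuned for ~4 steps
    max_sequence_length=256,
).images[0]
image.save("fox.png")
```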

Other options might even be better for other uses, but chatbots are comparatively hard to justify. Maybe for more specific cases like low-latency code completion or building a voice assistant, I guess.
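
For the code-completion case, the usual pattern is to run a small local model behind an OpenAI-compatible server (llama.cpp's llama-server, Ollama, and similar tools all expose one) and point your editor or tooling at it. A minimal sketch, assuming such a server is already listening on localhost:8080 (the port and model name are placeholders):

```python
from openai import OpenAI

# Point the standard OpenAI client at a local server instead of the cloud.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="local-model",  # most local servers ignore or loosely match this name
    messages=[
        {"role": "system", "content": "Complete the user's code. Reply with code only."},
        {"role": "user", "content": "def fibonacci(n):"},
    ],
    max_tokens=128,
    temperature=0.2,
)
print(resp.choices[0].message.content)
```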

Am I too off the mark?

  • afk_strats@lemmy.world · 23 hours ago

    It depends on your goals and your use case.

    • Do you want the most performance per dollar? You will never touch what the big datacenters can achieve.

    • Do you want privacy? Buy the hardware yourself.

    • Do you want quality output? Go to the online providers or expect to pay more to build it yourself.

    I am actively trying to work on non-Nvidia hardware because I’m a techno-masochist. It’s very much an uphill battle, especially at the cutting edge, because people are building for CUDA first.

    I can do amazing image generation on a 7900 XTX with 24GB of VRAM. One of those is under $900 in the US, which is great. A 3090 would probably be easier to work with and is more expensive, even though it’s less performant hardware.
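
    For what it’s worth, the ROCm builds of PyTorch expose the GPU through the regular torch.cuda API, which is why a lot of CUDA-targeted Python code runs unmodified on a 7900 XTX. A quick sanity-check sketch, assuming a ROCm build of PyTorch is installed:

    ```python
    import torch

    # On ROCm builds the card still shows up as a "cuda" device,
    # so diffusers/transformers code written for NVIDIA usually just works.
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("HIP (ROCm) build:", torch.version.hip is not None)
    else:
        print("No GPU visible; falling back to CPU")
    ```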

    • rkd@sh.itjust.works (OP) · 22 hours ago

      I believe right now it’s also valid to ditch NVIDIA given a certain budget. Let’s see what can be done with large unified memory; maybe things will be different by the end of the year.
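
      Back-of-envelope, weight memory is roughly parameter count times bits per weight divided by 8, plus some headroom for the KV cache and activations, which is what makes large unified memory interesting for the bigger models. A rough sketch (the 20% overhead figure is just a guess, not a measurement):

      ```python
      def est_memory_gb(params_billion: float, bits_per_weight: float, overhead: float = 0.20) -> float:
          """Very rough memory estimate for running an LLM locally."""
          weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8-bit ~= 1 GB
          return weights_gb * (1 + overhead)                 # headroom for KV cache etc.

      # e.g. a ~120B-parameter model at ~4-bit quantization
      print(f"{est_memory_gb(120, 4):.0f} GB")  # ~72 GB, in line with the 80GB figure above
      ```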