• brucethemoose@lemmy.world · 4 hours ago (edited)

    Most aren’t really running DeepSeek locally. What ollama advertises (and basically lies about) are the now-obsolete Qwen 2.5 distillations.

    …I mean, some are, but it’s exclusively lunatics with EPYC homelab servers, heh. And they are not using ollama.
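    For a sense of scale, here’s a back-of-the-envelope Python sketch (the ~4.5 bits per weight is an assumption, roughly a Q4 K-quant) comparing the real DeepSeek-R1 against the Qwen 2.5 distills ollama actually serves:

    ```python
    # Rough weight-only memory math; ignores KV cache and runtime overhead.
    BITS_PER_WEIGHT = 4.5  # assumed ~Q4_K_M-class quantization

    models = {
        "DeepSeek-R1 (the real 671B MoE)": 671e9,
        "DeepSeek-R1-Distill-Qwen-32B": 32e9,
        "DeepSeek-R1-Distill-Qwen-7B": 7e9,
    }

    for name, params in models.items():
        gib = params * BITS_PER_WEIGHT / 8 / 2**30
        print(f"{name}: ~{gib:.0f} GiB of weights")
    ```

    That ~350 GiB of weights for the full model is why it’s EPYC-homelab territory.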

    • DandomRude@lemmy.world · 3 hours ago

      Thx for clarifying.

      I once tried a distilled community version from Hugging Face, which worked quite well even on modest hardware. But that was a while ago. Unfortunately, I haven’t had much time to look into this stuff lately, but I’ve been meaning to check it out again at some point.
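      From memory, the setup was roughly like this with llama-cpp-python (the repo id and filename below are placeholders, swap in whichever community GGUF you actually use):

      ```python
      from huggingface_hub import hf_hub_download
      from llama_cpp import Llama

      # Hypothetical distill GGUF; substitute a real repo/file from Hugging Face.
      model_path = hf_hub_download(
          repo_id="someuser/DeepSeek-R1-Distill-Qwen-7B-GGUF",
          filename="DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf",
      )

      llm = Llama(
          model_path=model_path,
          n_ctx=4096,      # small context keeps the KV cache modest
          n_gpu_layers=0,  # CPU-only; raise this if you have spare VRAM
      )

      print(llm("Hello!", max_tokens=64)["choices"][0]["text"])
      ```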

      • brucethemoose@lemmy.world · 1 hour ago (edited)

        Also, I’m a quant cooker myself. Say the word, and I can upload an IK quant tailored specifically to your hardware and aims.
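        The sizing math behind "tailored" is simple enough to sketch; the overhead figure here is a hand-wavy assumption covering KV cache and context:

        ```python
        def max_bpw(params: float, budget_gib: float, overhead_gib: float = 4.0) -> float:
            """Highest bits-per-weight a quant can use and still fit the memory budget."""
            usable_bits = (budget_gib - overhead_gib) * 2**30 * 8
            return usable_bits / params

        # e.g. a 32B distill on a 24 GiB GPU:
        print(f"{max_bpw(32e9, 24):.1f} bpw")  # ~5.4 -> a Q5-class quant fits
        ```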

      • brucethemoose@lemmy.world · 1 hour ago

        You can run GLM Air on pretty much any gaming desktop with 48GB+ of RAM. Check out ubergarm’s ik_llama.cpp quants on Hugging Face; that’s the state of the art right now.
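        Fetching one is a one-liner with huggingface_hub (the repo id and quant pattern below are guesses, check ubergarm’s actual page), then you point ik_llama.cpp’s server at the downloaded GGUF:

        ```python
        from huggingface_hub import snapshot_download

        # Hypothetical repo id / quant name; browse ubergarm's profile for the real ones.
        local_dir = snapshot_download(
            repo_id="ubergarm/GLM-4.5-Air-GGUF",
            allow_patterns=["*IQ4*"],
        )
        print("GGUF downloaded to:", local_dir)
        ```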