• 0 Posts
  • 5 Comments
Joined 1 year ago
Cake day: August 6th, 2023

  • That’s the solution I take. I use Proxmox for a Windows VM that runs Ollama. That VM can also be used for gaming on the off chance an LLM isn’t loaded; it usually is. I use only one 3090 because of the power draw of my two servers on top of my [many] HDDs. The extra load of a second card isn’t something I want to worry about.

    I point to that machine through LiteLLM*, which is then accessed through nginx, which allows only local IPs. Those two run in a different VM that hosts most of my Docker containers.

    *I found that using Ollama and Open WebUI together causes the model to get unloaded, since they send slightly different calls. LiteLLM reduces that variance.
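
    For anyone curious what that routing looks like from the client side, here’s a rough sketch, assuming LiteLLM is running as its OpenAI-compatible proxy behind nginx. The IP, port, API key, and model alias below are placeholders, not my actual setup; swap in whatever your own proxy uses.

    ```python
    # Minimal sketch (placeholder values, not the real config): a client on the
    # LAN talking to a LiteLLM proxy that fronts the Ollama VM. nginx in front
    # of the proxy would reject any request coming from a non-local IP.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://192.168.1.50:4000",  # hypothetical LiteLLM/nginx endpoint
        api_key="sk-local-placeholder",       # whatever key the proxy is configured with
    )

    response = client.chat.completions.create(
        model="llama3",  # model alias defined in the LiteLLM config, routed to Ollama
        messages=[{"role": "user", "content": "Hello from the LAN"}],
    )
    print(response.choices[0].message.content)
    ```

    The point of funneling everything through one OpenAI-style endpoint like this is that Ollama always sees the same shape of request, which is what keeps the model from getting unloaded between clients.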