cross-posted from: https://lemmy.today/post/35487250

I’m looking at self-hosting SearXNG. I have an old Win 11 machine and figure this might be the only way it can be useful.

Two questions I haven’t seen answered so far:

  1. I would be hosting on my own home network, which is on a VPN 24/7, but for added privacy my devices are sometimes on VPN connections to other IPs. So I need to know the external IP of the instance to be able to find it. Are there any added measures I should put in place to prevent randoms looking at IPs or port scanning from finding the instance and going to town?

  2. If this is on my home network anyway, are there any risks of data leaking or triangulation of, say, referrals or image searches that would just point back to my home network?

My threat model is for big tech to leave me alone, so it’s not exactly huge stakes, but I also don’t want to bother self-hosting if added complexity makes it not worth it.

  • JASN_DE@feddit.org
    6 days ago

    Needs more details. With that convoluted VPN setup it might work or not, depending on the actual implementation.

    Personally I don’t expose my SearXNG instance to the open net.

    • hansolo@lemmy.todayOP
      6 days ago

      What other details are helpful to provide?

      The home network has a VPN running at the router level, so everything in the house is on the same local WLAN (i.e. LocalSend works between devices). But that’s also where all my “Hello bank! Hello Work! Hello paid streaming service and Meta!” activity happens. Other family members are a limiting factor on this.

      Does it make more sense to just run docker locally on my machine and use that as the self-hosting location? Seems like a bit much, but I agree that I don’t really want to expose it to the open internet without…I don’t know, something like just having some password in my password manager. That seems tolerable at least.

      • JASN_DE@feddit.org
        6 days ago

        The router VPN usually isn’t the issue, as most devices behind it can communicate with each other. What would be the issue with running the service on that old machine and connecting locally, and via VPN while away from home? Would there be anything in your setup that won’t work like that?

        • hansolo@lemmy.todayOP
          6 days ago

          Not really, but I think it’s more about whether the effort is worth it overall vs. just cycling a few public instances. I think I might end up going for that option instead.

  • MalReynolds@piefed.social
    6 days ago

    I have a similar setup on my laptop, a docker searxng (well podman rootless, but near enough) locked into a gluetun instance. Works fine, simple to set up, sucks less than any individual search engine and is usefully configurable, but I’m on Linux, I expect there’s more pain for Windows (Linux might be a use for the spare computer…). It’s not resource intensive. Gluetun lets you expose a local port for searxng and you just point your browser at https://192.168.x.x:8192 or whatever, no need to worry about exit IP. Gluetun is well used and has a focus on avoiding leaks, plenty of eyes on the code, I’ve never had any problems.

    You can wireguard (or tailscale or whatever) into your home network and use it on your phone too. Spin up a pihole for adblock while you’re at it. I say go for it…

    • MysteriousSophon21@lemmy.world
      3 days ago

      Tailscale has been a game changer for my self-hosted stuff - zero port forwarding headaches and I can access my searxng instance from anywhere without exposing it to the internet (works great with my audiobookshelf server too, been using the soundleaf app to stream my books on the go).

    • hansolo@lemmy.todayOP
      6 days ago

      Oh, I wouldn’t know about that terrible Windows thing (more than I must). A deb distro is my daily driver, so I was thinking about doing what you’re saying as a more portable alternative that moves easily with any VPN location.

      How resource-intensive is your setup?

      • MalReynolds@piefed.social
        6 days ago

        Cool, you said ‘old Win 11 machine’, so I assumed. It’s really not intensive, you’d never know it’s there. I guess it’s a VPN, a few curls, some stats and a light webserver?

        • hansolo@lemmy.todayOP
          6 days ago

          Oh yeah, sorry, that’s just the old paperweight I had lying around, not my baby.

  • pontiffkitchen0@lemmy.world
    6 days ago

    (Not an expert) hosting your own instance will make you more identifiable to big tech than if you used a public instance, but it would still increase your privacy compared to giving everything to them, and also prevent you from giving a public instance your data. I currently use “priv.au” but do plan on hosting my own in the near future. Some people who host their own instance even intentionally open it up to the public to crowd source more data points so that their traffic blends in better (not saying I recommend that though).

    Tldr: it should still be worth it

    In regards to connecting, you should still be able to hop from other VPNs to your home network, just keep in mind that you will get higher latency jumping from their VPN network back to yours. I don’t recommend opening it up publicly just to do that, unless you plan on going all in and putting something in front of it like “fail2ban” and “Anubis”. Another option is looking into “tailscale”, and if you don’t trust their central server you can self-host with “headscale” or use a different but adjacent product, “pangolin”. These products basically let you create your own VPN that spans multiple networks.

    • hansolo@lemmy.todayOP
      6 days ago

      Thanks, this is helpful. It sounds like maybe cycling a few known public instances makes more sense for me personally. The inherent MITM aspect always kind of creeped me out, but the results are pretty good, so I always come back to it.

      My only thought on a way to easily have it open internet-facing and still not get overwhelmed would be to put it all behind a bare bones login page with super long credentials and rate limiting and I just save the credentials in a password manager. But if it’s just going to bring Big G looking back at me, I’d rather not bother since that’s the thing I’m trying to avoid.

      Thanks again - this is a huge help.
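
The rate-limiting part of the login-page idea above can be sketched as a token bucket, in Python. This is only an illustration under assumptions: the class name `TokenBucket`, the helper `allow_login`, and the default limits are all made up here, and in a real setup the limiting would more likely live in a reverse proxy or a tool like fail2ban than in hand-written code.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` attempts, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True and consume one token if an attempt is allowed right now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per client address (hypothetical helper, limits chosen arbitrarily).
_buckets = {}


def allow_login(client_ip, capacity=5, rate=0.1):
    """Return True if this client may attempt a login now."""
    bucket = _buckets.get(client_ip)
    if bucket is None:
        bucket = _buckets[client_ip] = TokenBucket(capacity, rate)
    return bucket.allow()
```

With these example defaults a client gets five quick attempts and then roughly one more every ten seconds, which keeps guessing slow without locking out a password manager that gets it right the first time.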

      • pontiffkitchen0@lemmy.world
        6 days ago

        No problem!

        I completely know what you mean, it took a lot of research before I felt comfortable enough to trust a public instance.

        So that solution would still decrease their ability to fingerprint you by a lot, but really the big problem would be all the people/scripts randomly hammering your IP. They wouldn’t get past your password, but it being public and discoverable would mean you’d constantly be getting hit with a bunch of automation scanning your ports. The security risk isn’t the concern; it’s more the heavy traffic from them slowing down your connection. It sounds like you’d be fine from a security standpoint, but you’d have to put up something to block the traffic.

        You could always self host, use that when you’re at home or connected to home through VPN and use it for more personal searches, and then use public instances when you’re connected to other vpns for more general or vague searches. Mixing and matching like that will at least add some noise and make you less identifiable. Kind of best of both worlds.

        • Novocirab@feddit.org
          6 days ago

          As a semi-simple compromise it would be cool if there was some way to have the cycling between different Searx instances be done automatically. E.g. either as a browser feature/browser extension, or as some private self-hosted interface to which I send my requests and which then selects the server at random from some subset of the list on searx.space. Or, while a bit hacky, the easiest way could be to do this on the DNS level. Should be doable with just one or two existing tools, with standard tools even.
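
A minimal sketch of the "pick a server at random from a subset" idea above, in Python. The instance URLs are placeholders, not real recommendations; in practice the list would be a hand-picked subset from searx.space. The function names `pick_instance` and `build_search_url` are made up for illustration, and the sketch assumes each instance serves searches at `/search?q=...`.

```python
import random
from urllib.parse import urlencode

# Placeholder hosts for illustration only; replace with instances you trust.
INSTANCES = [
    "https://searx.example.org",
    "https://search.example.net",
    "https://sx.example.com",
]


def pick_instance(instances, rng=random):
    """Return one instance URL chosen uniformly at random."""
    if not instances:
        raise ValueError("instance list is empty")
    return rng.choice(instances)


def build_search_url(query, instances=INSTANCES, rng=random):
    """Build a search URL against a randomly chosen instance."""
    base = pick_instance(instances, rng).rstrip("/")
    # urlencode handles spaces and special characters in the query.
    return f"{base}/search?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(build_search_url("self-hosted search"))
```

A small local redirector or a browser extension could call something like `build_search_url` for each query, so that no single public instance sees the whole search history.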

  • HelloRoot@lemy.lol
    6 days ago

    I have my searxng instance open to the internet with crowdsec in front of it and I had 0 visitors (bot or human) except for myself since the beginning of the year.

    Wildcard cert and DNS, so the instance is on an unknown subdomain.