• squaresinger@lemmy.world
    17 hours ago

    Even large projects rarely have hundreds of GB of code. They might have hundreds of gigs of artifacts and history, but not all of that needs to be backed up. That’s where tiered backup strategies come into play.

    Code (or whatever else is the most painful to recover) is backed up in e.g. git, with version history and in many different locations.

    Artifacts either don’t need a backup at all, or maybe one copy. If they get lost, they can be rebuilt.

    Temporary stuff like build caches doesn’t need backups.

    You don’t even need to back up the VMs. Backing up a setup script is enough. Sure, all of this is more complicated than just backing up your whole cloud storage space, but it also requires orders of magnitude less storage.
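
A minimal sketch of the tiers described above, assuming a made-up directory layout. The prefixes, the `Tier` names and the copy counts are all illustrative, not taken from any real project or tool.

```python
from enum import Enum


class Tier(Enum):
    CRITICAL = "versioned, several locations"  # code, setup scripts
    REBUILDABLE = "at most one copy"           # build artifacts
    DISPOSABLE = "no backup"                   # caches, VM images


# Hypothetical prefix-to-tier mapping; adjust to your own layout.
TIER_BY_PREFIX = {
    "src/": Tier.CRITICAL,
    "infra/setup/": Tier.CRITICAL,
    "artifacts/": Tier.REBUILDABLE,
    "cache/": Tier.DISPOSABLE,
    "vm-images/": Tier.DISPOSABLE,
}

# Assumed number of backup copies kept per tier.
COPIES = {Tier.CRITICAL: 3, Tier.REBUILDABLE: 1, Tier.DISPOSABLE: 0}


def classify(path: str) -> Tier:
    """Return the backup tier for a path, based on its prefix."""
    for prefix, tier in TIER_BY_PREFIX.items():
        if path.startswith(prefix):
            return tier
    # Unknown data is treated as painful to lose until someone decides otherwise.
    return Tier.CRITICAL


def backup_storage_gb(sizes_gb: dict[str, float]) -> float:
    """Total backup storage needed: size times copies, summed over all paths."""
    return sum(size * COPIES[classify(path)] for path, size in sizes_gb.items())
```

Feeding `backup_storage_gb` the per-directory sizes of a project and comparing the result with the plain sum of those sizes shows how much a tiered plan saves over copying everything.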

    • Glitchvid@lemmy.world
      17 hours ago

      In this guy’s specific case, it may be financially feasible to back up onto other cloud solutions, for the reasons you stated.

      However, public cloud is used for a ton of different things. If you have 4TiB of data in Glacier, you will be paying through the absolute nose pulling that data down into another cloud; highway robbery prices.

      Further, as soon as you talk about something more than just code (say: UGC, assets, databases) the amount of data needing to be “egressed” from the cloud balloons, as does the price.
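
For a rough estimate of what pulling data out would cost, a minimal sketch. Both per-GB rates are inputs you would take from your provider's current pricing page, and the function names are made up for illustration. Note that providers usually bill per decimal GB, while the figure above is in binary TiB.

```python
def tib_to_gb(size_tib: float) -> float:
    """Convert binary tebibytes (2**40 bytes) to decimal gigabytes (10**9 bytes)."""
    return size_tib * 1024**4 / 1000**3


def transfer_cost(size_tib: float, retrieval_per_gb: float, egress_per_gb: float) -> float:
    """Estimate the one-off cost of moving data out of a cloud.

    retrieval_per_gb: assumed fee for restoring data from cold storage.
    egress_per_gb: assumed fee for sending data out of the provider's network.
    Request fees, minimum storage durations and tiered discounts are ignored.
    """
    return tib_to_gb(size_tib) * (retrieval_per_gb + egress_per_gb)
```

The cost scales linearly with size, so once assets or databases push the total from a few TiB into the hundreds, the bill grows by the same factor.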

      • squaresinger@lemmy.world
        16 hours ago

        Retrofitting stuff is of course difficult. If it’s done from the beginning, it isn’t that difficult or expensive.

        4TB isn’t that much. That’s small enough that it can fit in a cold backup on a hard drive or two.

        • Glitchvid@lemmy.world
          16 hours ago

          Multi-cloud is far from trivial, which is why most companies… don’t.

          Even if you are multi-cloud, you will be egressing data from one platform to another and racking up large bills (imagine putting CloudFront in front of a GCS endpoint lmao), so you are incentivized to stick to a single platform. I don’t blame anyone for being single-cloud, given the barriers the providers put up and how difficult maintaining your own infrastructure is.

          Once you get large enough to afford tape libraries, then yeah, having your own offsite for large backups makes a lot of sense. Otherwise, the convenience and reliability (when AWS isn’t nuking your account) of managed storage is hard to beat: cold HDDs are not great, and M-DISC is pricey.