edit: adjusted title slightly

  • Lojcs@lemm.ee
    English
    arrow-up
    136
    arrow-down
    3
    ·
    1 month ago

    …Google started adding links to archived websites in the Wayback Machine

    They better be compensating it…

    • SirEDCaLot@lemmy.today
      English
      arrow-up
      70
      arrow-down
      10
      ·
      1 month ago

      I don’t agree. Free linking has always been a vitally important part of the open internet. The principle that if I make something available on a specific URL, others can access it, and I don’t get to charge others for linking to a public URL is one of the core concepts of the internet itself.

      • AlligatorBlizzard@sh.itjust.works
        English
        arrow-up
        153
        arrow-down
        1
        ·
        1 month ago

        Google killed off their own cached pages last month and they’re now using IA as a replacement. Free linking is definitely important, but this is Google we’re talking about, and them using IA to save money - this feels a lot more exploitative if Google isn’t funding them in some way.

        • Crackhappy@lemmy.world
          English
          arrow-up
          75
          ·
          1 month ago

          I think you’re both right. Anyone should be able to link to an IA page, but Google basically was doing the same thing as IA with their cached pages. Now they’ve gotten rid of that service and are simply relying on IA to take all of the load that they had. I think they should help fund IA to compensate for the extra load.

          • Beej Jorgensen@lemmy.sdf.org
            English
            arrow-up
            21
            arrow-down
            1
            ·
            1 month ago

            I agree they should. But I also agree they shouldn’t be required to. And if they don’t, that we should just live with it as the lesser of two evils.

            • RyeBread@feddit.org
              English
              arrow-up
              1
              ·
              1 month ago

              I would argue regulation should come with (and typically be proportional to) scale. Google as an organization operates at an enormous scale. The number of links replaced with IA links will be large. The operational cost transferred to another organization is large enough that the move is obviously worth it to Google. The sheer scale of everything and everyone involved should require Google to pay Internet Archive. In a decent world, that is…

              • Beej Jorgensen@lemmy.sdf.org
                English
                arrow-up
                1
                ·
                1 month ago

                I don’t entirely disagree, but I think defining much of that in effective legal terms is going to be virtually impossible. And I’m super-wary of anything that says someone can’t link to something.

        • SirEDCaLot@lemmy.today
          English
          arrow-up
          3
          arrow-down
          1
          ·
          1 month ago

          I had not realized that. They should absolutely be allowed to do it, but it’s super shitty of them to basically offload that cost onto IA. IA of course would be well within their rights to try and monetize it. Look at incoming traffic that deep links a cached page and has a Google.com referrer, and throw a splash page or top banner asking for donation.
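The referrer check suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `should_show_donation_banner` is hypothetical, the `/web/` path prefix is assumed to mark a deep link to an archived snapshot, and only `google.com` hosts are matched (country domains such as `google.co.uk` would need to be added). Nothing here reflects how IA actually handles traffic.

```python
from typing import Optional
from urllib.parse import urlparse


def should_show_donation_banner(referer: Optional[str], path: str) -> bool:
    """Return True for deep links to a snapshot that arrive from Google.

    `referer` is the raw Referer request header (it may be missing), and
    `path` is the requested URL path. Both names are illustrative.
    """
    if not referer:
        # Browsers and privacy tools often strip the header entirely.
        return False

    host = (urlparse(referer).hostname or "").lower()
    # Match google.com and its subdomains, but not look-alike hosts
    # such as notgoogle.com.
    from_google = host == "google.com" or host.endswith(".google.com")

    # Assumption: snapshot pages live under /web/<timestamp>/<url>.
    is_snapshot = path.startswith("/web/") and len(path) > len("/web/")

    return from_google and is_snapshot
```

A request handler could call this once per page view and add a banner to the response when it returns `True`. Since the Referer header is optional and easy to suppress, a check like this would only ever catch part of the traffic.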

      • sugar_in_your_tea@sh.itjust.works
        English
        arrow-up
        26
        arrow-down
        5
        ·
        1 month ago

        There’s a difference between your average Joe linking something and a massive tech company linking something. The first should always be allowed; the second should come with an expectation of some form of compensation. That’s why lots of services have different licensing terms: if you’re using something commercially, you pay a different rate than if you’re using it privately.

        That said, this is on IA to enforce, and I believe they should.

        • SirEDCaLot@lemmy.today
          English
          arrow-up
          9
          arrow-down
          4
          ·
          1 month ago

          Strong disagree. If I make a website people like, and Google links to it, should Google have to pay me? If so, Google basically can’t exist. The record keeping of tracking every single little website that they owe money to or have to negotiate deals with would be untenable. And what happens if a large tech journal like CNET or ZDNet links to the website of a company they are writing an article about? Do they have to pay for that? Is the payment assumed by publicity? Is it different if they link to a deep page versus the front page?

          What you are talking about opens up a gigantic can of worms that there is no easy solution to, if there is any solution at all.

          I will absolutely give you that what Google is doing is shitty. If Google is basically outsourcing their cache to IA, they should be paying IA for the additional traffic and server load. But I think that ‘should’ falls in line with being a good internet citizen treating a non-profit fairly, not part of any actual requirement.

          • sugar_in_your_tea@sh.itjust.works
            English
            arrow-up
            9
            arrow-down
            1
            ·
            1 month ago

            What you are talking about opens up a gigantic can of worms that there is no easy solution to, if there is any solution at all.

            It might if I was suggesting any kind of legislative solution here. I’m not. I’m merely saying that IA should be more selective about how it can be accessed.

            For example, if a journalist is doing a piece about how websites secretly change content, I think it’s entirely reasonable for them to pay for accessing IA for the purposes of that article, because it’s directly related to a commercial endeavor. However, I don’t expect random internet users to pay for access to that same information, because it’s not related to a commercial endeavor.

            In general, you should pay for content that you’re going to use commercially.

            If Google is basically outsourcing their cache to IA, they should be paying IA for the additional traffic and server load.

            And that’s precisely what I’m saying. I’m also taking it a step further and suggesting that IA should be on top of it so companies like Google (who are profiting from their service) pay, while regular internet users don’t.

            • In general, you should pay for content that you’re going to use commercially

              Sure, but merely linking to a page isn’t reusing the content. If said content was being embedded, rehashed or otherwise shown, then compensation would be fair. But merely linking to a page should absolutely be free. That’s a massively important cornerstone of the internet that shouldn’t be compromised on.

              Linking directs traffic, which can be monetized by the website itself; it shouldn’t require additional fees on top.

              • sugar_in_your_tea@sh.itjust.works
                English
                arrow-up
                2
                ·
                1 month ago

                There’s a difference between primary content like a website, and secondary content like a cache of a page. I think services doing the latter should be a bit more aggressive about charging fees for commercial entities linking to them, since they’re providing a service separate from the primary source.

      • Avid Amoeba@lemmy.ca
        English
        arrow-up
        2
        ·
        edit-2
        1 month ago

        This view is a bit naive in that it doesn’t take into account a lot of variables. It favors established large actors in their ability to extract and accumulate ever more value from the ones they link.

        • SirEDCaLot@lemmy.today
          English
          arrow-up
          1
          arrow-down
          1
          ·
          1 month ago

          And, with respect, this view is more naive (IMHO) because it’s focused on the size of the company, and you can’t do that. You can’t have one set of laws for small companies and another set of laws for large companies.

          So if Google has to pay to link to IA, then so does DuckDuckGo and any other small upstart search engine that might want to make a ‘wayback machine this site!’ button.

          Google unquestionably gets value from the sites they link to. But if that value must be paid, then every other search engine has to pay it also, including little ones like DDG. That basically kills search engines as a concept, because they simply can’t work on that model.

          Thus I think your view is more naive, because you’re just trying to stick it to Google rather than considering the full range of effects your policy would have.

          • Avid Amoeba@lemmy.ca
            English
            arrow-up
            2
            ·
            1 month ago

            You can’t have one set of laws for small companies and another set of laws for large companies.

            This is false. We can, and we do. Antitrust laws are one example off the top of my head. There are probably others. The assumption that every actor has to pay the same price is false as well. There are countless examples of this.

            • SirEDCaLot@lemmy.today
              English
              arrow-up
              1
              ·
              1 month ago

              Antitrust laws prevent companies from acting in a way to squeeze off competition. Small companies are also prevented from squeezing off competition. Anticompetitive practices are illegal regardless of your size.

              • Avid Amoeba@lemmy.ca
                English
                arrow-up
                1
                ·
                1 month ago

                That’s funny, but I’m not gonna argue on it. It’s easier to give another example. If you want to get informed, try finding laws that depend on firm size, and be convinced if you find them.

  • abofim@discuss.tchncs.de
    English
    arrow-up
    67
    ·
    1 month ago

    OP forgot to mention that it is back in a “provisional, read-only manner,” according to founder Brewster Kahle.

  • Flying Squid@lemmy.world
    English
    arrow-up
    52
    arrow-down
    1
    ·
    1 month ago

    I really hope the rest of the archive comes back soon. I was in the middle of a book and it was a book I hadn’t read since I was a kid.

    Yeah, I could pay for it or wait for it to come via interlibrary loan (it’s not exactly a well-known book), but I really didn’t need a physical copy. And it isn’t even all that long.

    Sigh.

  • Snapz@lemmy.world
    English
    arrow-up
    45
    arrow-down
    1
    ·
    1 month ago

    Capitalism hates a memory. Hates/fears anything it can’t update, whitewash or otherwise directly control or obscure after the fact.

    If humanity had any hope, we’d surround this thing with torches to defend it tooth and nail.

    • TheLugal@lemmy.world
      English
      arrow-up
      68
      ·
      1 month ago

      To you as a user it’s read-only. To the thousands who submit URLs for archival, it is read-write.

    • antonim@lemmy.dbzer0.com
      English
      arrow-up
      15
      ·
      1 month ago

      You can (well, could) put in any live URL there and IA would take a snapshot of the current page on your request. They also actively crawl the web and take new snapshots on their own. All of that counts as ‘writing’ to the database.
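For readers unfamiliar with how snapshots are addressed, here is a minimal sketch that only builds URLs and makes no network requests. It assumes the publicly visible Wayback Machine URL patterns (`/web/<timestamp>/<url>` for viewing a snapshot and `/save/<url>` for requesting one); the helper names `snapshot_url` and `save_url` are hypothetical, and the save endpoint was presumably unavailable while the service was read-only.

```python
from datetime import datetime

WAYBACK_BASE = "https://web.archive.org"


def snapshot_url(target: str, when: datetime) -> str:
    """Build a link to the snapshot of `target` nearest to `when`.

    Timestamps in Wayback URLs use the form YYYYMMDDhhmmss.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_BASE}/web/{timestamp}/{target}"


def save_url(target: str) -> str:
    """Build the URL that asks for a fresh snapshot of `target`.

    Assumption: visiting this URL triggers a capture when saving is enabled.
    """
    return f"{WAYBACK_BASE}/save/{target}"


if __name__ == "__main__":
    print(snapshot_url("https://example.com/", datetime(2024, 10, 1, 12, 0, 0)))
    print(save_url("https://example.com/"))
```

The first kind of URL is a plain read; the second is the "write" described above, which is why a read-only restoration leaves existing snapshots browsable but blocks new captures.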

      • SkaveRat@discuss.tchncs.de
        English
        arrow-up
        6
        ·
        1 month ago

        Not just websites. Basically any digital media. From PDFs, book scans, manuals, floppy disks, CDs, basically anything even remotely worth archiving

        • antonim@lemmy.dbzer0.com
          English
          arrow-up
          2
          ·
          1 month ago

          Yep, but I didn’t mention that because it’s not a part of the “Wayback Machine”, it’s just the general “Internet Archive” business of archiving media, which is for now still completely unavailable. (I’ve uploaded dozens of public-domain books there myself, and I’m really missing it…)

  • dread@lemmy.world
    English
    arrow-up
    25
    ·
    edit-2
    1 month ago

    What’s frustrating is that the ones who claimed to have done this are self-proclaimed “hacktivists”. You’re stupid if you think the Internet Archive is the enemy in this day and age.

      • misk@sopuli.xyzOP
        English
        arrow-up
        19
        ·
        1 month ago

        Some anonymous group claimed it was an attack on the USA for supporting ethnic cleansing in Palestine. This is why they did something that benefited Disney and Nintendo. Makes perfect sense!

        • misk@sopuli.xyzOP
          English
          arrow-up
          15
          arrow-down
          1
          ·
          edit-2
          1 month ago

          IA hosts TONS of user uploaded content. They’re not uploading those Gameboy ROMs themselves.

        • kautau@lemmy.world
          English
          arrow-up
          8
          ·
          1 month ago

          The Wayback Machine is a crawler, which is a big part of what they do but not everything. The Wayback Machine crawls its own pages, but you can also submit URLs to be crawled.

          The other part of what they do is hosting a significant number of digital archives of media that is no longer sold / in print / distributed. Much of that content is user uploaded. Like “oh hey I found this old clip art cd from the early 90s. I don’t really have a use for it, but if this doesn’t get uploaded somewhere it’s probably going to be lost to time. I’ll submit it to the internet archives.”

        • pmc@lemmy.blahaj.zone
          English
          arrow-up
          2
          ·
          1 month ago

          They do some crawling themselves, but Archive Team (a third party group) does a lot of web archiving as well.

    • pmc@lemmy.blahaj.zone
      English
      arrow-up
      2
      ·
      1 month ago

      My most frequent use case of the IA in general is the Cover Art Archive, and I frequently upload cover art for albums to the CAA via MusicBrainz. That’s how I discovered the IA was down, when an upload failed.