cross-posted from: https://europe.pub/post/7719730
cross-posted from: https://europe.pub/post/7719728
Here it is: https://annas-archive.org/blog/backing-up-spotify.html
Said library contains petabytes of the exact text of each and every piece of literature.
Said model contains gigabytes of weights that can never be turned back into the exact words of the book.
It’s not strange at all. It’s degrees of compression. You compress a JPEG to the point that it’s unrecognizable, and it’s no longer breaking copyright. It’s essentially like trying to write a book you just read based on memory.
Lol Meta literally torrented 81 TB of data from the site. Stop with this “degrees of compression” bs
And yet, the tech bros do have access to the exact words. The only difference is that they don’t share, instead choosing to extract value from it by training an LLM and (eventually, hypothetically) turn a profit. The product is created by processing the intellectual labor of billions of people into a formless amalgam of human creativity, which is then exploited for their private benefit.