Japan determines copyright doesn't apply to LLM/ML training data

ericjmorey@programming.dev · 2 years ago

ZickZack@fedia.io · 2 years ago

train one with all the Nintendo leaks

This is fine

generate some Zelda art and a new Mario title

This is copyright infringement.

The ruling in Japan (and, as I predict, in other countries too) is that the act of training a model (which is just a statistical estimator) is not something copyright restricts, so it cannot be copyright infringement. This is already standard practice for everything else: you cannot copyright a mathematical function, regardless of how much data you use to fit it. (That is sensible: CERN has fit physics models to petabytes' worth of data, but that doesn't mean they hold a copyright on the laws of nature; they just hold the copyright on the data itself.) However, if you generate something that is copyrighted, that item is still copyrighted: it doesn't matter whether you used an AI image generator, Photoshop, or a tattoo gun.
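
To make the "statistical estimator" point concrete, here is a rough sketch, not anything taken from the ruling. It fits a straight line to synthetic data by ordinary least squares. The helper names (`fit_line`, `make_data`) and the underlying line `y = 2x + 1` are made up for illustration. Real models have far more parameters, but the idea is the same: the fit is a set of numbers estimated from the data, not a copy of the data.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns the pair (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def make_data(n, seed=0):
    """Hypothetical dataset: noisy samples of the line y = 2x + 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


# Fit the same function to a small and a large dataset.
small = fit_line(*make_data(100))
large = fit_line(*make_data(100_000))

# Either way the result is just two coefficients, however much data went in.
print(len(small), small)
print(len(large), large)
```

Feeding in a thousand times more data only sharpens the two estimates; the fitted function stays the same size.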

Taggart :donor: (@mttaggart)