Sarah Silverman, Christopher Golden, and Richard Kadrey are suing OpenAI and Meta for allegedly infringing the copyright on their books. The trio says their works were pulled from illegal “shadow libraries” without their consent.
Good. Artists should get paid extra for AIs being trained on their stuff. Doing it for free is our job.
But there’s no evidence, in this case anyway, that it was trained on the entire book(s). Multiple summaries of the authors’ works are publicly available on various sites, and GPT is capable of amalgamating all of them into a summary of its own.
Now if you asked it to reproduce an entire book, or say some random non-free chapter or excerpt exactly word-for-word, that would be an issue, but so far I haven’t seen any evidence that it was able to do so.
That’ll come out during the case. I assume they have evidence, otherwise suing would be a waste of time. Unless some lawyer is taking them for a ride.
You do only need 51% certainty to win in civil court, though, so maybe they think they can just argue it? Still though, I’d want some sound evidence before going to court. Unless it’s just a SLAPP-style suit, but that doesn’t really fit.
That’s an incredibly bold assertion.
Do you never make those?
It’s simple: if you have to pay “copyright holders” for anything you train your AI on, there can be no AI training. They need to ingest all the data they can to become better, and it would cost dozens of billions if you had to pay for every single piece of content. So we have to pick between a future in which “copyright holders” fight to get their $ or a future where we can push AI to enter a new era.
I really don’t think it should be considered copyright infringement to simply ingest data. It doesn’t infringe on copyright for a person to read a book, why should it matter if it’s a simple program or an AI doing it?
That said, if the AI produces something in the exact words or style of a creator without attribution, just like with a person, then it should count as copyright infringement.
It’s all about perceived harm. A creator is not harmed by an AI reading their works. But they are harmed when the AI can produce their style to potentially take business away from them.
If it isn’t copyright infringement to read a book and apply those ideas to make a product (which it isn’t), then it isn’t copyright infringement to train an LLM with the info in a piece of media.
Pretty cut and dried.
But if you pirated a book to read it, then applied those ideas to make a product, you still committed a crime.
The only good outcome is if copyright is asymmetrical and unfair to big companies. It destroys human culture if Disney sues everybody every time they hum 2 seconds of a cartoon song. It also destroys human culture if, every time somebody posts something for free on the internet, a deranged billionaire pops up and gloats about how he’s going to bury your post at the bottom of Google and copy your answer into his database and use it to scam $100/month out of everybody you were trying to help for free.
Sarah Silverman is going to lose a suit. News at 11. Scraping is protected. This is settled law.
If you read the article, it called out that this is not protected by law. They are claiming OpenAI got access to her books and works through sites that had illegally obtained them.
This is not covered by previous rulings around scraping.
This once again shows how stupid the idea of copyright is.
The Library Genesis project mentioned is such a great idea and I use it extensively. It makes scientific articles, papers and cultural works available for everyone, regardless of income.
I understand where these writers are coming from but in my opinion they are working against their own interests here.
I mean, currently most of the profits go to the big media corporations anyway. Are we sure there isn’t a way to fairly pay the artists AND make their works publicly available at the same time?