There is no way to prove it didn’t just scrape 10 other summaries and reword them slightly. And given the nature of such language models and their limited context length, that’s actually more likely than it understanding and summarizing an entire book.
They could subpoena people who actually know how OpenAI did it.
That’s a bold assumption that OpenAI even knows. Part of the magic of how their large language model works is non-inversion. You cannot take an output and work backwards to a precise input, as the inputs are no longer present in the tokenization chain that’s formed during the learning process. This is a byproduct of all current language models AFAIK. Building in the ability to enable reversible computation would add unfathomable complexity to these types of systems.
They know the training data sources.
Not necessarily: Facebook has used a public-private-partnership with a German university to let them train the model on publicly available data, no matter the copyright status. The university is allowed to do this, since science enjoys a lot of defined rights, which rank higher than commercial copyright in Germany specifically (but I can imagine in other places as well). Facebook just received the model. This is obviously a ploy for plausible deniability and morally wrong, but it hasn’t been challenged in court yet and is believed to hold up currently. I can imagine OpenAI to be smart enough to have one or more layers of buffering between themselves and the dataset as well.
Yet, Awad and Tremblay believe their books, which are copyrighted, were unlawfully “ingested” and “used to train” ChatGPT because the chatbot generated “very accurate summaries” of the novels, according to the complaint. Sample summaries are included in the lawsuit as exhibits.
I predict they are going to lose, and lose bigly, as neither of these things would be illegal if a human did it.
I’ve noticed that if a human makes a painting in the style of a specific artist, people tend to classify that as taking inspiration. Whereas if Stable Diffusion makes a painting in the style of that same artist, artists tend to be outraged about it.
I’m not ready to make a moral judgement one way or the other, but I do notice that people seem to treat both cases differently.
I believe the difference is how old the original artworks are. Someone copying the style of Mozart is appreciating his music, but someone copying a modern day musician is just breaking copyright law.
You can be inspired by Picasso or even Bob Ross, but you are copying/tracing an artist from Tumblr.
In a way AI is just pastiche, but frowned upon.
I’ve written music for many years, mostly dance stuff. Of course I’m concerned that AI will come for my job too very soon.
I was thinking about your example… I don’t think the age of the works is a factor. It’s about similarity. For example, I decided to switch things up and start writing really savage, heavy ‘big beat’ stuff a few months ago. It’s heavily, heavily inspired by The Prodigy because they’ve been musical heroes of mine from a young age (lucky enough to have seen them live back in the day). Nobody else really writes music like that, so it’s my only reference for ‘savage heavy big beat’.
None of the 4 tracks I’ve finished sound remotely like any track by The Prodigy, but the first thing most people say when hearing it is “wow that sounds like The Prodigy”. While someone somewhere will bitch about it and say I’m not being creative enough or I should have my own sound, no-one in their right mind would say I’m breaching copyright law by writing brand new music influenced by their distinct sound that I’ve absorbed through decades of listening.
If I really wanted to, I could do the same with many artists’ sounds in a short space of time by focussing on it. Perhaps a bit ethically dubious but still not illegal.
If I trained an AI model to ‘absorb’ music from artists that I’ve purchased, and told it to spit out new music that resembled their sound without copying it, would that be illegal? It’s an interesting debate imo.
While someone somewhere will bitch about it and say I’m not being creative enough or I should have my own sound
There isn’t an artist dead or alive that doesn’t use another artist as inspiration or even “borrow” from them. I’m sure The Prodigy has bands and music that they draw on to make their stuff. You probably already know all of this but you keep doing your shit man and screw the haters.
Hell, I have to do an edit pass on my books to make sure I didn’t pull too much inspiration.
You can do what I did in my 3rd grade report on Egypt. I put quotes at the beginning of my paper, copied word for word the introduction to an “Egypt for Kids” book and closed quotes. Didn’t plagiarize, quoted all my sources…
Teacher had a parent conference the next week…
Exactly, we’re all stood on the shoulders of giants. Plus, The Prodigy are notorious and proud sample / sound / melody pirates, so I’m just following in their footsteps haha. Thanks for saying that though!
I was excited to see a mention of “Big Beat”, but saddened to see no mention of “Fatboy Slim”. :(
I can only apologise profusely, right here, right now.
To me, AI stuff needs to be disclaimed.
I like to use AI art and text generation, but just because I think it’s neat to see a wizard battling a mech in a steampunk setting.
As for people who actually publish the art, I believe things such as the fact that it’s AI, the prompt, the seed, and whatever other techno glibble glob was used to make the image or story should be listed.
AI is a tool and a lot of people are trying to power drill a nail in
Should artists be required to list off their whole toolchain for every piece of art?
Ha point taken
The way I see it though it’s much easier to plug into a prompt than actually learn the skills if that makes sense
At least with Stable Diffusion, there’s a lot more to it than just plugging in a prompt. It’s an iterative process with inpainting, outpainting, adjusting generation parameters, upscalers, etc.
Disclaimer: My banner and profile picture are AI generated.
I think it’s because it takes some talent and effort to reproduce an artist’s style. If someone just makes Stable Diffusion do it, there isn’t much effort put in on their part, so it seems like they are taking advantage.
Sure, but should legality be based on artistic effort? (Not asking you directly, just open to anyone who thinks what SD, etc. do should be illegal.)
I mean, wouldn’t it only matter if it’s being used commercially? Like, yeah, the commercialization of unedited AI art is iffy, but if we are talking about issues with copyright, wouldn’t you evaporate a huge part of the internet? Meme culture is literally stealing someone’s art and repurposing it.
Edit: as the other user pointed out, effort shouldn’t be the defining trait, because otherwise a piece I made in 2 hours would somehow be belittled just because some artist who worked in marble spent months, if not years, sculpting theirs.
Is it illegal if I read the book, write a summary and post the summary online?
I guess you’ll find out soon. Calling the FBI on you rn, lmk if they show up.
If this leads to book reviews becoming illegal, this could backfire heavily on those authors.
They will lose…
Makes sense. Google dealt with something similar years ago.