Two authors are suing OpenAI for training ChatGPT with their books. Could they win?

sabbah@lemmy.world · 1 year ago

Brad Ganley@toad.work · 1 year ago

Yeah this is a weird one. I don’t really know how the line gets drawn between training an AI and plagiarism. My gut feeling is that this feels like suing somebody for being inspired by your work or learning a new word from it.

kromem@lemmy.world · 1 year ago

There are already laws regarding producing works too similar to copyrighted material.

Production is infringement, not training.

If I feed all of Stephen King into an LLM such that it learns what well-written horror narratives look like, and it produces a story with original plot elements distinct from copyrighted works, that’s fine.

If it starts writing about killer clowns thwarted by child orgies in the sewers then you might have an infringement problem.

And ironically, the best tool for protecting copyrighted material from infringement is going to be…LLMs (acting in a discriminator role comparing indexed copy to protected works).
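To give a rough feel for the "compare output against protected works" idea, here is a toy sketch. It is not an LLM discriminator at all, just plain word n-gram overlap swapped in as a crude stand-in, and the function names, sample strings, and the choice of n are all made up for illustration.

```python
def ngrams(text, n=5):
    """Return the set of word n-grams in `text` (lowercased, whitespace-split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(candidate, protected, n=5):
    """Fraction of the candidate's n-grams that also appear in the protected text.

    Assumption: verbatim n-gram overlap is only a crude proxy for "too similar";
    it says nothing about plot, characters, or what a court would decide.
    """
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0  # candidate is shorter than n words, nothing to compare
    return len(cand & ngrams(protected, n)) / len(cand)


# Hypothetical sample strings, not taken from any real book.
protected = "one two three four five six seven"
print(overlap_ratio("one two three nine ten", protected, n=3))
```

A high ratio would flag generated text for a closer look. A real system would need something far more robust than exact word matching.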

If ‘training’ ends up successfully labeled as infringement, jurisdictions that honor that ruling are going to end up with much worse long-term outcomes than they otherwise would.

This is the long-tail masses adopting MPAA math to tally potential losses. In their efforts to protect the status quo, they are shooting themselves in the foot on laying claim to the future of the industry, and will inevitably be left out of the next round of growth.

Also, from an ‘infringement’ standpoint, it just means we’ll see fewer open models and more closed ones, which will end up using models from other jurisdictions to launder copyrighted materials into synthetic training data.

This is beyond dumb.

Flibbertigibbet@lemmy.world · 1 year ago

Yeah, I’m not sure how I feel about it… But I somehow instinctively feel that a human being “inspired” by other works is different to a neural network being trained on a novel. I don’t know that I can articulate specifically why one feels okay and the other doesn’t… But that’s how it feels to me.

eldrichhydralisk@lemmy.sdf.org · 1 year ago

Part of the problem is that AI research likes to use terminology that sounds like what people do, when that’s not what the AI actually does.

Large language models are not intelligent in any sense. They are autocomplete on steroids. This is a computer program that was fed a book someone wrote, then mathematically tweaked to be able to guess the next word in a sentence in a way that resembles that book. That’s all it does. It does not think or learn in any sense we’d apply to a human.
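To make the "autocomplete" point concrete, here is a toy sketch of next-word guessing. It is a simple bigram counter, not how a real LLM is built (those use neural networks trained on vastly more text), and the sample "book" and function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical stand-in for "a book someone wrote".
book = "the clown smiled and the clown waved and the children ran"
model = train_bigram(book)
print(predict_next(model, "the"))  # prints "clown" (follows "the" twice, "children" once)
```

The guess depends entirely on counts taken from the training text, which is the sense in which the output "resembles that book".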

To me, LLMs sound like a massive plagiarism engine, and I think they should need to get a license from the authors whose works they used to make the LLM under whatever terms that author wants to give, just like a publisher needs to get permission to print a copy of the work. But copyright law has no easy “bright line” for what counts and what doesn’t. So the courts will have to decide whether what the AI “creates” is similar enough to the original works to count as a violation, or if the AI and its results are transformative enough to count as something new.

kromem@lemmy.world · 1 year ago

In part it feels that way because you, along with pretty much every other human being online today, have been propagandized for decades with sci-fi inspired by dystopian futurist predictions about AI. Those predictions are almost universally obsolete and misinformed by now, but they persist due to anchoring bias.

AI trained to predict collective human thought ends up replicating quite a lot more than most people thought would be possible in our lifetimes.

And yet when it exhibits emotional intelligence it’s called creepy, when it exhibits above-average reasoning capabilities it’s called scary, and when it displays a potential for automating large swaths of busywork for most humans it’s called a threat.

Next to no one I see discussing the topic is considering the opportunity costs here, as the media influence on perceiving AI as ‘other’ is so pervasive that most humans fall into treating it like a monkey from another forest competing for bananas rather than treating it like a much better stick.