this post was submitted on 02 Aug 2025
266 points (95.5% liked)

Not The Onion


Welcome

We're not The Onion! Not affiliated with them in any way! Not operated by them in any way! All the news here is real!

The Rules

Posts must be:

  1. Links to news stories from...
  2. ...credible sources, with...
  3. ...their original headlines, that...
  4. ...would make people who see the headline think, “That has got to be a story from The Onion, America’s Finest News Source.”

Please also avoid duplicates.

Comments and post content must abide by the server rules for Lemmy.world and generally abstain from trollish, bigoted, or otherwise disruptive behavior that makes this community less fun for everyone.

And that’s basically it!

founded 2 years ago
[–] FaceDeer@fedia.io 147 points 3 days ago (2 children)

To comply with copyright law, not to skirt it. That's what companies that scan large numbers of books do. See for example Authors Guild v. Google from back when Google was scanning books to add to their book search engine. Framing this like it's some kind of nefarious act is misleading.

[–] masterspace@lemmy.ca 78 points 3 days ago (1 children)

They also weren't destroying rare books. They were buying in-print books from major retailers, which means that while yes, that is environmentally wasteful, it's not actually destroying books in the classical destruction-of-knowledge sense, since the manufacturer will just print another one if there's demand for it.

[–] MrQuallzin@lemmy.world 27 points 3 days ago (2 children)

This as well. Growing up in a house of book lovers, myself included, destroying a book was akin to kicking a puppy. Realistically though, they're ultimately consumables. They're meant to be bought, used, and replaced as needed. With luck the destruction included recycling as much as possible, seeing as it's mainly paper.

[–] masterspace@lemmy.ca 6 points 3 days ago

Precisely. There's a reason that, these days, books made for libraries are made to an entirely different standard than books sold at your local bookstore.

[–] MDCCCLV@lemmy.ca 2 points 3 days ago

Yeah, you have millions of old books that nobody wants, not even collectors. It's not just popular literature.

[–] MrQuallzin@lemmy.world 26 points 3 days ago (1 children)

Yeah, this is on the way to being a win. In this case they actually bought the books, which has been one of the biggest issues with LLMs. There's certainly more discussion to be had around how they use the materials in the end, but this is a step in the right direction.

[–] Humanius@lemmy.world 17 points 3 days ago (1 children)

To a certain extent I agree, but you can buy a book and still commit copyright infringement by copying its contents (for anything other than personal use).

If this went to court, it would depend on whether training an LLM is more akin to copying or learning. I can see arguments for either interpretation, but I suspect that the law would lean more toward it being copying rather than learning.

[–] FaceDeer@fedia.io 6 points 3 days ago (2 children)

There's already been a summary judgment in this case ruling that the AI training activity was not by itself a copyright violation.

[–] Natanael@infosec.pub 3 points 3 days ago* (last edited 3 days ago) (1 children)

This isn't an automatic complete win for them.

Being allowed to train under fair use rules doesn't mean you're protected if your LLM still regurgitates content.

https://arstechnica.com/tech-policy/2025/07/nyt-to-start-searching-deleted-chatgpt-logs-after-beating-openai-in-court/

[–] FaceDeer@fedia.io 3 points 3 days ago

The lawsuit between NYT and OpenAI is still ongoing; this article is about a court order to "preserve evidence" that could be used in the trial. It doesn't indicate anything about how the case might ultimately be decided.

Last I dug into the NYT v. OpenAI case it looked pretty weak: NYT had heavily massaged their prompts in order to get ChatGPT to regurgitate snippets of their old articles, and the judge had called them out on that.

[–] Humanius@lemmy.world 1 points 3 days ago

I see. In that case I stand corrected.