this post was submitted on 29 Mar 2025
823 points (91.6% liked)
Technology
Mostly YouTube, Reddit, and image search. I guess I could just record a Netflix stream if I needed the whole movie. I guess recording a Netflix stream is pirating? Probably easier with a torrent.
What does it matter? I don't think pirating is unethical, especially when it's not even redistribution but transformative. OpenAI has never stopped me from pirating or even asked me to stop. Not sure what you mean by "no one else".
You ever ask yourself if the memes made from movie scenes used pirated media?
Yes, recording a Netflix stream is pirating. That you got away with it doesn't mean you couldn't be sued for tens of thousands if someone found out.
You don't think it's unethical, but it is illegal in the US, and people have been sued for thousands of dollars. This is still going on today: https://arstechnica.com/tech-policy/2025/02/isp-sued-by-record-labels-agrees-to-identify-100-users-accused-of-piracy/
OpenAI has said they need to violate copyright. But they didn't say that the law should be changed. They want an exemption for themselves.
I'm mostly talking about being able to train on copyrighted content. This is on me though, I got mixed up. That's what I meant in my first comment.
If you think someone can train a model on legally obtained data (Google Images, YouTube, Internet Archive), then that is fair.
Personally, I think using pirated content, or at least bought content that is ripped (Netflix, DVDs), should be exempt (for everyone, obviously, not just OpenAI). Some data is already locked up behind huge megacorps like record labels, Hollywood, publishing houses, etc. OpenAI can afford the cost, but the little guys will be screwed when it comes to SOTA.
It's also worth noting that, if I'm not mistaken, most current lawsuits are aimed at how the data is used and not how it's sourced. The laws coming from these lawsuits won't be used to bolster anti-piracy laws but copyright laws instead, targeting fair use and transformative clauses imo.