this post was submitted on 16 Mar 2025
499 points (97.7% liked)

[–] masterspace@lemmy.ca 77 points 1 day ago (4 children)
[–] B1naryB0t@lemmy.dbzer0.com 39 points 1 day ago (3 children)

When a corporation does it to get a competitive edge, it is.

[–] pennomi@lemmy.world 26 points 1 day ago

It’s only theft if they support laws preventing their competitors from doing it too. Which is kind of what OpenAI did, and now they’re walking that idea back because they’re losing again.

[–] masterspace@lemmy.ca 19 points 1 day ago* (last edited 1 day ago) (1 children)

No it's not.

It can be problematic behaviour, you can make it illegal if you want, but at a fundamental level, making a copy of something is not the same thing as stealing something.

[–] pyre@lemmy.world 9 points 1 day ago* (last edited 1 day ago) (1 children)

it uses the result of your labor without compensation. it's not theft of the copyrighted material. it's theft of the payment.

it's different from piracy in that piracy doesn't equate to lost sales. someone who pirates a song or game probably does so because they wouldn't buy it otherwise. either they can't afford it or they don't find it worth paying for. so if they couldn't pirate it, they still wouldn't buy it.

but this is a company using labor without paying you, something that they otherwise definitely have to do. he literally says it would be over if they couldn't get this data. they just don't want to pay for it.

[–] masterspace@lemmy.ca -4 points 1 day ago* (last edited 22 hours ago) (1 children)

That information is published freely online.

Do companies have to avoid hiring people who read and were influenced by copyrighted material?

I can regurgitate copyrighted works as well, and when someone hires me, places like Stack Overflow get fewer views on the pages that I've already read and trained on.

Are companies committing theft by letting me read the internet to develop my intelligence? Are they committing theft when they hire me so they don't have to do as much research themselves? Are they committing theft when they hire thousands of engineers who have read and trained on copyrighted material to build up internal knowledge bases?

What's actually happening is that the debates around AI are exposing a deeply and fundamentally flawed copyright system. It should not be based on scarcity and restriction but on rewarding use. Information has always been able to flow freely; the mistake was linking payment to restricting its movement.

[–] pyre@lemmy.world 3 points 23 hours ago (1 children)

it's ok if you don't know how copyright works. also maybe look into plagiarism. there's a difference between relaying information you've learned and stealing work.

[–] Grimy@lemmy.world 2 points 22 hours ago (1 children)

Training on publicly available material is currently legal. It is how your search engine was built and it is considered fair use mostly due to its transformative nature. Google went to court about it and won.

[–] pyre@lemmy.world 0 points 22 hours ago (1 children)

can you point to the trial they won? I only know about a case that was dismissed.

because what we've seen from ai so far is hardly transformative.

[–] Grimy@lemmy.world 3 points 20 hours ago* (last edited 20 hours ago) (1 children)

Sorry, I was talking about hiQ Labs v. LinkedIn. But there are also Perfect 10 v. Google and Authors Guild v. Google, which show that scraping public data is perfectly fine and which involve the company in question.

An image generator is trained on a billion images and is able to spit out completely new images on whatever you ask it. Calling it anything but transformative is silly, especially when such things as collage are considered transformative.

[–] pyre@lemmy.world 0 points 20 hours ago

eh, "completely new" is a huge stretch there. splicing two or ten movies together doesn't give you an automatic pass.

Only if it's illegal to begin with. We need to abolish copyright, as with the internet and digital media in general, the concept has become outdated as scarcity isn't really a thing anymore. This also applies to anything that can be digitized.

The original creator can still sell their work and people can still choose to buy it, and people will if it is convenient enough. If it is inconvenient or too expensive, people will pirate it instead, regardless of the law.

[–] kibiz0r@midwest.social 19 points 1 day ago (1 children)

What OpenAI is doing is not piracy.

[–] Grimy@lemmy.world 1 points 1 day ago (1 children)

Whatever it is, it isn't theft

[–] kibiz0r@midwest.social 15 points 1 day ago (3 children)

Also true. It’s scraping.

In the words of Cory Doctorow:

Web-scraping is good, actually.

Scraping against the wishes of the scraped is good, actually.

Scraping when the scrapee suffers as a result of your scraping is good, actually.

Scraping to train machine-learning models is good, actually.

Scraping to violate the public’s privacy is bad, actually.

Scraping to alienate creative workers’ labor is bad, actually.

We absolutely can have the benefits of scraping without letting AI companies destroy our jobs and our privacy. We just have to stop letting them define the debate.

[–] grumuk@lemmy.ml 3 points 20 hours ago

Molly White also wrote about this in the context of open access on the web and people being concerned about how their works are being used.

“Wait, not like that”: Free and open access in the age of generative AI

The same thing happened again with the explosion of generative AI companies training models on CC-licensed works, and some were disappointed to see the group take the stance that, not only do CC licenses not prohibit AI training wholesale, AI training should be considered non-infringing by default from a copyright perspective.

[–] Grimy@lemmy.world -1 points 1 day ago

Creators who are justifiably furious over the way their bosses want to use AI are allowing themselves to be tricked by this argument. They’ve been duped into taking up arms against scraping and training, rather than unfair labor practices.

That's a great article. Isn't this kind of exactly what is going on here? Wouldn't bolstering copyright laws make training unaffordable for everyone except a handful of companies? Then these companies, because of their monopoly, could easily make the highest-level models affordable only to the owner class.

People are mad at AI because it will be used to exploit them instead of the ones who exploit them every chance they get. Even worse, the legislation they shout for will make that exploitation even easier.

[–] FauxLiving@lemmy.world -2 points 1 day ago (1 children)

Our privacy was long gone well before AI companies were even founded, if people cared about their privacy then none of the largest tech companies would exist because they all spy on you wholesale.

The ship has sailed on generating digital assets. This isn't a technology that can be uninvented. Digital artists will have to adapt.

Technology often disrupts jobs, you can't fix that by fighting the technology. It's already invented. You fight the disruption by ensuring that your country takes care of people who lose their jobs by providing them with support and resources to adapt to the new job landscape.

For example, we didn't stop electronic computers to save the job of Computer (a large field of highly trained humans who did calculations) and CAD destroyed the drafting profession. Digital artists are not the first to experience this and they won't be the last.

[–] masterspace@lemmy.ca 4 points 1 day ago (1 children)

Our privacy was long gone well before AI companies were even founded, if people cared about their privacy then none of the largest tech companies would exist because they all spy on you wholesale.

In the US. The EU has proven that you can have perfectly functional privacy laws.

If your reasoning is that the US doesn't regulate its companies and so regulating them is impossible, then your reasoning is bad.

[–] FauxLiving@lemmy.world 3 points 1 day ago* (last edited 1 day ago)

My reasoning is based upon observing the current Internet from the perspective of working in cyber security and dealing with privacy issues for global clients.

The GDPR is a step in the right direction, but it doesn't guarantee your digital privacy. It's more of a framework to regulate the trading and collecting of your personal data, not to prevent it.

No matter who or where you are, your data is collected and collated into profiles which are traded between data brokers. Anonymized data is a myth, it's easily deanonymized by data brokers and data retention limits do essentially nothing.

AI didn't steal your privacy. Advertisers and other data consuming entities have structured the entire digital and consumer electronics ecosystem to spy on you decades before transformers or even deep networks were ever used.

[–] zarathustra0@lemmy.world 5 points 1 day ago

Piracy is only theft if AI can't be made profitable.

[–] _lilith@lemmy.world 2 points 1 day ago (1 children)

Yeah, but I don't sell ripped DVDs and copies of other people's art.

[–] Knock_Knock_Lemmy_In@lemmy.world 0 points 23 hours ago

What if I run a filter over it? Transformative works are fine.