- cross-posted to:
- [email protected]
It has finally happened…not surprised though.
To be honest, everything we post online can be used for training. Reddit is just made for money :P I'm kinda using Lemmy more for posting now, and Reddit just for browsing, like an archive.
This is horrible news. Reddit is a horrible website and only getting worse. OpenAI promoting them and using their garbage content to train their AI systems is alarming. This is so dystopian.
And of course it always leads back to money:
Sam Altman is a shareholder in Reddit
It’s crazy that reddit doesn’t have to ask everyone if they want to contribute. This shows who owns and controls your posts.
The actual crazy thing is:
Imagine if somebody ran a Lemmy instance and just subscribed to every sublemmy and scraped all the data without asking. And nobody would even notice.
Reddit owns the content posted on their platform. But when you post on Lemmy, everybody owns it, including every data company large and small.
But hey, at least we are feeling good about our social media platform choice, 'cause it’s federated and open source or whatever, right?
I would say a good base assumption is that all content on the public internet is scraped and used for AI schemes.
It’s the other factors that matter.
Like Facebook’s Threads?
Everyone can use it. With reddit’s posts, only reddit can do it.
This is not so bad. Reddit is crawling with bot spam, and that will increase as users leave the platform every time it pulls a stunt to pump the stock price. The share of real content will decrease, and the fake content will poison the training pipelines. It’s a great experiment to test model collapse in real time, really.