Can someone ELI5 here? So lemmy.world is a Reddit-like site and Threads is a Twitter-like site. What does it mean when they’re federated?
Content and data are shared between instances.
But what does that mean? How would I see content from one on the other?
You are commenting on a LemmyNet post from the shit just works instance. I am replying from a LemmyNet account on lemmy.world. Someone from a Mastodon instance could also reply to this comment. Everyone can have their account on a specific instance (or even self-host their own instance) and still see content from other websites. There is no single website that hosts all the data and no single authority for the entire network (ok, maybe you could argue the developers of the software are one, but it’s open source and other options exist, so it’s not a true single point of authority).
Have you never talked to someone on #Mastodon? They all use ActivityPub as a protocol.
I’m on Mastodon also. But how do they connect?
You can reply to Lemmy posts or comments on Mastodon and they appear as comments on here. If someone replies to that comment, you’ll see the reply on Mastodon.
Both systems use ActivityPub which is how they can talk to each other.
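To make that a bit more concrete, here is a rough sketch of what a cross-server reply looks like on the wire. It only builds the JSON; it does not send anything. The hostnames, IDs, and the `build_reply` helper are made up for illustration, and real servers add more fields and sign their requests.

```python
import json

# Standard ActivityStreams context and the special "public" audience.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_reply(actor, note_id, in_reply_to, content):
    """Build a Create activity wrapping a Note that replies to another object.

    Hypothetical helper: the field names come from ActivityStreams,
    but the ID scheme below is an assumption for this example.
    """
    note = {
        "id": note_id,
        "type": "Note",
        "attributedTo": actor,
        # This URL is what ties the reply to the comment on the other server.
        "inReplyTo": in_reply_to,
        "content": content,
        "to": [PUBLIC],
    }
    return {
        "@context": AS_CONTEXT,
        "id": note_id + "/activity",
        "type": "Create",
        "actor": actor,
        "to": [PUBLIC],
        "object": note,
    }


reply = build_reply(
    actor="https://mastodon.example/users/alice",
    note_id="https://mastodon.example/users/alice/statuses/1",
    in_reply_to="https://lemmy.example/comment/42",
    content="Replying from Mastodon",
)
print(json.dumps(reply, indent=2))
```

The `inReplyTo` URL is the whole trick: the receiving server looks up the object it points to and attaches the note as a comment underneath it.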
I told you already.
https://en.wikipedia.org/wiki/ActivityPub
That means I can see your content from lemmy.zip.
Your IP, comments, username, liked posts, disliked posts and other identifiers will be shared with Facebook.
Defederating doesn’t stop that, it just stops their IPs from being shared with us.
With FB and every other instance. Are you vetting who has controlling stake in every instance you federate with?
You do realise that your account on here is public, right? Anyone can collect data like comments, and it’s likely already being used to train AI models just like Reddit data was.
The only way to avoid data being sent to other servers is by having a private account that can’t be viewed while logged out, and never interacting with content from other servers. I don’t think it’s even possible to have a private Lemmy account.
So by your logic, you just want to hand everything to Facebook on a golden platter. Let them scrape the data if they want, but handing it over willingly is like smoking cigarettes just to get cancer.
The data is already handed out willingly. Anyone can write code that federates with a Lemmy instance using the ActivityPub protocol, subscribes, and receives a feed of all posts and comments, or that federates with a Mastodon instance and receives a feed of all posts. They could even hide that data fetching behind a legit Lemmy installation. The entire purpose of ActivityPub is that it’s an open protocol that anyone can use.
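As a rough illustration of how little it takes, here is a sketch of the two pieces such code would need: a `Follow` request to subscribe, and a handler that pulls the text out of whatever arrives afterwards. Nothing here touches the network, and it skips HTTP signatures and delivery entirely. The URLs, the `build_follow` and `extract_public_text` names, and the sample payload are assumptions for the example. As far as I know Lemmy sends posts as `Page` objects and comments as `Note` objects, usually wrapped in an `Announce` from the community, so the handler allows for that.

```python
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_follow(follower, target):
    """Build a Follow activity asking `target` to deliver its content to us.

    The "#follow-1" ID suffix is a made-up scheme for this example.
    """
    return {
        "@context": AS_CONTEXT,
        "id": follower + "#follow-1",
        "type": "Follow",
        "actor": follower,
        "object": target,
    }


def as_list(value):
    """ActivityStreams allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def extract_public_text(activity):
    """Return the content of a publicly addressed post or comment, else None."""
    # Unwrap Announce wrappers (assumed here to embed the inner activity).
    while activity.get("type") == "Announce" and isinstance(activity.get("object"), dict):
        activity = activity["object"]
    if activity.get("type") != "Create":
        return None
    obj = activity.get("object")
    if not isinstance(obj, dict) or obj.get("type") not in ("Note", "Page"):
        return None
    audience = as_list(obj.get("to")) + as_list(obj.get("cc"))
    return obj.get("content") if PUBLIC in audience else None


follow = build_follow(
    "https://collector.example/u/bot",
    "https://lemmy.example/c/technology",
)

# Hypothetical payload of the kind a subscribed server would receive.
incoming = {
    "type": "Announce",
    "actor": "https://lemmy.example/c/technology",
    "object": {
        "type": "Create",
        "actor": "https://lemmy.example/u/someone",
        "object": {
            "type": "Note",
            "content": "Anyone federated receives this",
            "to": [PUBLIC],
        },
    },
}
print(follow["type"], "->", extract_public_text(incoming))
```

Once the follow is accepted, the other server pushes content to the subscriber, so no scraping is involved at all.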
The instance you’re on federates with around 5850 servers: https://lemmy.world/instances. Do you really think the admins have verified every one of them to ensure they’re legit?
I understand, but the point I am trying to make is that Facebook is a data harvesting company and already has the tools and algorithms in place. They can do real damage compared to a small company or a college student scraping data for a project. Facebook isn’t known for helping people or changing things for the better, so why risk it?
Arguably all big tech companies do some sort of data harvesting though. Google is primarily an advertising and data collection company, and their data collection is more widespread than others - have you seen how many sites have Google Analytics on them, how many people use Android, and how many people use Gmail, Google Drive, etc? Apple allow data collection as long as it’s them doing it (hence trying to block third parties from doing it - giving them an advantage).
If you’re worried about data harvesting, the real companies you need to worry about are companies like Acxiom/Liveramp, Experian, Datalogix, Neustar, etc. These are the companies that create profiles on you based on data they gather from a very large number of different sources (credit card data, supermarket reward programs, frequent flyer programs, mailers / TV ads you respond to, internet ads you click, things you buy online, etc) and sell them to advertisers. The big tech companies don’t do anything like that, and generally keep their data to themselves (the data is what makes companies like Google valuable, so they’re not going to just give it to other companies!)
How can you be sure that only small companies or students are scraping Lemmy/Mastodon data today? One of those 5800 servers that federate with your Lemmy instance could be funneling data to a data analysis firm.
So you want to federate with Facebook just because you think maybe other small companies are doing it. Is that correct?