You are polluting the data set. Do it a few times with different text sources and the scrubbers won’t know what part of your comment history is good. Replace, don’t delete.
I’m pretty sure they’ll know that the first version of each comment is almost certainly the good one. People sometimes edit a comment to add new information or fix a typo, but they almost never replace nonsense with a good comment. A wholesale replacement nearly always goes the other way.
Edit: fixed typos, also replaced excerpt from Moby Dick with this post.
Edit 2: the comments you post here are totally available for machine learning, so I don’t see much of a point in deleting my Reddit comments as long as I’m participating in Lemmy.
Maybe. I edit almost every comment I make. The key is that by doing this you introduce the possibility that any given edit is garbage. It is actually easier, and safer, to just filter out edited comments than it is to try to sort out what’s good and what isn’t. The bottom line is that the best course of action is to avoid Reddit at all costs. If you do go there and feel compelled to comment, then coming back the next day to replace your comments a few times is better than “deleting”.
They don’t need to filter out edited comments. They keep the first version. It’s good enough.
You could easily compare old vs new and see how much has changed. If the original is still there and more was added, the edit is good. If 80% matches, it was probably minor fixes.
If nothing matches, then remove the edit from the data set and use the original comment, which I’m sure they still have.
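For what it’s worth, here is a rough sketch of that old-vs-new comparison in Python, using the standard library’s `difflib`. Nobody here knows how an actual scraper would do it, so treat everything as an assumption: the function names `classify_edit` and `pick_version` are made up, the comparison is word-by-word, the 80% cutoff is the figure from the comment above, and the 20% “nothing matches” cutoff is just a guess.

```python
from difflib import SequenceMatcher


def classify_edit(original: str, edited: str,
                  minor_threshold: float = 0.8,
                  replaced_threshold: float = 0.2) -> str:
    """Label an edit by how much of the original comment survives in it.

    Hypothetical heuristic: both thresholds are illustrative assumptions.
    """
    old_words = original.split()
    new_words = edited.split()
    if not old_words:
        return "replaced"  # nothing to compare against

    # Count the original words that survive, in order, in the edited text.
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    kept = matched / len(old_words)

    if kept >= minor_threshold:
        # Original mostly intact: either text was added or it was a small fix.
        return "addition" if len(new_words) > len(old_words) else "minor_fix"
    if kept <= replaced_threshold:
        return "replaced"
    return "ambiguous"


def pick_version(original: str, edited: str) -> str:
    """Fall back to the original comment when the edit looks like a replacement."""
    return original if classify_edit(original, edited) == "replaced" else edited


first = "I think the API changes are bad for third party apps"
print(classify_edit(first, first + " Edit: they announced pricing today"))  # addition
print(classify_edit(first.replace("third", "thrid"), first))                # minor_fix
print(classify_edit(first, "Call me Ishmael. Some years ago, never mind how long precisely"))  # replaced
```

Which is the point: as long as the first version is stored somewhere, a check this cheap is enough to throw away the Moby Dick edit and keep the original.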