cross-posted from: https://lemy.lol/post/778691

I want to block content bots, but I don’t want to block useful bots like @[email protected] @[email protected]. Because of this, I am blocking them one by one. I am sharing the list here for the community’s benefit. Additions and removals are welcome.

gist for programmatic use: https://gist.github.com/ismailkarsli/0c6c7aa4f70d1905adea1b30271f16f7

  • iso@lemy.lol (OP) · 1 year ago

    They are not all that similar. Maybe we can assume that accounts which are marked as bots and have shared around 10-100 posts are content bots.

    • marsara9@lemmy.world · 1 year ago

      Maybe. The second idea I’ve got: if no one has replied to a post after, say, 24 hours, it counts as unanswered. If something like 75-80% of your posts are unanswered and you have at least 100 such posts, you get added to the list?

      The main concern I see with something like this is false positives: someone real could end up getting blocked.

      I definitely want to think on this some more but it might have some legs.
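A rough sketch of that reply-ratio idea in Python, just to make the thresholds concrete. The `Post` record and the `looks_like_content_bot` function are made-up names for illustration, not part of Lemmy or of the gist, and the defaults (24 hours, 100 posts, 75%) are simply the numbers floated above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Post:
    # Hypothetical record; how you fetch this data is out of scope here.
    created: datetime
    first_reply: Optional[datetime]  # None if nobody has replied yet


def looks_like_content_bot(
    posts: List[Post],
    now: datetime,
    wait: timedelta = timedelta(hours=24),
    min_unanswered: int = 100,
    min_ratio: float = 0.75,
) -> bool:
    # Only judge posts that have already had a full `wait` window to get replies.
    settled = [p for p in posts if now - p.created >= wait]
    if not settled:
        return False
    # A post is "unanswered" if no reply arrived within the window.
    unanswered = [
        p for p in settled
        if p.first_reply is None or p.first_reply - p.created > wait
    ]
    ratio = len(unanswered) / len(settled)
    return len(unanswered) >= min_unanswered and ratio >= min_ratio
```

Ignoring posts younger than the waiting window is one way to reduce false positives for someone who has just posted a lot; whether that is enough to protect real users is exactly the open question.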

      • iso@lemy.lol (OP) · 1 year ago (edited)

        I think these three conditions together:

        • is flagged as a bot
        • doesn’t respond within n hours
        • has n posts in the last n hours, or overall

        are sufficient to determine that a user is a content aggregator bot. The bot flag is an important indicator here. The biggest false positive would be banning a multi-purpose bot that also has a content aggregation feature.
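The three conditions above could be combined along these lines. This is only a sketch under assumptions: `Account` and `is_content_aggregator` are hypothetical names, "doesn't respond" is read here as "the account has written no comment in the last n hours", and the default values for each n are placeholders, since the thread leaves them open.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Account:
    # Hypothetical record, not a real Lemmy API type.
    is_bot: bool                      # the account's bot flag
    post_times: List[datetime]        # when each post was made
    last_comment: Optional[datetime]  # latest comment by the account, if any


def is_content_aggregator(
    acct: Account,
    now: datetime,
    silence: timedelta = timedelta(hours=48),  # placeholder for "n hours" without a response
    window: timedelta = timedelta(hours=24),   # placeholder for "last n hours"
    min_posts: int = 20,                       # placeholder for "n posts"
) -> bool:
    # Condition 1: the bot flag is required, so unflagged accounts are never listed.
    if not acct.is_bot:
        return False
    # Condition 2: the account has not commented within the silence period.
    silent = acct.last_comment is None or now - acct.last_comment >= silence
    # Condition 3: enough posts inside the recent window.
    recent = sum(1 for t in acct.post_times if now - t <= window)
    return silent and recent >= min_posts
```

Requiring all three conditions at once keeps the check conservative. A multi-purpose bot that still replies to people fails the second condition and stays off the list, which addresses at least part of the false-positive worry.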