I've been letting my servers run on autopilot a bit too long, only applying updates once in a while and lately just reacting to email alarms. I didn't have a full appreciation for how many web-crawling bots have been hitting them, in particular my #mastodon instance and my #kallithea repositories. The #SemRush bot was particularly nasty, but #ahrefs and #baidu were also pretty aggressive.
I was getting 500 errors once in a while on mastodon because of it but didn't clue in until kallithea crashed...
...anyways a little PSA:
Instance admins: even if your site is small, be sure to aggressively block web crawlers. More and more they ignore robots.txt etiquette, and in light of recent archiving incidents you want to have some control over distribution of users' public posts.
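For admins wondering what "block" can look like when robots.txt gets ignored, here is a rough sketch of refusing requests by User-Agent in a small Python WSGI wrapper. This is only an illustration, not what I actually run: the names `BLOCKED_AGENTS`, `is_blocked` and `block_crawlers` are made up for the example, and the user-agent substrings are assumptions you should check against your own access logs. In practice the same idea is often applied in the reverse proxy in front of the app instead.

```python
import re

# Assumed user-agent substrings for the crawlers mentioned above;
# verify them against your own access logs before relying on this list.
BLOCKED_AGENTS = ("SemrushBot", "AhrefsBot", "Baiduspider")

# One case-insensitive pattern that matches any of the blocked substrings.
_PATTERN = re.compile(
    "|".join(re.escape(agent) for agent in BLOCKED_AGENTS),
    re.IGNORECASE,
)


def is_blocked(user_agent):
    """Return True if the User-Agent string matches a blocked crawler."""
    return bool(user_agent) and _PATTERN.search(user_agent) is not None


def block_crawlers(app):
    """Wrap a WSGI app so blocked crawlers get a 403 before reaching it."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the real application.
        return app(environ, start_response)
    return middleware
```

Keep in mind a bot that lies about its User-Agent will sail right past this, so it only helps against crawlers that identify themselves honestly.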
Fediversians: REALLY BE CAREFUL about what you post publicly. Delete only works on well-behaved fediverse servers. Evil bots don't respect post deletes when hoarding data.
Micro-blogging site operated by Mark Shane Hayden of Coalesco Digital Systems Inc. We are located in Alberta, Canada. This is NOT intended to be a commercial/promotional site! Registration is open to anyone interested in civil discussions on any interesting topic--especially technology, current events and politics.