Bots and scrapers

Comments & questions about this site.
dima
Posts: 1878
Joined: Wed Feb 12, 2014 1:35 am
Location: Los Angeles

Post by dima »

We're having more overloading issues with people's poorly-behaved scripts, and a dumb geoblocker (like I had before) is no longer enough. I just installed fail2ban to kill the worst offenders, and it seems to be doing the job right now. It's possible I tuned it too aggressively: if you see any issues (browser says "site cannot be reached", or something along those lines), please tell me.
tekewin
Posts: 1398
Joined: Thu Apr 11, 2013 5:07 pm

Post by tekewin »

Unfortunately, we are in the Age of Ultron Agents.

I am one of the offenders (not on this site): I was sending agents out to scour the world for information and getting blocked with 429 errors everywhere. I have stopped doing that, with limited exceptions and with my own throttles in place. There will soon be far more agents on the Internet than people. That may already be the case.
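For anyone running their own agents, a client-side throttle can be sketched roughly like this. It is only an illustration, not the setup described above: the names `Throttle` and `backoff_delay`, the 1-second base and the 300-second cap are all made up for the example. It spaces requests out and backs off when a server answers 429.

```python
import time


class Throttle:
    """Allow at most one request every `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock      # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


def backoff_delay(status, retry_after=None, attempt=0, base=1.0, cap=300.0):
    """Return how many seconds to pause before retrying (0 if no pause is needed).

    A numeric Retry-After header is honored as-is. Anything else (missing header,
    or the HTTP-date form, which this sketch does not parse) falls back to a
    capped exponential backoff: base * 2**attempt.
    """
    if status != 429:
        return 0.0
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    return min(base * 2 ** attempt, cap)
```

Calling `wait()` before every request and sleeping for `backoff_delay(...)` after a 429 keeps a script from hammering a small site, whichever HTTP library does the actual fetching.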

Peakbagger.com and Bob Burd's site are now gated with Cloudflare. We might need to do something similar if it is not cost prohibitive. They have a free plan with DDoS protection, and an unintentional DDoS is effectively what the agents are doing.
dima
Posts: 1878
Joined: Wed Feb 12, 2014 1:35 am
Location: Los Angeles

Post by dima »

Yeah, Cloudflare or something like it would solve it, but I REALLY don't want to go there yet. We're a location-specific, niche, old-school forum about the mountains. We shouldn't NEED such big hammers to be able to operate. I'm wondering if the recent influx was related to the thread about Monica receiving a lot of outside attention, which brought with it lots of additional traffic (both human and robot). In any case, the storm seems to have died down for now (maybe because I blocked everybody and they went home, or maybe not :) ). The current blocking settings may be close enough now. Look at

/etc/fail2ban/jail.d/defaults-debian.conf
and

/etc/fail2ban/filter.d/apache-eispiraten.conf
to see the current settings. To see who's banned right now:

fail2ban-client status apache-eispiraten-hammer
and the same with -misc in place of -hammer.
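For readers unfamiliar with fail2ban, the core idea of a rate-based jail can be sketched as below. This is a simplified illustration, not fail2ban's code and not the settings in the files above: `maxretry` and `findtime` are real fail2ban option names, but the function `find_offenders`, the sample values and the addresses are invented for the example. An address is flagged once it produces `maxretry` matching log lines within `findtime` seconds.

```python
from collections import defaultdict, deque


def find_offenders(events, maxretry, findtime):
    """Return the set of IPs with at least `maxretry` hits inside any `findtime` window.

    `events` is an iterable of (timestamp_seconds, ip) pairs sorted by time,
    standing in for log lines that matched a filter.
    """
    recent = defaultdict(deque)   # ip -> timestamps still inside the window
    offenders = set()
    for ts, ip in events:
        window = recent[ip]
        window.append(ts)
        # Forget hits that are older than `findtime` seconds.
        while window and ts - window[0] > findtime:
            window.popleft()
        if len(window) >= maxretry:
            offenders.add(ip)
    return offenders


# Made-up sample: one address hits three times in 2 seconds, the other is slow.
sample = [
    (0, "192.0.2.1"), (0, "198.51.100.7"), (1, "192.0.2.1"),
    (2, "192.0.2.1"), (100, "198.51.100.7"), (200, "198.51.100.7"),
]
print(find_offenders(sample, maxretry=3, findtime=10))  # {'192.0.2.1'}
```

The real tool also lifts bans after a `bantime` and applies them through firewall actions, which this sketch leaves out.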