Bots and scrapers
-
dima
- Posts: 1906
- Joined: Wed Feb 12, 2014 1:35 am
- Location: Los Angeles
We're having more overloading issues with people's poorly-behaved scripts, and a dumb geoblocker (like I had before) is no longer enough. I just installed fail2ban to kill the worst offenders, and it seems to be doing the job right now. It's possible I tuned it too aggressively: if you see any issues (browser says "site cannot be reached", or something along those lines), please tell me
-
tekewin
- Posts: 1409
- Joined: Thu Apr 11, 2013 5:07 pm
Unfortunately, we are in the Age of Ultron Agents.
I am one of the offenders (not on this site), but I was sending agents out to scour the world for information and getting blocked with 429 errors everywhere. I stopped doing that with limited exceptions and with my own throttles in place. There will soon be far more agents on the Internet than people. That may already be the case.
Peakbagger.com and Bob Burd's site are now gated with Cloudflare. We might need to do something similar if it is not cost prohibitive. They have a free plan with DDoS protection, and an unintentional DDoS is effectively what these agents are doing.
-
dima
- Posts: 1906
- Joined: Wed Feb 12, 2014 1:35 am
- Location: Los Angeles
Yeah, Cloudflare or something like it would solve it, but I REALLY don't want to go there yet. We're a location-specific, niche, old-school forum about the mountains. We shouldn't NEED such big hammers to be able to operate. I'm wondering if the recent influx was related to the thread about Monica receiving a lot of outside attention, which brought with it lots of additional traffic (both human and robot). In any case, the storm seems to have died down for now (maybe because I blocked everybody and they went home, or maybe not). The current blocking settings may be close enough now. Look at /etc/fail2ban/jail.d/defaults-debian.conf and /etc/fail2ban/filter.d/apache-eispiraten.conf to see the current settings. To see who's banned right now: fail2ban-client status apache-eispiraten-hammer and -misc.
-
tekewin
- Posts: 1409
- Joined: Thu Apr 11, 2013 5:07 pm
Wow. TIL that fail2ban can secure more than SSH.
To try to understand the config, I fed the .conf files into a friendly AI who gave me this unsolicited comment. Do with it what you will. Your current config seems to be working.
A maxretry of 20 combined with a findtime of 20 is quite "loose." This configuration allows a bot to make 1 request per second indefinitely without ever getting banned.
Tip: Usually, for aggressive scrapers, you want a longer findtime (like 600 for 10 minutes) or a much lower maxretry (like 5) to catch bots that pace their requests to stay under the radar.
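To make the point above concrete, here is a rough Python sketch of a sliding-window counter. It is a simplified model I am assuming for illustration, not fail2ban's actual implementation: a client is banned once maxretry matching requests fall inside any findtime-second window. The function name and the paced bot (one request every 2 seconds) are hypothetical.

```python
from collections import deque


def first_ban_time(timestamps, findtime, maxretry):
    """Return the time (in seconds) at which a simplified sliding-window
    counter would ban the client, or None if it never does.

    Assumption: a ban triggers once `maxretry` hits fall within the last
    `findtime` seconds. This is a model, not fail2ban's real code.
    """
    window = deque()
    for t in timestamps:
        window.append(t)
        # Drop hits that have aged out of the findtime window.
        while window and t - window[0] >= findtime:
            window.popleft()
        if len(window) >= maxretry:
            return t
    return None


# Hypothetical bot pacing itself at one request every 2 seconds for an hour.
paced_bot = [2 * i for i in range(1800)]

loose = first_ban_time(paced_bot, findtime=20, maxretry=20)
longer_window = first_ban_time(paced_bot, findtime=600, maxretry=20)
lower_threshold = first_ban_time(paced_bot, findtime=20, maxretry=5)

print("findtime=20,  maxretry=20:", loose)
print("findtime=600, maxretry=20:", longer_window)
print("findtime=20,  maxretry=5: ", lower_threshold)
```

In this model the paced bot never trips the 20/20 setting, because at most 10 of its requests fit in any 20-second window, while either a longer findtime or a lower maxretry catches it within the first minute. The same change also makes it easier to ban a human who clicks around quickly, so the numbers are a trade-off rather than a free win.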
-
dima
- Posts: 1906
- Joined: Wed Feb 12, 2014 1:35 am
- Location: Los Angeles
Oh man. It's totally right. Previously I had problems with it being too aggressive, banning confirmed humans. I detuned it, but I also adjusted the filter regex. With the more specific regex in place, I can probably tighten it again, but I haven't bothered to do that yet. Feel free to play with it. For what it's worth, the onslaught seems to have subsided for now, so maybe we can leave it alone.
-
Nate U
- Posts: 658
- Joined: Wed Apr 05, 2023 7:38 pm
off-trail Los Angeles Mtn explorers and true crime enthusiasts are 2 WILDLY different-sized demographics... this site is not designed to handle the latter.
-
dima
- Posts: 1906
- Joined: Wed Feb 12, 2014 1:35 am
- Location: Los Angeles
The board is super slow right now; we're being bombarded again.
We should see if tightening the fail2ban settings would alleviate it. tekewin: feel free to fix it before I get to it
-
GoalHiking
- Posts: 44
- Joined: Sun Feb 18, 2024 10:58 am
Whatever you do, please don't use Cloudflare since they're pro-censorship. First they came for, etc etc.
-
tekewin
- Posts: 1409
- Joined: Thu Apr 11, 2013 5:07 pm
I've taken a look at the custom fail2ban configs, the fail2ban logs, and the apache2 logs.
A sample of the access log showed 2000 requests, 1904 unique IPs → ~1.05 requests per IP on average in a five minute block. This is a distributed botnet, not a few abusers. 1527 of 2000 requests (76%) hit /app.php/thankslist — the "Thanks for posts" extension's public list page, each IP hitting one page. None of these were being blocked by fail2ban.
I turned off Guest access to the Thanks list. It shouldn't affect users.
In the phpBB ACP: Permissions → Group permissions → Guests → Advanced Permissions → Misc → Can view list of all thanks: No.
I think the custom fail2ban config in apache-eispiraten-hammer.conf is probably catching more users than bots. The one for file downloads looks good. I can improve the apache-eispiraten-hammer.conf patterns after more research on the access log. I'll be out of town until the middle of next so I don't want to make any serious tweaks to it. The only thing changed was the Guest access to the Thanks list.
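For anyone who wants to repeat this kind of tally, here is a minimal Python sketch of how the per-IP and per-path counts could be produced. It assumes the access log uses Apache's common or combined format; the regex, the function name, and the sample lines are made up for illustration (the IPs come from the documentation ranges) and are not taken from the real log.

```python
import re
from collections import Counter

# Start of an Apache common/combined log line (assumed format):
# client IP, identity, user, [timestamp], "METHOD path protocol"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:\S+) (\S+)[^"]*"')


def summarize(lines):
    """Count requests per client IP and per path (query string stripped)."""
    ips = Counter()
    paths = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        ip, url = m.groups()
        ips[ip] += 1
        paths[url.split("?", 1)[0]] += 1
    total = sum(ips.values())
    return {
        "requests": total,
        "unique_ips": len(ips),
        "requests_per_ip": total / len(ips) if ips else 0.0,
        "top_paths": paths.most_common(3),
    }


# Synthetic sample lines, invented for this example only.
sample = [
    '203.0.113.1 - - [22/May/2026:10:00:00 -0700] "GET /app.php/thankslist?start=50 HTTP/1.1" 200 512',
    '203.0.113.2 - - [22/May/2026:10:00:01 -0700] "GET /app.php/thankslist?start=100 HTTP/1.1" 200 512',
    '198.51.100.7 - - [22/May/2026:10:00:02 -0700] "GET /app.php/thankslist HTTP/1.1" 200 512',
    '198.51.100.7 - - [22/May/2026:10:00:03 -0700] "GET /viewtopic.php?t=1 HTTP/1.1" 200 2048',
]

summary = summarize(sample)
print(summary)
```

A requests_per_ip value close to 1 is the signature described above: per-IP rate limits such as fail2ban have nothing to count when each address shows up only once, which is why closing the page to guests helps where banning cannot.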
-
dima
- Posts: 1906
- Joined: Wed Feb 12, 2014 1:35 am
- Location: Los Angeles
GoalHiking wrote: Fri May 22, 2026 11:04 am
Whatever you do, please don't use Cloudflare since they're pro-censorship. First they came for, etc etc.
I haven't been following too closely. Do you know if the open-source Cloudflare flavors are as effective? Anubis and whatever bugs.debian.org uses and such.
-
tekewin
- Posts: 1409
- Joined: Thu Apr 11, 2013 5:07 pm
I'm not familiar with the open source equivalents. Have no idea.
-
dima
- Posts: 1906
- Joined: Wed Feb 12, 2014 1:35 am
- Location: Los Angeles
tekewin wrote: Fri May 22, 2026 11:16 am
I've taken a look at the custom fail2ban configs, the fail2ban logs, and the apache2 logs.
A sample of the access log showed 2000 requests, 1904 unique IPs → ~1.05 requests per IP on average in a five minute block. This is a distributed botnet, not a few abusers. 1527 of 2000 requests (76%) hit /app.php/thankslist — the "Thanks for posts" extension's public list page, each IP hitting one page. None of these were being blocked by fail2ban.
I turned off Guest access to the Thanks list. It shouldn't affect users.
In the phpBB ACP: Permissions → Group permissions → Guests → Advanced Permissions → Misc → Can view list of all thanks: No.
I think the custom fail2ban config in apache-eispiraten-hammer.conf is probably catching more users than bots. The one for file downloads looks good. I can improve the apache-eispiraten-hammer.conf patterns after more research on the access log. I'll be out of town until the middle of next so I don't want to make any serious tweaks to it. The only thing changed was the Guest access to the Thanks list.
Take your sweet time, and thanks for looking at it! Do you see the slowness? Maybe 1/4 of the time when I try to load the board, it takes ~20-30sec for it to come up. Do you see that? Would be interesting to look at the logs during one of those events.
-
tekewin
- Posts: 1409
- Joined: Thu Apr 11, 2013 5:07 pm
dima wrote: Fri May 22, 2026 12:31 pm
Take your sweet time, and thanks for looking at it! Do you see the slowness? Maybe 1/4 of the time when I try to load the board, it takes ~20-30sec for it to come up. Do you see that? Would be interesting to look at the logs during one of those events.
Yes, I've experienced it myself. I got banned for a while last night while I was gathering the log data.
I'll look for one of those events happening to a user.
Mainly, I don't want to make things worse.
-
Sean
- Cucamonga
- Posts: 4364
- Joined: Wed Jul 27, 2011 12:32 pm
Yeah, it's kind of annoying when I can't get on the site or can't upload a file because of these attacks.
