happy new year, and stuff
January 1, 2026
Well, I didn't update for a while. Lost a ton of music-related stuff (instruments, etc.) due to a fire, which sucked. Will write more about that at some point, maybe, though also maybe I'll just try to forget, heh.
What I'm really posting here for is to claim a momentary victory (it won't last long, no doubt) against the hordes of scrapers effectively DDoSing web sites out there. Our forums would have 100,000+ "guest" users "browsing," causing all kinds of performance issues. Last night and today I added some stuff requiring some computation via JavaScript and a little bit of user interaction to authenticate the guest user as not-a-bot, and now we have a few hundred. What a mess the internet is. :/ Sigh.
Update Jan 2:
There are (currently) four layers of scraper/bot/DDoS mitigations we are using:
- the latest - when an IP which is not already logged in makes a request, it is checked against a local temporary database. if it has not recently been verified, we verify it by serving a page with some JavaScript that does some brute-force computation and requires a little bit of human interaction (changing the state of some UI in a way that a human would). once those are complete, it sends an AJAX request to the server, which quickly validates the results and, if they pass, updates that temp database. the page then refreshes itself to the updated version (rough sketch of the client side after this list). Before this change, we'd often see more than 100,000 IPs over the course of a day requesting pages, often just one or two each, each time creating a session which our database would track, etc. Now it's down to a more reasonable 1,000-2,000.
- every minute or two, we run a script that analyzes the last few minutes of apache logs to detect likely scrapers. it looks at the patterns of their requests and updates a .htaccess file with RewriteCond lines to match them and redirect them to a 529 page which tells them to chill (example after this list). This sometimes causes false positives, like when people reopen a browser session with 50 or so forum windows.
- (also recent) - if a non-logged-in user requests various obscure pages, e.g. our member list, we refuse (see below). most non-logged-in users would never request those pages, and serving them ends up being quite a lot of CPU load, so it's an easy win. maybe good for the privacy of our users, too?
- sometimes scrapers fill up all of our server slots. even when serving them cheap pages, they can still end up holding a precious slot for a few seconds, and often they'll keepalive more requests, too. so when we find this happening, we run a script which identifies the /24 subnets that are the worst offenders, then adds them to an ipset which we blackhole (sketch below). I sometimes spot-check these with whois to make sure they are cloud providers. When we see 50+ IPs in a /24 (254 usable addresses) all making a lot of requests, it's probably a cloud provider (if our forum had a much, much larger audience, 50+ real users in one /24 might be more plausible).
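Some rough sketches of the above, for the curious. None of this is our exact code. First, the client side of the guest check: the endpoint name, the element id, the CHALLENGE variable, and the SHA-256 brute force below are all just assumptions about one reasonable way to do it.

    // hypothetical client-side guest check: brute-force a nonce whose SHA-256
    // has a couple of leading zero bytes, and only submit after the user has
    // actually interacted with the page. CHALLENGE would be a random string
    // the server embedded in the page.
    async function solveChallenge(challenge, difficulty) {
      const enc = new TextEncoder();
      for (let nonce = 0; ; nonce++) {
        const digest = await crypto.subtle.digest('SHA-256', enc.encode(challenge + nonce));
        const bytes = new Uint8Array(digest);
        let ok = true;
        for (let i = 0; i < difficulty; i++) if (bytes[i] !== 0) ok = false;
        if (ok) return nonce; // found a nonce satisfying the work requirement
      }
    }

    // the human-interaction part: nothing is submitted until the UI is used
    document.getElementById('im-not-a-bot').addEventListener('click', async () => {
      const nonce = await solveChallenge(CHALLENGE, 2);
      const resp = await fetch('/verify_guest.php', {
        method: 'POST', // server re-checks the hash, then marks this IP as verified
        headers: {'Content-Type': 'application/x-www-form-urlencoded'},
        body: 'challenge=' + encodeURIComponent(CHALLENGE) + '&nonce=' + nonce
      });
      if (resp.ok) location.reload(); // reload to get the real page
    });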
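Next, the sort of thing the log-analysis script writes into .htaccess. The IPs and the /chill.html page are placeholders, and the actual 529 status could be set by the chill page itself.

    # written by the log analyzer; example IPs, /chill.html is a placeholder
    RewriteEngine On
    RewriteCond %{REQUEST_URI} !^/chill\.html$
    RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.42$ [OR]
    RewriteCond %{REMOTE_ADDR} ^198\.51\.100\.7$
    RewriteRule ^ /chill.html [L]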
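The member-list refusal can be expressed the same way (the cookie and page names here are just placeholders; a check inside the forum code works equally well):

    # hypothetical: refuse the member list to requests without a login cookie
    RewriteCond %{HTTP_COOKIE} !forum_session=
    RewriteRule ^memberlist\.php$ - [F,L]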
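And the blackholing, roughly. The set name and the subnet-finding script are placeholders; dropping via iptables+ipset is one common way to do it (a null route works too).

    # one-time setup: a hash:net set and a rule that drops anything in it
    ipset -exist create blackhole hash:net
    iptables -I INPUT -m set --match-set blackhole src -j DROP

    # find_bad_subnets.sh is hypothetical: whatever prints offending /24s, one per line
    ./find_bad_subnets.sh /var/log/apache2/access.log | while read net; do
      ipset -exist add blackhole "$net"
    done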
anyway, that's the current state of trying to run a forum with 280,000 threads and a few million posts these days. I'm sure it won't last; they'll start defeating our countermeasures and we'll have to escalate. I really do hate requiring JavaScript, since some people don't want to enable it. sigh.
Posted by Tale on Fri 02 Jan 2026 at 02:01 from 77.170.68.x