| |
Count Zero
Registered: Jan 2003 Posts: 1926 |
Codebase64 mirroring idiots?
I know, I know - the Internet is not available everywhere, but is that really a reason to crawl codebase64.org with a stupid webcrawler hitting each and every URL it sees?
Frantic and I agreed to add a plugin which will allow an HTML-ized download for on-the-road reading, and we of course hope that intensified reading leads to more submissions! (Stay tuned for that!)
MEANWHILE we are close to banning _ALL OF POLAND_ via IP rules and denying any access to codebase64.org and other sites hosted on the same server.
HOW SICK ARE YOU, generating 60 GB of traffic for what, judging from the logs, is a mirroring attempt that will result in an unbrowsable data blob?
Oh, can't be wasting our bandwidth with such a silly connection, but keeping the shit running for over a week shows some dedication at least. If the initiator is reading this: quit it or deal with the consequences. We are open-minded and surely give more than we take - you are abusing it.
(BTW, codebase64 generates about 8 GB per month without stupid mirroring attempts, as a comparison to the number given above. We'd like to keep reliability and speed, so please report any problems to either Frantic or me) |
|
| |
lft
Registered: Jul 2007 Posts: 369 |
Um, I get that kind of traffic on my site as well. I don't think it's a C64 scener doing it. More like a dumb crawler looking all over the internet for email addresses, unprotected forums, security holes etc. Possibly running from hacked machines in a botnet.
Here's an idea: Add some honeypot links (e.g. black-on-black text saying "Click here to rate-limit my IP"). When they are followed, add rate-limiting rules to your configuration, perhaps with a timeout of a few days. Put "nofollow" on the links to prevent proper search engine crawlers from getting trapped. |
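A minimal sketch of the honeypot idea above, in Python. The trap path, the three-day timeout, and the `HoneypotLimiter` class are all hypothetical names and values chosen for illustration; nothing here is codebase64's or lft's actual setup. It is also simplified: a trapped IP is refused outright until the timeout expires rather than being throttled.

```python
import time

HONEYPOT_PATH = "/rate-limit-me"   # hypothetical trap URL, an assumption
BLOCK_SECONDS = 3 * 24 * 3600      # "a few days"; the exact value is an assumption


class HoneypotLimiter:
    """Remembers which IPs followed the trap link, with an expiry time."""

    def __init__(self, block_seconds=BLOCK_SECONDS, clock=time.time):
        self.block_seconds = block_seconds
        self.clock = clock            # injectable so the timeout can be tested
        self.blocked_until = {}       # ip -> timestamp when the block expires

    def record_request(self, ip, path):
        """Return True if the request may be served, False if it is limited."""
        now = self.clock()
        expiry = self.blocked_until.get(ip)
        if expiry is not None and expiry <= now:
            # The timeout has passed: forget the old block.
            del self.blocked_until[ip]
            expiry = None
        if path == HONEYPOT_PATH:
            # Following the trap link (re)starts the block for this IP.
            self.blocked_until[ip] = now + self.block_seconds
            return False
        return expiry is None


def honeypot_link(path=HONEYPOT_PATH):
    """Black-on-black link, marked nofollow so proper crawlers skip it."""
    return (
        f'<a href="{path}" rel="nofollow" '
        'style="color:#000;background:#000">'
        "Click here to rate-limit my IP</a>"
    )
```

In a real deployment the block would more likely be enforced by the web server or firewall than in application code; the class only shows the bookkeeping of trap hits and timeouts.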
| |
wacek
Registered: Nov 2007 Posts: 513 |
Well,
- I have done it before, but it was June 2011, and let me just check... yes, I turned it off then, so it cannot be me ;)
- please add the download button, I for one would really appreciate it, as I use the offline copy all the time. |
| |
Count Zero
Registered: Jan 2003 Posts: 1926 |
lft: all the traffic is coming from 2-3 ISPs and about 10-15 dynamic IPs in Poland, and due to the nature of dokuwiki redirecting all requests through its main page there are few possibilities to filter. The main thing I wonder about is the different ISPs - could it be a very, very limited botnet with just about 1 MBit of bandwidth? Nah.
That leads me to apache .htaccess or more drastic blocking methods, as neither Frantic nor I are in the mood to adjust IP filters on a daily basis.
ALL other proper search engine crawlers won't follow all the links, but they can handle and extract the real content just perfectly. This thing tries to obfuscate itself by masquerading as Firefox 38 (all the time, it seems - any stupid script kiddie has a list of UA strings...).
Let's just hope the responsible person is reading this and just stops it, so we get back to a normal traffic level and can offer all real 64 ppl proper services. (I bet there is hardly anyone copy/pasting from codebase as much as I do, though :) ).
*IF* you ppl come across dokuwiki plugins which may improve the site experience, let us know! Just like on the rr.pokefinder.org mediawiki installation, we are looking for anything which may improve the user experience.
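For anyone wanting to see which addresses account for the traffic, here is a rough Python sketch that sums response bytes per client IP. It assumes the server writes Apache's common/combined log format; the thread does not say which format is in use, and `bytes_per_ip` is a hypothetical helper name.

```python
import re
from collections import Counter

# Leading fields of Apache's common/combined log format (an assumption):
# host ident user [time] "request" status bytes ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (\d+|-)')


def bytes_per_ip(lines):
    """Sum the response sizes per client IP, skipping lines that do not parse."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        ip, size = match.groups()
        if size != "-":              # "-" means no response body was sent
            totals[ip] += int(size)
    return totals


def top_talkers(lines, count=15):
    """Return the `count` IPs that received the most bytes, largest first."""
    return bytes_per_ip(lines).most_common(count)
```

Calling `top_talkers(open("access.log"))` on a log file (the file name is just a placeholder) would list the heaviest clients, which is the kind of list an IP filter would start from.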
We'll be experimenting with a few export plugins in the days to come to see which is best. Likely something allowing local HTML browsing then. Yell now if you prefer some wicked .chm or alike :) |
| |
Conjuror
Registered: Aug 2004 Posts: 168 |
I'd be happy if it was login only past the front page. This is too valuable a resource to be abused.
It's not like we need Google to search the content, and it's only used by a small community anyway. |
| |
Moloch
Registered: Jan 2002 Posts: 2925 |
Abuse of Codebase64 is nothing new; in the past seven years of hosting it was slammed continually by leechers. I had to block various IPs and websites for abuse of bandwidth or too frequent connections. |
| |
Pex Mahoney Tufvesson
Registered: Sep 2003 Posts: 52 |
I've implemented the same thing as lft above for my production websites. A "blank" link like <a href="info.php"></a> ... something a human could not click on, but a spider/script would follow anytime. And -boom- they're trapped in an IP block list, and gone. :)
---
Have a noise night!
http://mahoney.c64.org |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
" I don't think it's a C64 scener doing it"
maybe it's some1 determined to be a coding god, reading 60gb each week :)) |
| |
Bitbreaker
Registered: Oct 2002 Posts: 504 |
Best move all content over to a facebook group *duck* |
| |
Conjuror
Registered: Aug 2004 Posts: 168 |
And there you'll find some interesting coding discussions and mini compos going on atm :p |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
which group? I'm seriously starving for coder discussions |