referral spam from random wordpress blogs
-
Starting today I’ve begun getting what seems to be referral spam from random WordPress blogs. In my stats page the hits show up as links to my site, but they invariably come from archived posts six months old or older, on random blogs that have no reason to link to my site.
I’ve done a fine job of keeping up with the various types of referral and comment spam, but this seems peculiar and I’m not sure how to respond to it.
At this point it’s just a bother… but I’d love to get on top of this before I start getting massive amounts of what seems to be referral spam but isn’t pointing at malicious or ad sites.
-
There’s only one minor problem with that: that’s a viable user-agent.
For instance, my IE displays:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
I don’t think anyone is interested in blocking access to all IE 6.0* users. I’ve tested it and it does block my browser.
Besides that, I believe you need to escape your dots (\.), which by the way doesn’t fix the blocking of IE.
I’m simply going to take care of them based on how they’re calling the page. I have no links on my site that include the string “category_name”; WP handles that within its own generated mod_rewrite rules. It’s just a little tweak in the .htaccess.
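The idea of filtering on how the page is requested can be tried out against a log before touching .htaccess. A minimal sketch, assuming that on a site with rewritten permalinks no legitimate link carries a raw `category_name` query variable; the function name and the sample request paths are made up for illustration and do not come from anyone's real log.

```python
from urllib.parse import urlsplit, parse_qs


def uses_raw_query_var(request_path: str, var: str = "category_name") -> bool:
    """Return True if the request's query string contains the given variable."""
    query = urlsplit(request_path).query
    # keep_blank_values=True so that "category_name=" with no value still counts.
    return var in parse_qs(query, keep_blank_values=True)


# Hypothetical request paths, not taken from a real log.
samples = [
    "/index.php?category_name=general&paged=3",
    "/category/general/page/3/",
    "/index.php?category_name=",
]
flagged = [path for path in samples if uses_raw_query_var(path)]
print(flagged)
```

Here the first and third sample paths are flagged, while the rewritten permalink form is left alone.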
As far as I know,
SetEnvIfNoCase User-Agent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) spammer=yes
should block exactly this user agent value:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
According to my logs, I have no visitors with exactly that user agent. In the end, Whooami is right: always check your access logs for legitimate visitors before using .htaccess to block anything.
One sec... aha.
I know what the problem is with what you have: no quotes. The correct line should be like so:
SetEnvIfNoCase User-Agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" keep_out
It doesn’t seem to matter if the dots are escaped; I make it a habit to do so though.
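For anyone following along, the variants discussed so far behave quite differently as regular expressions. The sketch below uses Python's `re` module as a stand-in for Apache's regex engine, which is an assumption: the engines are not identical, but grouping parentheses, unescaped dots and anchors work the same way in both. It also assumes Apache splits an unquoted directive on whitespace, so that only `Mozilla/4.0` is treated as the pattern in the unquoted line. The dictionary keys are labels made up for the demo, and the third pattern is an illustration rather than something posted in the thread.

```python
import re

# User-agent sent by the spam hits, and the legitimate IE 6 string quoted earlier.
SPAM_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
REAL_IE_UA = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; "
              ".NET CLR 1.0.3705; .NET CLR 1.1.4322)")

# Labels are hypothetical; the patterns mirror the variants under discussion.
PATTERNS = {
    # Unquoted directive: assumed to be cut at the first space.
    "unquoted": r"Mozilla/4.0",
    # Quoted, but the parentheses are still regex grouping, not literal characters.
    "quoted": r"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    # Parentheses and dots escaped, anchored at both ends for an exact match.
    "escaped": r"^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)$",
}


def matches(pattern: str, user_agent: str) -> bool:
    """Unanchored, case-insensitive search, similar in spirit to SetEnvIfNoCase."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None


# Map each label to (matches the spam UA, matches the real IE 6 UA).
results = {
    name: (matches(p, SPAM_UA), matches(p, REAL_IE_UA))
    for name, p in PATTERNS.items()
}

for name, (hits_spam, hits_real) in results.items():
    print(f"{name:8} spam UA: {hits_spam!s:5}  real IE 6: {hits_real}")
```

Under those assumptions, the unquoted form matches both strings, the quoted but unescaped form matches neither of them, and only the escaped, anchored form singles out the bare spam string.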
One interesting issue I have noticed is that nowadays many comment spams also carry a payload of referrer spam in the same HTTP GET request. So it is a double spam load, reducing their bandwidth usage. It is actually a blessing in disguise for us.
I realized that blocking the referrer spammers aggressively (my referrer spammer blacklist) actually gets rid of most comment and trackback spammers!
There is another trend I have noticed in comment spamming. First there is a meaningless, yet harmless comment in the blog like “Hi” or “Good article”, which most bloggers approve. Then a deluge of spam comes in from the same source, which directly passes through because one comment from the same author has already been approved before.
Thanks for catching that, Whooami.
angsuman, off topic, heh: I used to have a Linux server sitting on a T1 in my other room. I did small-time shell and web site hosting, mainly for friends. One of the funnest things I ever did was to redirect all the Code Red requests (remember that? default.ida?) back to microsoft.com. In fact, to this day I still do that if I see anything in my logs that resembles something sketchy aimed at IIS.
They welcome the traffic.
On another note: in the last hour I’ve seen 2 more of the same hits, with 2 different referers. I’ll clearly know by tomorrow whether or not this helps (for the time being at least).
Now that this whole thing has been analysed to death, I have one simple question;
What’s the sense?
All of this time and energy wasted because the internet is a wild west of people who are thoughtless and greedy. What’s the sense of even having a site when most of your effort goes towards ensuring that your back door is locked from intruders? Gone are the days when this was fun and exciting. Now it’s drudgery and it leads reasonable people into situations where negative energy overwhelms the spirit of community. The spammers have successfully turned us on ourselves, and while we bicker, they proliferate their filth and drive up the cost of being part of what many consider to be humankind’s crowning achievement so far.
I am no longer willing to pay the price.
well anyway…
Now that I’ve gotten a hit in my log from, of all places, alexking.org, I tested the code again to make sure it was actually blocking something (before, I had only checked that it wasn’t blocking something it shouldn’t), and in fact it wasn’t.
Here are the 2 correct ways to block the user-agent I described one page back (if it happens to come across your site).
No mod_rewrite, or just don’t like to use it? Use macmanx’s way above, but with this line:
SetEnvIfNoCase User-Agent ".*(compatible; MSIE 6.0; Windows NT 5.1)" spammer_yes
Want to use mod_rewrite:
RewriteCond %{HTTP_USER_AGENT} "Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)" [NC]
RewriteRule ^(.*) https://%{REMOTE_ADDR}/ [R=301,L]
Personally, I prefer the mod_rewrite way, since the other way actually shows them a page, whereas this mod_rewrite rule redirects them back to the proxy IP (perhaps putting a glitch in their crawling).
nm, again I respectfully submit that everyone is free to do what they like regarding any kind of spam. It seems to me there is already plenty of energy on this forum dedicated to spam, spam plugins, trackback spam, comment spam, you-name-it spam, so surely one more thread won’t be the end of civilization?
It isn’t anyone here’s fault that this is “drudgery” and no longer “fun and exciting” for you, I guess?
For the record, it’s been shown that, statistically speaking, very few “people” are responsible for the vast majority of spam on the ’net. They are outnumbered by us, if that’s any consolation.
“What’s the sense of even having a site when most of your effort goes towards ensuring that your back door is locked from intruders?”
Firstly, there are entire sites dedicated to nothing else but combatting spam. Surely these people must derive some fun from it?
Secondly, it’s not really a case of “most” of the effort; it’s a little bit of effort, at least for me, with a lot of reward.
Third, do you leave your front door unlocked when you go to bed at night? I’m guessing not.
The entries above are testable using wannabrowser.com or Firefox’s user-agent switcher, btw.
And here is another awesome tool for creating kickass .htaccess files:
https://joseluis.pellicer.org/ua/configure.html
A Codex search comes up with nothing for ‘spam’ that also mentions referer (could be wrong). There would seem to be a need, so unless someone else feels like jumping in, I’ll try to collate some information there so we have somewhere to point people.
I’ve a ton still to learn about this though, so expect to have to edit anything I do slot together.
If a bot can generate enough processes that it upsets the host, then a blog is in trouble.
If a bot can generate so much traffic that it breaches your bandwidth limit, then a blog is in trouble.
Blog in trouble = costs owner money = posts here.
Let’s head ’em off.
4. They all send the same user-agent [ “Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)” ]
Thanks for that track-down, Whooami. I’ve gotten a slew of these over the past couple of days, all illegitimate.
CG-Referrer does a good job for me (and a few other sites) at stemming the referrer-spam tide, at least when there’s actual detectable spam to be blocked.
If it’s blacklisted words (I’ve got a long blacklist), it catches it and stops further page processing. I’ve caught about 100 spammer-attempts per day (none getting through to the best of my knowledge) without htaccess mods — and that also means that CG-Referrer can track and display ‘blocked’ accesses.
It also supports an IP block table, and a UserAgent table (which has a few known bot-agents). It’d be pretty simple to add the above user-agent string to my UA table if it was guaranteed to never be anything other than a bot…
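To make the blacklist and user-agent table idea concrete, here is a rough sketch of that kind of check. It is not CG-Referrer's code: the function name, the blacklist words and the example referrers are all hypothetical, and only the user-agent string comes from this thread.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries; a real list would be much longer.
BLACKLIST_WORDS = ("casino", "poker", "pills")
# The user-agent string reported in this thread.
BLOCKED_AGENTS = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",)


def should_block(referrer: str, user_agent: str) -> bool:
    """Return True if the referrer contains a blacklisted word or the agent is listed."""
    ref = urlsplit(referrer.lower())
    haystack = ref.netloc + ref.path
    if any(word in haystack for word in BLACKLIST_WORDS):
        return True
    # Exact string comparison, so longer real-browser strings are not caught.
    return user_agent in BLOCKED_AGENTS


# Made-up referrer on a made-up domain: caught by the word list.
print(should_block("http://cheap-pills.example/", "SomeBrowser/1.0"))
# Legitimate-looking blog referrer: only caught if the agent is in the table.
legit_ref = "https://www.alexking.org/blog/2003/09/09/arches-national-park/"
print(should_block(legit_ref, "SomeBrowser/1.0"))
print(should_block(legit_ref, BLOCKED_AGENTS[0]))
```

The last two calls show the trade-off raised above: a word list alone cannot catch a referrer that points at a legitimate blog, so the user-agent entry does the work, and it is only safe if that exact string never comes from a real visitor.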
I keep going back to the root Q: is there actually a legal/procedural way to track down and stop some of these guys? I’ve done whois lookups on the last dozen or so spam domains, they all link back to the same fake support domain contact…
-d
david,
sure you have legal avenues; the trick is to have something besides a proxy IP and a legitimate domain running WP to blame
here, right out of my Apache logs :)
80.58.11.107 - - [16/May/2005:00:47:59 -0700] "GET /index.php?year=2004&monthnum=12&day=&name=&page=&paged=12 HTTP/1.0" 200 47595 "https://www.alexking.org/blog/2003/09/09/arches-national-park/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Not a lot you can do legally about something like that, I s’pose.
Still learning here….
70.85.148.194 - - [07/May/2005:12:26:46 -0400] "GET /T2/cosmos.php HTTP/1.0" 200 0 "-" "-"
IP resolves here: https://www.dnsstuff.com/tools/whois.ch?ip=70.85.148.194
But no UA ?
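Checking the access log before blocking anything can be partly automated. Below is a small sketch that splits lines in Apache's combined log format into fields and picks out hits with the suspect user-agent. It assumes the log really is in combined format and uses only the two log lines quoted in this thread as sample data, typed with plain ASCII quotes and hyphens; the variable names are made up.

```python
import re
from collections import Counter

# Combined log format: host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

SPAM_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

# The two log lines quoted in this thread.
lines = [
    '80.58.11.107 - - [16/May/2005:00:47:59 -0700] '
    '"GET /index.php?year=2004&monthnum=12&day=&name=&page=&paged=12 HTTP/1.0" '
    '200 47595 "https://www.alexking.org/blog/2003/09/09/arches-national-park/" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '70.85.148.194 - - [07/May/2005:12:26:46 -0400] '
    '"GET /T2/cosmos.php HTTP/1.0" 200 0 "-" "-"',
]

# Keep only the lines that parse; anything malformed is skipped.
entries = [m.groupdict() for m in map(LOG_LINE.match, lines) if m]

# How often each user-agent appears; "-" means the client sent none.
agent_counts = Counter(e["agent"] for e in entries)

# Hits whose user-agent is exactly the suspect string.
suspects = [e for e in entries if e["agent"] == SPAM_UA]

for e in suspects:
    print(e["host"], e["referer"])
```

Run against a full log, `agent_counts` would also show whether any real visitors send exactly the same string, which is the check to make before adding a block.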
i’ve noticed this referrer spam in my web logs all of a sudden as well.
in some cases it’s coming from people’s WP sites that are well known and respected here in the forums, so i don’t believe it’s them actually spamming me.
Same as above, this is not visible to anyone but me in my awstats logs, but it’s kind of concerning that it’s only WP blogs that are showing up as spammed entries.
can .htaccess block this type of spam WITHOUT actually blocking legit referrers in my logs from those sites? (as the sites are legit, just the entries are not).
Thanks.
They have port 8080 open for a proxy, but I think that’s where you are hosted? Or maybe not; I’ve just recently seen them mentioned on here somewhere.
it resolves to tau.asmallorange.com ..
- The topic ‘referral spam from random wordpress blogs’ is closed to new replies.