• Hello,

    I will try to make this as descriptive as possible, so my chances of getting it resolved are higher.

    I’ve been trying to resolve this for hours now –

    * I attempted to submit my sitemap to Google; however, they came back and said it is being blocked by my robots.txt. There is no robots.txt file on my server.

    * But I called up my host (GoDaddy) for the 3rd time and told them this. They had people looking at it for about 45 minutes. The only thing they could derive is that my .htaccess is causing a robots.txt to be written automatically, thereby blocking Google's access to my site.

    The following is the URL of my sitemap (which you probably won't be able to access): https://www.thetexasreport(dot)com.

    Remember, I do not have a visible robots.txt file anywhere on my server. I've gone through every folder.
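
    (For what it's worth, that "writing automatically" is usually WordPress itself: its rewrite rules send a request for the nonexistent /robots.txt file to index.php, and WordPress generates a robots.txt response on the fly. If the blog's privacy setting is set to block search engines, the generated output is something like:

    User-agent: *
    Disallow: /

    which blocks everything, even though no robots.txt file ever appears on disk.)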

    Can someone please help?

    Thanks in advance,

    Joel

  • Lol… I have been reading, searching, testing, and tweaking my blog for a week
    trying to get this same problem resolved… I just checked the privacy settings and apparently it was set to block all search engines… lol.
    Just resubmitted, and I have a feeling that it will take, since my robots.txt is actually allowing the / directory… lol, what a f****ing joke.
    ***The thing is, I am using the XML sitemap plugin… I am used to building sitemaps old school and tossing one into a directory (a minimal example of one follows below).
    That worked great in the past, but how can I do that with a blog? With the content being dynamic, I would have to refresh it every three days or so, which doesn't seem efficient; but at the same time these plugins seem to be leaking "link juice" with all of the outbound links in them… Any thoughts?
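
    (For the old-school route: a hand-built sitemap is just an XML file following the sitemaps.org protocol, uploaded to the site root. A minimal sketch, with example.com standing in for a real domain:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2010-01-15</lastmod>
        <changefreq>daily</changefreq>
      </url>
    </urlset>

    The plugin's whole value is regenerating this file automatically whenever content changes, which is exactly the part that is hard to keep up with by hand on a dynamic blog.)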

    In other words, you don’t need to *do* anything. Your site is now correct. Google needs to notice that, and you need to wait for it to do so.

    I see what you're saying here, but in Webmaster Tools you should be able to resubmit the sitemap and then not get the error again… Google says it may take a few hours, but every time I submit, it takes 20 minutes or so…
    Therefore it would suck to wait around on the problem for a few days and not have it resolved. Lol, whenever my host doesn't know how to fix a problem, they always tell me to take an action and then wait… lol… Waiting has yet to fix any of the problems.

    raeph

    (@raeph)

    Uff! That thread saved me from going completely crazy…
    Thanks for the clarification, Otto42!

    moongoose

    (@moongoose)

    I am having a similar problem with a client’s site that was indexed fine by Google and suddenly is not being indexed. I tried submitting a sitemap after the problem started, and I received the error:

    Network unreachable: robots.txt unreachable

    However, the robots.txt file that is being autogenerated by WP is fine; nothing is blocked.
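
    (For comparison, when nothing is blocked, the WP-generated file is typically just an allow-all stanza, something like:

    User-agent: *
    Disallow:

    i.e., an empty Disallow, which permits everything.)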

    I suspect that my web host may be blocking some of Google’s IP addresses. Here is a post I found related to this:

    Besides a problem at Google's end, this issue is most commonly caused by hosts blocking one or more of Google's IP addresses. This is why I advise you to contact your host and ask them to check whether they are blocking any of Google's IP addresses. You can find a list of these IPs at the following URLs:

    https://www.webmasterworld.com/forum24/517.htm

    and

    https://www.phpbb-seo.com/en/seo-prin…cle-t2169.html
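
    (If a host is doing that, one telltale would be deny rules in the server config or .htaccess covering Googlebot's address ranges. A hypothetical sketch of the kind of thing to look for, using 66.249.x.x, one of the ranges Googlebot crawls from:

    <Limit GET POST HEAD>
    order allow,deny
    deny from 66.249
    allow from all
    </Limit>

    With rules like these, the site looks fine in a browser while the crawler is refused; a firewall-level block of the same ranges would produce the "unreachable" error outright.)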

    Hey all,

    Any idea how long it might take for the robots.txt status to change in Google's cache?

    I changed the privacy settings about 15 hours ago and Google still thinks it’s been blocked. I would’ve thought they’d check the robots.txt status as you submit the sitemap.

    fyi, my robots.txt is here: https://suklaa.org/robots.txt
    and sitemap I’m trying to submit is here: https://suklaa.org/sitemap.xml

    I'm on MediaTemple, and my other sites don't have this problem, so I don't think they're blocking Google's IPs.

    Thanks!

    Scott Winterroth

    (@countrymusicchicago)

    I'm having the same problem with Google. Everything was great, then boom, it stopped crawling my site.

    I'll throw a few things into the ring here, as I recently had this infuriating problem too.

    Firstly I thought it was a problem with the Google XML Sitemaps plugin, then I suspected Bad Behavior, and finally a server security setting or similar.

    As someone already mentioned, first make sure that your privacy settings within WP-admin have been changed to allow search engines, or you will get nowhere.

    I eventually solved it with a combination of things. Although I'm still not sure exactly why it happened, it's fixed, and at least the crawlers can reach my sites again. Here is what I had to do:

    1) Deleted my robots.txt files from the root (not ideal, but for now they are gone).
    2) Ensured that the option to add the sitemap to robots.txt in the XML Sitemaps plugin was turned off.
    3) Checked my .htaccess files. For some reason there were lines in my .htaccess that I certainly did not add manually; they must have been generated by either a plugin or something I changed via cPanel. I have no idea, but if anyone can suggest the likely culprit I'd love to know. (A sketch of a clean default WordPress .htaccess follows this list, for comparison.) Here is a sample of those lines:

    RewriteEngine On
    # Match any request whose user agent contains a search-engine string (case-insensitive)
    RewriteCond %{HTTP_USER_AGENT} ^.*(bot|urp|msn|google\.|oogle\.|msn\.|live\.com|yahoo\.|altavista\.com|looksmart\.com).* [NC]
    # ...and silently serve those visitors /2.html instead of the page they asked for
    RewriteRule ^(.*) /2.html [NS,NC,L]

    If you have anything similar, delete it. I deleted them all, and they seem to have been the major culprit.

    4) Rebuilt my sitemap manually from the Google XML Sitemaps plugin configuration page.
    5) Resubmitted my sitemaps to Google via Webmaster Tools; it may take a couple of attempts.
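
    As promised in step 3, here is a sketch of what a clean .htaccess looks like on a default WordPress install with permalinks enabled; this is the block WordPress writes itself, so anything beyond it came from a plugin, your host, or a manual edit and deserves scrutiny:

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress

    Incidentally, those last rules are also why robots.txt requests get answered even with no file on disk: /robots.txt does not exist as a file, so the request falls through to index.php and WordPress generates the response.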

    You can check GoogleBot's responses to your site, as well as Yahoo! Slurp's and Bing's, by using the following tools and setting the appropriate user agent:

    https://www.seoconsultants.com/tools/headers/

    and

    https://web-sniffer.net/

    I was getting 500 Internal Server errors when checking the root domain as well as my sitemap and robots.txt files. If you are getting the same, you still have work to do; but once you get a 200 success result, you should be close to getting the crawlers to come back.

    As well as submitting your sitemaps to Google, it is probably a good idea to use the webmaster tools at Bing and Yahoo to make sure their crawlers come back too; there are links to both within the sitemaps plugin. A robots.txt hint works as a safety net as well; see the example below.
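
    One more low-effort safety net (a generic sketch; example.com is a placeholder for your own domain): the standard Sitemap directive in robots.txt lets any crawler discover your sitemap without a manual submission:

    User-agent: *
    Disallow:

    Sitemap: https://www.example.com/sitemap.xml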

    Hope this helps someone out.

    Maurice

  • The topic ‘.htaccess & robots.txt problem’ is closed to new replies.