• Hi.

    I have installed my blog (on my own server, MAMP, on my Mac).
    Yesterday I added it to Google Webmaster Tools. I woke up today to multiple errors about the robots.txt file not being accessible. I tested, and I could not access the file either. I have tried disabling all of my plugins (including All In One SEO Pack, Google Analytics for WordPress, Google XML Sitemaps and Maintenance Mode), but that did not work.

    I’m pretty lost right now and haven’t got any idea what to do.
    I know WordPress generates a virtual robots.txt file, but I should be able to access it through my browser, shouldn’t I?

    Thanks.

Viewing 15 replies - 16 through 30 (of 33 total)
  • I mean: I don’t think robots will completely follow the rules set in a file inside a subfolder, since they follow what’s in the file at the root. But it’s better to avoid conflicts if you are going to have both files. So, whatever rule you choose, use the same one in the root and in the directory (just adapt the path, of course: in the root, put /blog/ before everything; inside the blog folder it’s evidently not needed).

    Yes, “Disallow: /” will disallow the whole thing. Leaving it blank (“Disallow:”) will allow the whole thing. That’s your choice; you can set whichever rule you prefer, just like you did in the robots.txt placed in your domain root.
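    You can sanity-check these two rules with Python’s standard urllib.robotparser, which implements the same matching logic most well-behaved crawlers use (the URL and bot name below are just placeholders):

```python
from urllib import robotparser

# "Disallow: /" forbids everything for the matching user agents.
block_all = robotparser.RobotFileParser()
block_all.parse(["User-agent: *", "Disallow: /"])
print(block_all.can_fetch("AnyBot", "https://example.com/blog/post"))  # False

# An empty "Disallow:" allows everything.
allow_all = robotparser.RobotFileParser()
allow_all.parse(["User-agent: *", "Disallow:"])
print(allow_all.can_fetch("AnyBot", "https://example.com/blog/post"))  # True
```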

    Moderator t-p

    (@t-p)

    thanks vangrog,

    would placing this:

    User-agent: *
    Disallow: /blog/wp-config.php
    Disallow: /blog/wp-admin
    Disallow: /blog/wp-includes
    Disallow: /blog/wp-content

    in my robots.txt (which is in my domain root) disallow the whole thing as far as the blog is concerned?
    Thanks for your continued guidance and help

    If you want to understand better how to set rules, read this:

    https://www.google.com/bot.html

    And just remember that when you set rules for a specific bot (user agent), that’s the ruleset it will follow. It will respect the rules you established for it and ignore the general rules (the ones under “User-agent: *”).
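    To see how a bot-specific group overrides the general one, here is a small sketch using Python’s urllib.robotparser (the bot names and paths are made up):

```python
from urllib import robotparser

rules = [
    "User-agent: Googlebot",
    "Disallow:",            # Googlebot's own group: everything allowed
    "",
    "User-agent: *",
    "Disallow: /blog/",     # general rule for every other bot
]
rp = robotparser.RobotFileParser()
rp.parse(rules)

# Googlebot follows only its own group and ignores the general rules.
print(rp.can_fetch("Googlebot", "/blog/post"))     # True
# A bot without its own group falls back to "User-agent: *".
print(rp.can_fetch("SomeOtherBot", "/blog/post"))  # False
```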

    Cya

    The rule you wrote above reads like this:

    wp-config is in your root, and it’s forbidden (in any case, that file should be protected with .htaccess; adding it to robots.txt is pointless)

    these folders are also forbidden:
    root/gurblog/wp-admin
    root/gurblog/wp-includes
    root/gurblog/wp-content

    All the rest is allowed.
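    You can check what that ruleset does with Python’s urllib.robotparser (the paths are the ones from the robots.txt quoted above):

```python
from urllib import robotparser

rules = [
    "User-agent: *",
    "Disallow: /blog/wp-config.php",
    "Disallow: /blog/wp-admin",
    "Disallow: /blog/wp-includes",
    "Disallow: /blog/wp-content",
]
rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "/blog/wp-admin/"))         # False: core folder blocked
print(rp.can_fetch("*", "/blog/wp-content/a.jpg"))  # False: themes/uploads blocked
print(rp.can_fetch("*", "/blog/"))                  # True: posts and pages crawlable
```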

    Moderator t-p

    (@t-p)

    thanks vangrog,

    Great help! With your help, I am beginning to understand a bit.

    Last question before I let you go. Having this in my robots.txt:

    User-agent: *
    Disallow: /blog/wp-

    1) Is it going to disallow the whole blog, including indexing of my blog by the search engines?

    Thanks

    Forget the “wp-” part.

    Use this: Disallow: /blog/

    That’ll disallow the whole blog, yes: it forbids access to any subfolder and any file inside it and, this way, forbids indexing as well (at least for robots which respect robots.txt; remember that there are many bad bots around, and for those the only way is to block them in your .htaccess).
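    A quick check of that single rule with Python’s urllib.robotparser (the paths below are placeholders):

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /blog/"])

# Everything under /blog/ is off limits...
print(rp.can_fetch("*", "/blog/2010/04/some-post/"))  # False
# ...while the rest of the domain stays crawlable.
print(rp.can_fetch("*", "/about.html"))               # True
```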

    Cheers and cya

    Thread Starter emmern

    (@emmern)

    My problem is that I don’t have a virtual file.
    There is no robots.txt file in the root directory of my domain either.

    Thread Starter emmern

    (@emmern)

    Anyone?

    Virtual file, as the name suggests, is virtual. It doesn’t really exist, but WP will create it when it’s requested.

    A real robots.txt will only be on your host if you create it yourself, manually, and then upload it. In that case, the real file will take precedence over the virtual one.

    Cheers

    Thread Starter emmern

    (@emmern)

    Quote:

    My problem is that I don’t have a virtual file.
    There is no robots.txt file in the root directory of my domain either.

    :-p

    Thread Starter emmern

    (@emmern)

    Still not solved. Isn’t there anyone who might have the slightest idea where this comes from?

    Moderator t-p

    (@t-p)

    The way I located my virtual file was to call it in my browser like this:
    https://www.mysite.com/blog/robots.txt

    Then I see this in my browser:
    User-agent: *
    Disallow:

    Hope this helps you locate your file.

    I just recently noticed one thing with WordPress (using the latest 2.9.2; also tried on lower versions, on standard WordPress and WordPress MU).

    I’m currently building a site with only pages (no posts). When visiting the /robots.txt URL on my blog, I receive a 404 Not Found error instead of a virtual robots.txt file. I could not understand why some blogs I’ve built had a virtual robots.txt and some had not. Because I’m using WordPress MU, I created a global mu-plugin that writes lines to every blog’s virtual robots.txt file automatically when it is requested. This let me add my own lines to the virtual robots.txt file.

    When I later added a public post to the blog with only pages, WordPress suddenly decided to activate the virtual robots.txt file. If I delete the posts, the virtual robots.txt disappears again. If I make the posts private, the virtual robots.txt file is only available to me as admin.

    This must clearly be a bug in WordPress that no one has resolved yet. I find it weird that no one has reported this issue.

    Currently the only ways around this issue that I know of are to manually add a robots.txt file to your root folder, to install a plugin called KB Robots.txt, or to add one post to your blog.

    Hope this helps you, m8,
    Cheers

    @emmern: As the robots.txt WordPress generates is VIRTUAL, you won’t find a “physical” robots.txt file in your /blog/ directory when you look for it. WordPress generates it on the fly, in the same way that it resolves pretty permalinks and many other URL rewrites on your blog.

    But what I want to know is how to stop WordPress from creating a virtual robots.txt file at all. I also have WP installed in a subdirectory, so having it generate a virtual robots.txt file is of no use to me.

    Any info on that?

  • The topic ‘No robots.txt file’ is closed to new replies.