  • Yes, I’d like to know the same… feeds are fine for RSS readers, but for Google, aren’t the regular .php pages enough?

    Anyone?

    Thread Starter hostwolf

    (@hostwolf)

    Well, basically the answer I came up with myself is to add this to robots.txt:

    Disallow: /blog/feed/
    Disallow: /blog/comments/feed/
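
    (Side note: Disallow lines only take effect inside a User-Agent group, so a minimal complete robots.txt built around them, assuming the blog lives under /blog/ as in my case, would be:)

    User-Agent: *
    Disallow: /blog/feed/
    Disallow: /blog/comments/feed/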

    But there are still other spots for spiders to access your feeds. I am going to post about removing them.

    Yes, what you suggest would work, but would it also work for the feeds WP automatically generates for every single post and every comment thread?…

    URLs indexed by Google like:
    https://www.domain.com/wordpress/2007/01/15/title-of-post/feed/

    ???

    Thread Starter hostwolf

    (@hostwolf)

    That’s what I meant by

    But there are still other spots for spiders to access your feeds. I am going to post about removing them.

    Those are the other spots where spiders can access the feeds. I think one would need to remove these in the PHP source code, but I’m not sure whether that would cause trouble with upgrading in the future.

    I can’t see how to solve this solely with robots.txt
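
    If someone wants to try the PHP route without touching core files, here is a minimal sketch, assuming a WordPress version that prints the feed <link> tags through the feed_links and feed_links_extra actions (check your own wp_head output before relying on this). Unhooking them from the theme keeps the per-post and per-comment feed URLs out of the page head, and because it lives in the theme rather than in core it should survive upgrades:

    <?php
    // functions.php sketch: stop WordPress from advertising the auto-generated
    // feed URLs in the page head. Assumes the tags are added by the
    // feed_links / feed_links_extra actions at their default priorities.
    remove_action( 'wp_head', 'feed_links', 2 );       // main post and comment feeds
    remove_action( 'wp_head', 'feed_links_extra', 3 ); // per-post, per-category etc. feeds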

    Thread Starter hostwolf

    (@hostwolf)

    BTW, the example URLs you posted are the exact URLs I want to get out of the search engines. They are nothing but pollution.

    Yes, I think so, too. The feed example I posted contains the SAME text as the posts, so why should Google index it?

    “I can’t see how to solve this solely with robots.txt”

    I can’t either, but I would like to… Maybe a rule could be created that filters out everything ending with “/feed”?

    Anyone have any ideas on that?

    Thread Starter hostwolf

    (@hostwolf)

    Meanwhile I found out that this can actually be done. It doesn’t conform to the robots.txt standard, but Google and Slurp accept it, so that’s as good as it gets.

    I just implemented:

    User-Agent: *
    Disallow: */feed/
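
    For what it’s worth, the same wildcard support in Google and Slurp also accepts a pattern written with a leading slash, which stays closer to the convention that robots.txt paths start with “/”, and it still matches the per-post feed URLs shown earlier in this thread:

    User-Agent: *
    Disallow: /*/feed/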

    Thx, great, nice tip!

    How do you know it’ll work?

    I have robots.txt and may implement it today or tomorrow…

    (Would be nice also for others to know…)

    If you’re sure it’ll work, we could mark this as “resolved” and help others as well, right?

    Thread Starter hostwolf

    (@hostwolf)

    There is plenty of evidence that this works if you read these SERPs. People at SitePoint also confirmed this.

    Thx! I’ll implement this tomorrow, then, and will check what will happen…

    PS Going to read them now… the SERPs, of course :)

    Just implemented this trick:

    User-Agent: *
    Disallow: */feed/

    …and now waiting to see what’ll happen:)

    I just saw this and wonder, why would you do this?

    The * means any user agent. That also includes bots that access your feed. Wouldn’t it mean that the Technorati bot couldn’t read your feed? What about other sites that read your feed to see what has changed?

    Why not just specifically name googlebot and slurp by name?
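
    If one went that route, a sketch would be separate groups for just the two search crawlers, leaving every other user agent unrestricted. A crawler follows the most specific User-Agent group that matches it, so the catch-all group below no longer applies to Googlebot or Slurp:

    # Keep the feed URLs out of the two big search indexes only.
    User-Agent: Googlebot
    Disallow: */feed/

    User-Agent: Slurp
    Disallow: */feed/

    # Everything else (Technorati, feed readers, etc.) keeps full access.
    User-Agent: *
    Disallow:

    That said, some services that fetch a subscribed feed directly don’t consult robots.txt at all, so the wildcard rule for all user agents may not lock out as much as it looks like, though behaviour varies from service to service.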

  • The topic ‘Prevent indexing of feed pages?’ is closed to new replies.