• Hi,

    Google Search Console detected some issues in the sitemap generated by the XML Sitemap & Google News feeds plugin:

    Sitemap issues

    Here are the details of those issues:

    Sitemap issues details

    I have some questions:

    1. Why is Google showing me URLs in the WordPress default format (URL: /?p=367882) instead of the permalink format that I’ve actually set on my website (<year>/<month>/<post slug>)?
    2. Assuming that the first error (Invalid publication label) depends on the second (Missing publication tag), why doesn’t the plugin include this publication tag in the sitemap?
    3. The last error (Unresolved news source) sounds as if our website is not in the Google News database; that would be very strange because we actually appear there (I can see our posts with a site: search on the Google News homepage).

    Thank you very much for any suggestions,

    Marco

    PS: If you need the link to the website, let me know; I have to ask the owner for permission.

Viewing 8 replies - 1 through 8 (of 8 total)
  • Hi, about the first 2 points: I’d need to see the sitemap live. Can you share a link?

    About point 3, did you submit your site as a news source? If not, please follow the official steps to apply at Google News via https://partnerdash.google.com/partnerdash/d/news#p:id=pfehome
    (but only after resolving the other errors!)

    Thread Starter Marco Panichi

    (@marcopanichi)

    Hi RavanH, thank you for the kind response.

    The site is https://www.lanostratv.it/ ; the news sitemap is located at https://www.lanostratv.it/sitemap-news.xml as usual.

    About point 3, the website appears to be correctly included in Google News, as I can see from the Google News dashboard:

    It’s all very strange because none of the errors that seem to be reported in GWT can be explained. And I can see none of them on the actual news sitemap https://www.lanostratv.it/sitemap-news.xml … if you open the news sitemap in a browser and then view the source (Ctrl+U) you will see no URLs like ?p=xxx and every news entry has the required <publication> tag.

    It is as if GWT is looking at a completely different source. Or even a completely different website. Could you make sure?

    Maybe use the “View as Googlebot” function to check whether the server sends a different response to Google?
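The manual "view source" check described above can also be scripted. The following is only a minimal sketch: it assumes the sitemap uses the standard sitemaps.org and Google News namespaces, and the function name `audit_news_sitemap` is made up for illustration. It takes the sitemap source as text and lists entries whose URL is in the plain `?p=` form, plus entries that lack a `<news:publication>` tag.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

# Standard namespaces for sitemaps and Google News sitemap extensions.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
}


def audit_news_sitemap(xml_text):
    """Return (plain_urls, missing_publication) for a news sitemap.

    plain_urls: <loc> values that carry a ?p=... query (default permalinks).
    missing_publication: <loc> values whose entry has no <news:publication>.
    """
    root = ET.fromstring(xml_text)
    plain_urls = []
    missing_publication = []
    for url in root.findall("sm:url", NS):
        loc = (url.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        # A "p" query parameter is the WordPress default (non-permalink) form.
        if "p" in parse_qs(urlparse(loc).query):
            plain_urls.append(loc)
        if url.find("news:news/news:publication", NS) is None:
            missing_publication.append(loc)
    return plain_urls, missing_publication
```

If both lists come back empty for the source saved from the browser, the sitemap as served to that browser does not contain the problems Search Console reports.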

    Thread Starter Marco Panichi

    (@marcopanichi)

    Wait a moment: if Google fetches the exact URL (?p=123) without following redirects, then the errors are plausible. View as Googlebot does in fact return an error if we use the exact URL:

    But if Google News is able to follow redirects, then the rendered page is almost perfect (I have some resources blocked by robots.txt, but no critical problems):

    I’ve also looked at the Google guidelines about redirects:
    https://support.google.com/news/publisher/answer/93983?hl=en
    This rule made me a little suspicious:
    “Make sure you don’t use &ID= as a parameter in your URLs”
    but I don’t think it is related to the detected issues.

    PS: I’ve opened a similar post on the Google News Support Forum:
    https://productforums.google.com/forum/#!topic/news/Us3MMbuqiaM;context-place=forum/news

    No, I mean really look at the news sitemap source. It has NO URLs like /?p=xxx there at all. The error in your first post says there are URLs like that, but there are not. And there ARE <publication> tags for every entry there, but the Google Search Console error says there are not… It’s as if Google Search Console is looking at a completely different news sitemap.

    And the error about your site not being included in Google News? That’s just too weird since you show in the next post that it IS included… How can that be?

    So what I was asking is to check what Google sees when you enter https://www.lanostratv.it/sitemap-news.xml in View as Googlebot. Does it show the same source as can be seen in the browser (right-click, View source…)?
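A rough way to approximate that comparison from your own machine is sketched below. This is an assumption-laden sketch, not a replacement for View as Googlebot: sending Googlebot's user-agent string is not the same as being Googlebot, since a server or firewall may also look at the requesting IP. The helper names `fetch` and `fingerprint` are made up for this example. HTML/XML comments are stripped before hashing so that a cache footer with a changing timestamp does not make two otherwise identical sitemaps look different.

```python
import hashlib
import re
import urllib.request

# Published Googlebot user-agent string; BROWSER_UA is just a generic stand-in.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0"


def fetch(url, user_agent):
    """Download a URL with the given User-Agent header and return the raw bytes."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def fingerprint(body):
    """Hash a response body, ignoring <!-- ... --> comments and outer whitespace."""
    stripped = re.sub(rb"<!--.*?-->", b"", body, flags=re.DOTALL)
    return hashlib.sha256(stripped.strip()).hexdigest()


def same_for_both_agents(url):
    """True if the URL returns the same content for both user agents."""
    return fingerprint(fetch(url, GOOGLEBOT_UA)) == fingerprint(fetch(url, BROWSER_UA))
```

Calling `same_for_both_agents("https://www.lanostratv.it/sitemap-news.xml")` would show whether the user agent alone changes the response; a new post published between the two requests would also produce a mismatch, so a `False` result is worth repeating before drawing conclusions.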

    Thread Starter Marco Panichi

    (@marcopanichi)

    Hi RavanH, thank you again for your support.

    I want to reassure you that the sitemap errors (first post) relate to the website I linked afterwards (second post):

    (see the website link in the top-right corner)

    Regarding the View as Googlebot test, I had never used it to fetch a sitemap! Thank you for the suggestion. Doing that revealed no problems, “unfortunately”:

    Looking at the code, however, I realized that even the sitemap is cached by W3 Total Cache:

    <!-- Performance optimized by W3 Total Cache. Learn more: https://www.w3-edge.com/products/
    
    Object Caching 2544/3032 objects using apc
    Page Caching using disk: enhanced (User is logged in)
    
     Served from: www.lanostratv.it @ 2017-01-11 22:32:02 by W3 Total Cache -->

    Could this cause some kind of problem, in your opinion? I’m very wary of turning the cache plugin on and off because it is a large website with lots of traffic and I haven’t done any tests yet.

    It might indeed be a good idea to test a non-cached news sitemap.

    The cache message at the end there should not cause a problem; the caching itself, however, might. You’ll have to make sure the news sitemap is purged from the cache after each new post, or exclude it entirely in the W3TC exclude URL settings.

    But again, how can it be that these errors do not match the real sitemap source? Take a look at that new error about a wrong date being used on line 24! It says there is a date tag looking like 0001-11-30T00:00:00+00:00, while in the real source line 24 shows:

    
    <news:publication_date>2017-01-11T20:58:06+00:00</news:publication_date>
    

    Either it is simply not true, or Search Console is really seeing something other than what we are.
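To rule out that such a date ever appears in the served sitemap, every `<news:publication_date>` value can be checked in one go. This is a small sketch under two assumptions: that the sitemap uses the standard Google News namespace, and that any year before 1970 is a placeholder rather than a real publication time (the `min_year` cutoff and the function name `suspicious_dates` are arbitrary choices for illustration).

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def suspicious_dates(xml_text, min_year=1970):
    """Return publication_date values that are unparsable or implausibly old."""
    root = ET.fromstring(xml_text)
    flagged = []
    for el in root.iter("{%s}publication_date" % NEWS_NS):
        raw = (el.text or "").strip()
        try:
            # Handles forms like 2017-01-11T20:58:06+00:00; a trailing "Z"
            # is only accepted from Python 3.11 on and is flagged otherwise.
            parsed = datetime.fromisoformat(raw)
        except ValueError:
            flagged.append(raw)
            continue
        if parsed.year < min_year:
            flagged.append(raw)
    return flagged
```

An empty result for the source saved from the browser would support the idea that Search Console is reading a different response than the one we see.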

    Could you test this? Submit a new sitemap in Google Search Console with the URL https://www.lanostratv.it/?feed=sitemap-news and then wait for any errors on that one. The URL should show the exact same news sitemap, but I really wonder if the same errors will be produced…

    Thread Starter Marco Panichi

    (@marcopanichi)

    Hi Ravan,

    I have some updates:

    1) I did the test you suggested and submitted the sitemap manually (*): there were no errors, as expected!
    2) After a few days the above-mentioned errors came back!
    3) The comments at the end of the sitemap (if you are not logged in) state that the sitemap isn’t cached: “Page Caching using disk: enhanced (Requested URI is rejected)”

    (*) I submitted sitemap-news.xml; now I’ve tried submitting ?feed=sitemap-news

    It is as if Googlebot is not able to fetch the exact sitemap.
