Robots.txt Unreachable - Sitemaps

Comments

  • Options
    richpepprichpepp Registered Users Posts: 360 Major grins
    edited December 15, 2010
    Seems to be working again. The 'Pages crawled per day' graph in Webmaster Tools had been at zero since the end of November and has now jumped back up to a couple of thousand, which is where it normally sits.

    Thanks for that

    Richard
  • Options
    denisegoldbergdenisegoldberg Administrators Posts: 14,246 moderator
    edited December 15, 2010
    I see no change from my post here. I did run the Labs > Fetch as googlebot, and the submitted / downloaded dates reflect that - but the status remains as it was before. /sitemap-galleries.xml.gz looks fine (as it did before), but /sitemap-images.xml.gz still shows robots.txt unreachable.

    --- Denise
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 16, 2010
    Twoofy wrote: »
    Hello,

    I have been looking at this for several days and have sorted out what is going on with the robots.txt and sitemap files. Google should refetch your robots.txt sometime in the next 24 hours and everything should free up.

    If you want to test this for yourselves, you can go into your Google Webmaster Tools and navigate to "Labs > Fetch as Googlebot". For the URL, type "robots.txt" and leave the selector on "Web". When you submit it, the status will be "pending" and within a few seconds (you may need to refresh the page) the status should be "Success!" - you can then click it and see the contents of that file. I will monitor this thread for the next couple of days, so feel free to try it and post your results here.
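
For anyone who would also like to check robots.txt from outside Webmaster Tools, here is a minimal Python sketch. It only reports the HTTP status your own machine sees, so it is not a substitute for "Fetch as Googlebot". The function name `fetch_status`, the User-Agent string, and the example URL in the usage note are assumptions made for illustration, not anything from SmugMug or Google.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return (status_code, body_text) for url.

    status_code is None when the server could not be reached at all.
    The opener argument exists only so the function can be tested offline.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "sitemap-check/0.1"}  # hypothetical agent string
    )
    try:
        with opener(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 403 or 404.
        return err.code, ""
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout and similar network errors.
        return None, ""
```

Calling `fetch_status("http://www.example.com/robots.txt")` with your own domain in place of the placeholder returns the status code and the file contents. A status of 200 with the expected text suggests the file is reachable from where you are, which may or may not match what Googlebot sees.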

    I am still working on the issue with individual URLs within the sitemap not being escaped properly; my expectation is that this fix will go live around Thursday this week (12/16).
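
The thread does not say which characters were affected by the escaping issue, so the following is only a general sketch of what "escaped properly" usually means for a sitemap `<loc>` value: percent-encode the URL, then entity-escape it for XML. The helper name `sitemap_loc` and the example URL are hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit
from xml.sax.saxutils import escape


def sitemap_loc(url):
    """Build a <loc> element with the URL percent-encoded and XML-escaped."""
    parts = urlsplit(url)
    # Keep "/" and existing "%" sequences so already-encoded URLs are not double-encoded.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    encoded = urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))
    # "&" must become "&amp;" inside XML; quotes are mapped too for completeness.
    xml_safe = escape(encoded, {"'": "&apos;", '"': "&quot;"})
    return "<loc>" + xml_safe + "</loc>"


print(sitemap_loc("http://www.example.com/My Gallery/photo?a=1&b=2"))
```

The space becomes `%20` and the `&` becomes `&amp;`, which are the two substitutions that most often go missing in hand-built sitemaps.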

    Thank you to everyone for letting me know about this and for your patience. A special thank you to the SmugMug user who graciously gave me access to your Google account; your kindness saved me a weekend's worth of work (on my birthday even) tracking this down. :)

    - Greg


    Is the fix for the unreachable links still due to go live today, please?
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 18, 2010
    I still have many unreachable links (though the count has dropped from 3000+ to 2900), but now the sitemap index only contains the base, with no sign of the galleries or images.
    <sitemapindex xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
    <sitemap>
    <loc>http://www.snrmac.com/sitemap-base.xml.gz</loc>
    <lastmod>2010-12-16T17:17:11Z</lastmod>
    </sitemap>
    </sitemapindex>
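
A quick way to confirm which child sitemaps an index actually lists is to parse it. This is a minimal sketch, assuming the index is well-formed XML in the sitemaps.org namespace and that the three expected file names are the ones mentioned in this thread. The sample below adds the `xmlns` declaration that the browser view above omits; the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
# File name stems mentioned in this thread; other sites may differ.
EXPECTED = ("sitemap-base", "sitemap-galleries", "sitemap-images")


def child_sitemaps(index_xml):
    """Return every <loc> URL listed in a sitemap index document."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def missing_sitemaps(index_xml, expected=EXPECTED):
    """Return the expected sitemap names that the index does not mention."""
    locs = child_sitemaps(index_xml)
    return [name for name in expected if not any(name in loc for loc in locs)]


SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.snrmac.com/sitemap-base.xml.gz</loc>
    <lastmod>2010-12-16T17:17:11Z</lastmod>
  </sitemap>
</sitemapindex>"""

print(missing_sitemaps(SAMPLE_INDEX))
```

For the index shown in this post, the check reports the galleries and images sitemaps as missing, which matches what is described.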
    
  • Options
    rashbrookrashbrook Registered Users Posts: 92 Big grins
    edited December 18, 2010
    I had a green check mark for a couple of days for my sitemap, but now I'm back to a yellow '!' sign in Google Webmaster Tools.
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 19, 2010
    Same here: galleries and images are listed again, but with an 'X' again. Unreachable links are also the same.

    Please could we have an update!
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 19, 2010
    My guess is that when you click the sitemap with the "!", you are seeing a "Submitted" date of something like Dec 17, 2010, and a "Downloaded" date of something similar, but when you look at the "Details" it says something like "Problem detected on: Dec 12, 2010".

    If that is the case, what it means is that Google has not finished processing your sitemap yet - so it is showing error details from the last time it did so.

    If that is not the case, I need to see the URL it is complaining about to see what is wrong with it.

    - Greg
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 19, 2010
    It's not an "!", but an 'X' with this error message:

    General HTTP error: 404 not found
    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit. HTTP Error: 404
    Problem detected on: Dec 17, 2010

    That was for /sitemap-galleries.xml.gz, and it's the same for /sitemap-images.xml.gz.
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 19, 2010
    psenior1 wrote: »
    It's not an "!", but an 'X' with this error message:

    General HTTP error: 404 not found
    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit. HTTP Error: 404
    Problem detected on: Dec 17, 2010

    That was for /sitemap-galleries.xml.gz, and it's the same for /sitemap-images.xml.gz.

    Did you try resubmitting them? Also, have you used the Labs tool in Webmaster Tools to test them? I just tried both and fetched them with no problem.

    - Greg
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 19, 2010
    Twoofy wrote: »
    Did you try resubmitting them? Also, have you used the Labs tool in Webmaster Tools to test them? I just tried both and fetched them with no problem.

    - Greg

    I resubmitted them earlier and tried again. Galleries now has a tick, but images still gets an 'X' and the following errors (there are more of the same; these are just the first few):

    [screenshot of the errors: 1131511419_UGNfW-L.jpg]
  • Options
    rashbrookrashbrook Registered Users Posts: 92 Big grins
    edited December 19, 2010
    Just looked at GWMT (Google Webmaster Tools).

    I still have the yellow triangle with '!'.

    I clicked on it, and here is the info it provided:

    "Sitemap errors and warnings
    Line Status
    URLs not followed
    When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL.


    Details
    Warnings -
    URLs not followed
    When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL.
    HTTP Error: 302
    URL:
    http://www.ashbrook-photography.com/Other/Test-III/14036042_oyi5w
    Problem detected on: Dec 17, 2010"


    The URL that is noted in the error is one that has not been a part of (has not existed on) my site for more than 2 months.
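
To make sense of a long list of warnings like this, it can help to group URLs by the kind of response they return. The sketch below is a rough, hypothetical classifier: the category labels are my own shorthand, loosely following the messages quoted in this thread (302 reported as "URLs not followed", 404 as not found, and so on), and the sample URLs are placeholders rather than real results.

```python
from collections import Counter


def classify_status(status):
    """Map an HTTP status code (or None for a network failure) to a rough category."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"  # the case reported above as "URLs not followed"
    if status == 404:
        return "not found"
    if 400 <= status < 500:
        return "client error"  # for example a 403 on a protected folder
    if 500 <= status < 600:
        return "server error"
    return "other"


def summarize(results):
    """Count categories for an iterable of (url, status) pairs."""
    return Counter(classify_status(status) for _, status in results)


# Placeholder data for illustration only.
SAMPLE = [
    ("http://www.example.com/gallery/1", 200),
    ("http://www.example.com/Other/removed-gallery", 302),
    ("http://www.example.com/sitemap-images.xml.gz", 404),
    ("http://www.example.com/include/", 403),
    ("http://www.example.com/slow-page", None),
]

print(summarize(SAMPLE))
```

A removed gallery that answers with a 302 would land in the "redirect" bucket, which is consistent with the warning quoted above for a URL that no longer exists on the site.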

    In a separate issue, there was a problem where my sitemap had not updated in quite some time - but I had thought that part was fixed.

    ?
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 20, 2010
    Does it seem like the URLs that are re-directing are ones that have been removed?

    I'm not sure what you mean by the sitemap not updating. I am pretty sure that it is updating; can you elaborate?

    - Greg
  • Options
    OffTopicOffTopic Registered Users Posts: 521 Major grins
    edited December 20, 2010
    Seem to be going backwards here... Instead of at least having a green check mark for the galleries index as I posted [url="http://www.dgrin.com/showpost.php?p=1517574&postcount=16"]here[/url], now I'm back to square one with nothing but a warning icon for sitemap-base.xml.gz.

    URLs unreachable
    When we tested a sample of the URLs from your Sitemap, we found that some of the URLs were unreachable. Please check your webserver for possible misconfiguration, as these errors may be caused by a server error (such as a 5xx error) or a network error between Googlebot and your server. All reachable URLs will still be submitted.

    It shows that my /include folder returned a 403 error on Dec 13, 2010.

    Show URLs: HTTP (1) | In Sitemaps (250) | Not followed (1) | Timed out (15) | Unreachable (5,349)

    The unreachable URLs are all of the photos on my site and my main domain:

    http://www.loricareyphoto.com/
    robots.txt unreachable Dec 12, 2010
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 21, 2010
    OffTopic wrote: »
    Seem to be going backwards here... Instead of at least having a green check mark for the galleries index as I posted [url="http://www.dgrin.com/showpost.php?p=1517574&postcount=16"]here[/url], now I'm back to square one with nothing but a warning icon for sitemap-base.xml.gz.

    URLs unreachable
    When we tested a sample of the URLs from your Sitemap, we found that some of the URLs were unreachable. Please check your webserver for possible misconfiguration, as these errors may be caused by a server error (such as a 5xx error) or a network error between Googlebot and your server. All reachable URLs will still be submitted.

    It shows that my /include folder returned a 403 error on Dec 13, 2010.

    Show URLs: HTTP (1) | In Sitemaps (250) | Not followed (1) | Timed out (15) | Unreachable (5,349)

    The unreachable URLs are all of the photos on my site and my main domain:

    http://www.loricareyphoto.com/
    robots.txt unreachable Dec 12, 2010

    That error is from before the bug fixes went out (12/17). I believe the errors are still showing up because Google has not finished processing your new sitemap file yet. One thing you may want to do is modify something about one of your photos (adding or removing a keyword works well); that way your sitemap file gets updated.

    - Greg
  • Options
    rashbrookrashbrook Registered Users Posts: 92 Big grins
    edited December 21, 2010
    Hi Greg -

    As far as I know - after the work you did - the sitemap is updated to a more recent index of my site. True.

    The error I noted above - I just checked - is still there now. Yes, it does call out a URL that is no longer part of my site and that is not listed in my sitemap, so maybe it's one of those cases like you mentioned to me, where I should sit tight and wait a bit.

    I'll keep an eye on it - if the error isn't gone in a week or two I'll post back. Thanks again and happy Holidays ;-)
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 24, 2010
    I am still trying to make heads or tails of this whole sitemap issue.
    I tried re-keywording all my images and resubmitting the sitemap-images.xml.gz.

    It returned the following result:
    [screenshot of the result: 1133446453_LKSrL-L.jpg]

    When I tried to see the sitemap using the blue "See Sitemap" link in the upper right corner it returned a broken link error, as if there was no sitemap for the images.

    Am I missing something here?

    How is it looking now? By the way, you only need to change one keyword; that will trigger a sitemap update.

    If you are still seeing warnings, try updating an image (adding, changing, or deleting a keyword works best), then re-submit the sitemaps in Webmaster Tools.

    Any warnings that are prior to the date you do that are referring to an older, potentially problematic sitemap file. So we need to wait until the error either reflects the date you changed/resubmitted or (hopefully) goes away entirely.

    Let me know how it works for you.

    - Greg
  • Options
    socksiongsocksiong Registered Users Posts: 44 Big grins
    edited December 25, 2010
    The sitemap is still down; SmugMug, please fix it ASAP.
    SEO can still be done without a sitemap (I got myself to number 6 for my keyword in 1 month with a faulty sitemap), but a sitemap is looked on quite favourably by the big G.
    My sitemap:
    http://www.socksphotography.com/sitemap-index.xml

    A normal sitemap:
    http://nikonrumors.com/sitemap.xml

    My XML file only has text.
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 25, 2010
    socksiong wrote: »
    The sitemap is still down; SmugMug, please fix it ASAP.
    SEO can still be done without a sitemap (I got myself to number 6 for my keyword in 1 month with a faulty sitemap), but a sitemap is looked on quite favourably by the big G.
    My sitemap:
    http://www.socksphotography.com/sitemap-index.xml

    A normal sitemap:
    http://nikonrumors.com/sitemap.xml

    My XML file only has text.

    Have you updated, added, or deleted a keyword? How long has it been since you've done that? Keep in mind that I fixed the robots.txt on Dec 16th and a separate encoding issue on Dec 23rd. So errors that are showing dates from before the 23rd are likely just old ones that are in the Webmaster Tools database but are not affecting your site. If you are showing any errors with dates newer than Dec 23rd, then I really do need to see specifically what they are so I can hunt those down too.
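
The date rule above can be written down as a small check. This is a sketch only: it assumes the "Problem detected on" text uses the "Dec 17, 2010" format seen in this thread, that Python is running with an English locale so month abbreviations parse, and that the two fix dates are the ones stated above. The function names are hypothetical.

```python
from datetime import date, datetime

# Fix dates as stated in this post.
ROBOTS_FIX = date(2010, 12, 16)
ENCODING_FIX = date(2010, 12, 23)


def parse_detected(text):
    """Parse a 'Problem detected on' value such as 'Dec 17, 2010'."""
    return datetime.strptime(text, "%b %d, %Y").date()


def is_probably_stale(detected_on, last_fix=ENCODING_FIX):
    """True when the error predates the most recent fix and is likely left over.

    An error dated on the fix day itself is treated as not stale, to be safe.
    """
    return parse_detected(detected_on) < last_fix


for detected in ("Dec 12, 2010", "Dec 17, 2010", "Dec 24, 2010"):
    label = "likely stale" if is_probably_stale(detected) else "worth reporting"
    print(detected, "->", label)
```

Errors dated Dec 12 and Dec 17 come out as likely stale, while anything dated after the 23rd is flagged as worth posting here with the exact URL.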

    - Greg
  • Options
    socksiongsocksiong Registered Users Posts: 44 Big grins
    edited December 25, 2010
    I just updated my keywords and uploaded a new gallery on the 23rd. My sitemap is empty, so even if I submit it to Webmaster Tools, robots will not be able to crawl any URLs.
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 27, 2010
    /sitemap-base.xml.gz seems OK now, but there is no sign of galleries or images.

    Please can we have an update? The content of the index is below.

    <sitemapindex xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
    <sitemap>
    <loc>http://www.snrmac.com/sitemap-base.xml.gz</loc>
    <lastmod>2010-12-24T06:12:27Z</lastmod>
    </sitemap>
    </sitemapindex>
    
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 27, 2010
    socksiong wrote: »
    I just updated my keywords and uploaded a new gallery on the 23rd. My sitemap is empty, so even if I submit it to Webmaster Tools, robots will not be able to crawl any URLs.

    Hello,

    Which URL are you seeing that is empty? I fetched http://www.socksphotography.com/sitemap-index.xml and http://www.socksphotography.com/sitemap-base.xml - both seem to be fine.

    - Greg
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 27, 2010
    psenior1 wrote: »
    /sitemap-base.xml.gz seems OK now, but there is no sign of galleries or images.

    Please can we have an update? The content of the index is below.

    <sitemapindex xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
    <sitemap>
    <loc>http://www.snrmac.com/sitemap-base.xml.gz</loc>
    <lastmod>2010-12-24T06:12:27Z</lastmod>
    </sitemap>
    </sitemapindex>
    

    This is interesting and possibly a bug. You should probably only be adding sitemap-index.xml to Webmaster Tools. But it does seem odd to me that the sitemap-images would have gone away for you. I'll look into it and let you know.

    UPDATE: I think I've discovered a bug related to the disappearance of sitemap-index.xml files. More updates later.

    - Greg
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 28, 2010
    Twoofy wrote: »
    This is interesting and possibly a bug. You should probably only be adding sitemap-index.xml to Webmaster Tools. But it does seem odd to me that the sitemap-images would have gone away for you. I'll look into it and let you know.

    UPDATE: I think I've discovered a bug related to the disappearance of sitemap-index.xml files. More updates later.

    - Greg

    OK - Thanks. I do only add the index, but shouldn't the index contain details of the base, galleries and images XML files (plus they don't seem to physically exist either)?

    EDIT: Just checked and this is the same for both of my Smug sites, so I'm assuming it's affecting everyone.
  • Options
    denisegoldbergdenisegoldberg Administrators Posts: 14,246 moderator
    edited December 28, 2010
    Twoofy wrote: »
    ...You should probably only be adding sitemap-index.xml to Webmaster tools. But, it does seem odd to me that the sitemap-images would have gone away for you. I'll look into it and let you know.

    UPDATE: I think I've discovered a bug related to the disappearance of sitemap-index.xml files. More updates later.
    I'm seeing the same thing. I only added sitemap-index.xml; I see sitemap-index.xml and sitemap-base.xml.gz. No sitemap-images.

    --- Denise
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 28, 2010
    Hello,

    With respect to the sitemap-images and sitemap-galleries disappearing from the sitemap-index - that is fixed now. What was going on is that uploading a new photo to your gallery would cause them to drop out of the index until the process that regenerates those sitemaps came along (the presumption being they were now out of date).

    What we are doing now is leaving the old sitemap-images/sitemap-galleries intact until the new ones get regenerated.

    It may take some time for you to see the net effect of the bug fix, but you can help it along by adding, changing, or deleting a keyword for one of your images (you only need to change one image, it will mark the sitemap as needing to be re-generated). Once the sitemap-images/sitemap-galleries are showing up in the sitemap index, you should be able to upload a new photo and it will not go away.
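
For readers trying to picture the change, here is a toy model of the behaviour described above. It is not SmugMug's code: the class, the `stale` flag, the function names, and the timestamps on the galleries and images entries are all invented for illustration. It only contrasts "drop stale sitemaps from the index" with "keep them listed until a fresh copy replaces them".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SitemapFile:
    name: str
    lastmod: str
    stale: bool = False  # set when an upload marks the file for regeneration


def index_before_fix(files):
    """Behaviour described as the bug: stale sitemaps vanish from the index."""
    return [f.name for f in files if not f.stale]


def index_after_fix(files):
    """Behaviour described as the fix: stale sitemaps stay listed until replaced."""
    return [f.name for f in files]


def regenerate(sitemap, new_lastmod):
    """Stand-in for the background job: swap in a fresh copy and clear the flag."""
    return replace(sitemap, lastmod=new_lastmod, stale=False)


# Illustrative data; only the base entry's timestamp comes from this thread.
files = [
    SitemapFile("sitemap-base.xml.gz", "2010-12-24T06:12:27Z"),
    SitemapFile("sitemap-galleries.xml.gz", "2010-12-16T00:00:00Z", stale=True),
    SitemapFile("sitemap-images.xml.gz", "2010-12-16T00:00:00Z", stale=True),
]

print(index_before_fix(files))
print(index_after_fix(files))
```

In the first listing only the base sitemap survives, which is what several people reported; in the second, all three stay in the index while waiting for regeneration.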

    I'm still testing all the edge cases here, but the fix looks solid.

    - Greg
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited December 30, 2010
    Twoofy wrote: »
    Hello,

    What we are doing now is leaving the old sitemap-images/sitemap-galleries intact until the new ones get regenerated.


    - Greg


    When will this be, or how often are they regenerated, please?
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited December 30, 2010
    psenior1 wrote: »
    When will this be, or how often are they regenerated, please?

    The code fix went live yesterday (we patched it). But the process which regenerates the sitemaps has a lot of work to do, so I cannot estimate exactly when it will get to a specific site. Because of how our pages are structured and navigated, Google can (and does) crawl them very well even without sitemaps. So we obviously want to err on the side of providing a fast user experience rather than slowing everything down while sitemaps get generated. At the same time, we very much want the sitemaps in place because they help Google find new or updated pages more quickly and generally crawl the pages more efficiently.

    Sorry I do not have a more precise answer for you on the timing of this.

    - Greg
  • Options
    rashbrookrashbrook Registered Users Posts: 92 Big grins
    edited December 30, 2010
    The good news - at this point, the errors are coming down in my GWMT dashboard, and I see updates to my sitemap there as well.

    The not so understood news - for some reason, my site is not coming up anymore when I search for it via Google using "ashbrook photography". It was up to about the bottom of page 1 or the top of page 2 until last week. I know I can search using the FQDN, but customers aren't going to do that.
  • Options
    dejavudejavu Registered Users Posts: 12 Big grins
    edited January 2, 2011
    In Google Webmaster Tools, sitemap-index.xml is now green with date 1/1/2011. When I click on it, the three files ending in .gz are also green.
    At this moment, I see that http://gianni.smugmug.com/sitemap-index.xml has a date of 31 Dec. 2010.

    Crossing my fingers, but the problem seems to be solved...
  • Options
    psenior1psenior1 Registered Users Posts: 125 Major grins
    edited January 2, 2011
    Still no sign of images or galleries for me.
This discussion has been closed.