
Robots.txt Unreachable - Sitemaps

mshetzer Registered Users Posts: 22 Big grins
edited February 9, 2011 in SmugMug Support
In trying to get Google Webmaster Tools up and running, and my sitemap complete, I receive this error:

Network unreachable: robots.txt unreachable
We were unable to crawl your Sitemap because we found a robots.txt file at the root of your site but were unable to download it. Please ensure that it is accessible or remove it completely.

What else do I need to do?
http://shetzers.com

Thanks,
Matt
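
A quick way to see what Google is getting back is to fetch robots.txt yourself and check the HTTP status code. This is only a rough check from your own connection, not Googlebot's - a minimal sketch, assuming Python 3 and using the site above as the example:

```python
# Fetch robots.txt and report the HTTP status code.
# Minimal sketch using only the Python 3 standard library.
import urllib.request
import urllib.error

URL = "http://shetzers.com/robots.txt"  # example site from this thread

try:
    with urllib.request.urlopen(URL, timeout=10) as resp:
        print(resp.status, resp.reason)  # expect: 200 OK
        print(resp.read().decode("utf-8", errors="replace"))
except urllib.error.HTTPError as e:
    # A 503 here would match the "Service unavailable" status GWT reports.
    print("HTTP error:", e.code, e.reason)
except urllib.error.URLError as e:
    print("Network error:", e.reason)
```

A 200 here while GWT still reports "unreachable" would suggest the problem is specific to how the server answers Google rather than a missing file.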

Comments

  • mshetzer Registered Users Posts: 22 Big grins
    edited December 6, 2010
    Lost another one
    Now I have two "X"s in my sitemap? I'm losing ground.

    Any help would be appreciated !

    Matt
  • rashbrook Registered Users Posts: 92 Big grins
    edited December 6, 2010
    Same problem here for over two weeks. I have a red X for the sitemaps and a 503 Service Unavailable for the robots.txt file. I've had 3 or 4 tickets and have just been told to wait.
  • mshetzer Registered Users Posts: 22 Big grins
    edited December 7, 2010
    Update
    Thanks, Robert, for the first response to this. I've been reading old posts from October about this issue, and it didn't seem to get resolved. I'm unable to get any SEO going, so I'm not really getting anything out of my SmugMug account.

    Does anyone have any idea how long this takes to get working properly?

    Matt
    rashbrook wrote: »
    Same problem here for over two weeks. I have a red X for the sitemaps and a 503 Service Unavailable for the robots.txt file. I've had 3 or 4 tickets and have just been told to wait.
  • mshetzer Registered Users Posts: 22 Big grins
    edited December 9, 2010
    Time Frame
    Does anyone know how long it takes for the robots.txt to be found correctly by Google?
    mshetzer wrote: »
    Thanks, Robert, for the first response to this. I've been reading old posts from October about this issue, and it didn't seem to get resolved. I'm unable to get any SEO going, so I'm not really getting anything out of my SmugMug account.

    Does anyone have any idea how long this takes to get working properly?

    Matt
  • rashbrook Registered Users Posts: 92 Big grins
    edited December 9, 2010
    I wish someone would respond too. After the tickets I've submitted, I feel pretty helpless. I look at Webmaster Tools and see that the errors are mounting, and the red X hasn't changed. robots.txt is still unreachable too.

    If I go to my sitemap, the data is all way out of date. We're talking the beginning of October-ish. It's December.

    I'm trying to have faith in what support told me - to just wait - but I hope this gets better soon.
  • psenior1 Registered Users Posts: 125 Major grins
    edited December 10, 2010
    Same problem for me - very frustrating.
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
  • rashbrook Registered Users Posts: 92 Big grins
    edited December 10, 2010
    I just wonder - how many weeks/months does it take for my site index to be updated in my sitemap? As I was saying before, the last time it was updated was only days after I began setup for my site back in the beginning of October. None of my current pages are listed in the file - aside from the fact that GWT says Google can't retrieve/crawl it.

    Here's some of the info I have available in my GWT (Google Webmaster Tools):

    Crawl errors:
      In Sitemaps: 2 | Not followed: 3 | Timed out: 7 | Unreachable: 1,727

    Sitemap status (URLs in web index):
      /sitemap-index.xml - error - 0

    robots.txt file (URL / downloaded / status):
      http://www.ashbrook-photography.com/robots.txt - 7 minutes ago - 503 (Service unavailable)

    Fetch as Googlebot (URL / Googlebot type / status / date submitted):
      http://www.ashbrook-photography.com/robots.txt - Web - Missing robots.txt - 12/10/10, 06:53 AM
      http://www.ashbrook-photography.com/ - Web - Missing robots.txt - 12/7/10, 07:37 PM
      http://www.ashbrook-photography.com/ - Web - Missing robots.txt - 12/2/10, 06:52 AM
      http://www.ashbrook-photography.com/ - Web - Missing robots.txt - 12/1/10, 07:47 PM
      http://www.ashbrook-photography.com/ - Web - Missing robots.txt - 11/30/10, 07:13 PM
      http://www.ashbrook-photography.com/ - Web - Missing robots.txt - 11/30/10, 07:05 PM
      http://www.ashbrook-photography.com/ - Web - Missing robots.txt - 11/30/10, 07:04 PM
      http://www.ashbrook-photography.com/ - Web - Success - 11/28/10, 05:44 PM
  • candron Registered Users Posts: 6 Beginner grinner
    edited December 11, 2010
    Sitemap still not updated - robots.txt unreachable
    Same issues here: sitemap status with a red cross, and robots.txt unreachable. Sitemap-base.xml has not been updated since I first opened my account November 18th. So it's been about 3 weeks now that Google has been trying to index my site with no success (I've obviously submitted my site to Google and gone through all the relevant posts on Dgrin from October 2010).

    I feel that these issues should have been resolved much faster. At a premium of $150 for a Pro account, I would expect these glitches to at least be acknowledged, and I would like to hear some course of action from SmugMug's side.

    I'm attaching some screenshots of the respective Webmaster Tools pages.


    No SEO, no fun!

    - Christos
  • candron Registered Users Posts: 6 Beginner grinner
    edited December 11, 2010
    candron wrote: »
    Same issues here: sitemap status with a red cross, and robots.txt unreachable. Sitemap-base.xml has not been updated since I first opened my account November 18th. So it's been about 3 weeks now that Google has been trying to index my site with no success (I've obviously submitted my site to Google and gone through all the relevant posts on Dgrin from October 2010).

    I feel that these issues should have been resolved much faster. At a premium of $150 for a Pro account, I would expect these glitches to at least be acknowledged, and I would like to hear some course of action from SmugMug's side.

    I'm attaching some screenshots of the respective Webmaster Tools pages.


    No SEO, no fun!

    - Christos

    ... the second screenshot (sorry, I didn't know how to attach two screenshots in the same post)
  • psenior1 Registered Users Posts: 125 Major grins
    edited December 11, 2010
    I've got over 2,000 unreachable links now. I also just noticed that, for the first time in ages, my site has dropped off page 1 of Google organic search. I don't know enough about SEO to know if it's related.
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
  • LichtenHansen Registered Users Posts: 55 Big grins
    edited December 11, 2010
    Smugmug sitemap service down
    This is how the SmugMug status page should look:
    [screenshot: smugmug-status-sitemap.jpg]
  • canghuixu Registered Users Posts: 238 Major grins
    edited December 11, 2010
    There's definitely something going on. I have noticed a big drop-off in traffic at my site since the end of November. When I looked at Statcounter, and the Smugmug referrer stats, I noticed that there were far fewer Google referrals than there used to be. I poked around and found this thread.

    Based on the discussion here, I went and looked at the Webmaster Tools results. Indeed, there have been a huge number of crawl errors since November. Most of these are 'robots.txt unreachable' errors, but some are 503 errors. When I looked at the crawl stats, it looks like my site has barely been crawled by Google since November.

    I have written to Smugmug help and provided screenshots of the crawl stats, along with the downloaded .csv files of the crawl errors. They are usually pretty responsive, so we'll see. I do hope this gets resolved, since the fall-off in traffic at my site is pretty noticeable. I don't think it is anything at my end, since I haven't tinkered with any customization for quite a while.
  • denisegoldberg Administrators Posts: 14,240 moderator
    edited December 11, 2010
    I'm seeing errors on my site too - robots.txt unreachable:
    Network unreachable: robots.txt unreachable
    We were unable to crawl your Sitemap because we found a robots.txt file at the root of your site but were unable to download it. Please ensure that it is accessible or remove it completely.
    [screenshot: 1122984886_TSHkF-XL.jpg]

    Smug support wizards - what's going on here???

    Or is this signalling that Google can find my galleries but not my images?

    --- Denise
  • OffTopic Registered Users Posts: 521 Major grins
    edited December 11, 2010
    Here's what I have. As with everyone else, most crawl errors are due to robots.txt unreachable, but there are also several 503 server errors.


    [screenshot: 1122954748_7Hhs8-L.jpg]


    I submitted my sitemap per directions in a thread on October 27 and have just been monitoring the situation, since we've been told it takes some time. But like others have mentioned, the crawl stats for my site show ZERO for the month of December so far, so I figured I'd better speak up.

    I see that Google shows it downloaded a galleries sitemap on its own prior to the date I submitted one, and that is apparently the only one that is working correctly.
  • candron Registered Users Posts: 6 Beginner grinner
    edited December 12, 2010
    This is how the SmugMug status page should look:
    [screenshot: smugmug-status-sitemap.jpg]

    Allan,

    At the http://status.smugmug.com/ URL I'm getting exactly the same table you have, where everything's green; however, I don't get the Sitemap row at all. Any idea why that is?

    Thank you,

    - Christos

    christosandronis.smugmug.com
  • psenior1 Registered Users Posts: 125 Major grins
    edited December 12, 2010
    It would be nice to get a reply on this, please, SM support. Jan/Feb is the busiest time for us for wedding bookings, so if this is severely impacting our Google organic search position then this is pretty serious.

    [screenshot: 1123340674_nuZZd-L.jpg]
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
  • rashbrook Registered Users Posts: 92 Big grins
    edited December 12, 2010
    So, we all have the same problems. Are we all submitting support tickets to make sure SmugMug is aware? I've submitted 3 or 4 so far. I will submit another this week. I'm honestly not sure they are clear that there is an issue at all, based on the responses I've gotten and the fact that there has been no response here.

    Please submit support tickets to make sure there is awareness. If they only keep hearing from me, they'll start to think 'it's just that nutty guy out east again - everyone else is fine.'

    I looked just now and my unreachable errors are in the 2,000s. My sitemap still hasn't been updated since 10/04, there's still the sitemap red X on Google, and still the 503 error for robots.txt. I had only 9 hits on my site today - an all-time low so far.
  • richpepp Registered Users Posts: 360 Major grins
    edited December 13, 2010
    There really does appear to be something funny happening here, and it seems to be related specifically to Google for some reason. I wonder if it has to do with the special view that Google's bots get of our sites?

    As others have noticed, Google Webmaster Tools reports an issue with robots.txt, and it seems that their default position may then be not to index, as they can't tell what the intention of the file was (I can't be sure of this, as I couldn't find an actual Google doc on it - only other reports). This doesn't appear to be random, as using the 'Fetch as Googlebot' tool seems to always fail due to an error in robots.txt.

    However, robots.txt does appear to really exist. Other sites on the web will show it for me, and Bing appears to read it correctly, as I can find a cached copy of our site from yesterday (10th Dec) on Bing.

    Google, however, doesn't have any cached copy of our site since the end of November.

    This doesn't appear to be a Webmaster Tools issue, as the other parts of our site which aren't SmugMug don't have this problem.

    I have no idea why this is; I'm just hoping to add to the picture a little. The error reported is a 503, which seems to suggest that SmugMug has a problem when serving robots.txt - but only to Google.
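
    One way to poke at that theory is to request robots.txt with a normal User-Agent and again with the published Googlebot User-Agent string, and compare the answers. A minimal sketch, assuming Python 3 (the site URL is just the example from earlier in the thread):

    ```python
    # Request robots.txt twice - once as a generic browser, once claiming
    # to be Googlebot - and compare the status codes returned.
    import urllib.request
    import urllib.error

    URL = "http://www.ashbrook-photography.com/robots.txt"  # example from this thread

    AGENTS = {
        "browser": "Mozilla/5.0",
        "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    }

    for name, ua in AGENTS.items():
        req = urllib.request.Request(URL, headers={"User-Agent": ua})
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                print(f"{name}: {resp.status} {resp.reason}")
        except urllib.error.HTTPError as e:
            # A 503 for "googlebot" only would support the UA theory.
            print(f"{name}: {e.code} {e.reason}")
    ```

    Note this only tests User-Agent handling. If the server varied its response by source IP rather than by User-Agent, both requests could succeed here even while real Googlebot requests fail.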
  • psenior1 Registered Users Posts: 125 Major grins
    edited December 13, 2010
    I've emailed the helpdesk too. I've also just checked my other SM site, www.funkidsphotos.com, and it has exactly the same issues.
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
  • Twoofy Registered Users Posts: 171 Major grins
    edited December 13, 2010
    Hello,

    I have been looking at this for several days and have sorted out what is going on with the robots.txt and sitemap files. Google should refetch your robots.txt sometime in the next 24 hours and everything should free up.

    If you want to test this for yourselves, you can go into your Google Webmaster Tools, then navigate to "Labs > Fetch as Googlebot". For the URL, type "robots.txt" and leave the selector on "Web". When you submit it, the status will be "pending", and within a few seconds (you may need to refresh the page) the status should be "Success!" - you can then click it and see the contents of that file. I will monitor this thread for the next couple of days - if you want, try it and post your results here.

    I am still working on the issue with individual URLs within the sitemap not being escaped properly; my expectation is that this fix will go live around Thursday this week (12/16).

    Thank you to everyone for letting me know about this, and for your patience. A special thank you to the SmugMug user who graciously gave me access to your Google account - your kindness saved me a weekend's worth of work (on my birthday, even) tracking this down. :)

    - Greg
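
    For context on the escaping issue Greg mentions: the sitemap protocol expects URLs to be percent-encoded and XML entity-escaped before they are written into the file. A minimal sketch of that kind of cleanup, assuming Python 3 (the gallery URL is hypothetical):

    ```python
    # Sitemap entries must be percent-encoded (RFC 3986) and XML
    # entity-escaped; raw spaces, ampersands, or accents break them.
    from urllib.parse import quote
    from xml.sax.saxutils import escape

    raw = "http://example.smugmug.com/Weddings/Smith & Jones/héllo"  # hypothetical URL

    # Percent-encode everything except the scheme/host separators.
    percent_encoded = quote(raw, safe=":/")
    # Entity-escape for embedding in the XML <loc> element.
    loc = escape(percent_encoded)

    print(f"<loc>{loc}</loc>")
    # <loc>http://example.smugmug.com/Weddings/Smith%20%26%20Jones/h%C3%A9llo</loc>
    ```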
  • candron Registered Users Posts: 6 Beginner grinner
    edited December 13, 2010
    Twoofy wrote: »
    Hello,

    I have been looking at this for several days and have sorted out what is going on with the robots.txt and sitemap files. Google should refetch your robots.txt sometime in the next 24 hours and everything should free up.

    If you want to test this for yourselves, you can go into your Google Webmaster Tools, then navigate to "Labs > Fetch as Googlebot". For the URL, type "robots.txt" and leave the selector on "Web". When you submit it, the status will be "pending", and within a few seconds (you may need to refresh the page) the status should be "Success!" - you can then click it and see the contents of that file. I will monitor this thread for the next couple of days - if you want, try it and post your results here.

    I am still working on the issue with individual URLs within the sitemap not being escaped properly; my expectation is that this fix will go live around Thursday this week (12/16).

    Thank you to everyone for letting me know about this, and for your patience. A special thank you to the SmugMug user who graciously gave me access to your Google account - your kindness saved me a weekend's worth of work (on my birthday, even) tracking this down. :)

    - Greg

    Thanks for looking at it Greg,

    - Christos

    christosandronis.smugmug.com
  • psenior1 Registered Users Posts: 125 Major grins
    edited December 13, 2010
    Thanks for the reply and for looking at the problem. The Fetch as Googlebot instructions you mention worked OK and the 'base' sitemap has reloaded OK, but the index/galleries/image sitemaps are still in pending status and have been for the last 30 minutes. Will this all be fixed later this week?
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
  • Twoofy Registered Users Posts: 171 Major grins
    edited December 13, 2010
    psenior1 wrote: »
    Thanks for the reply and for looking at the problem. The Fetch as Googlebot instructions you mention worked OK and the 'base' sitemap has reloaded OK, but the index/galleries/image sitemaps are still in pending status and have been for the last 30 minutes. Will this all be fixed later this week?

    This is more in the domain of Google's algorithms. My experience is that it usually takes a few days for the sitemaps to get fetched and processed, and the bots start coming to crawl the new or updated URLs. There's a bit of "black box" magic that happens at Google behind the scenes on this and a lot of different factors go into how frequently they will crawl and/or index new content. I'm a little worried about saying "X is going to happen by Y date" because what Google does with the sitemaps is obviously out of our control.

    Bottom line is: Pending status is good. That means Google was able to pull the sitemaps and is sorting out what to do next. When they come out of Pending status if you see some yellow triangles with "!" marks on them for some URLs, do not be too alarmed - that is the URL escaping issue I mentioned.

    - Greg
  • psenior1 Registered Users Posts: 125 Major grins
    edited December 13, 2010
    Twoofy wrote: »
    This is more in the domain of Google's algorithms. My experience is that it usually takes a few days for the sitemaps to get fetched and processed, and the bots start coming to crawl the new or updated URLs. There's a bit of "black box" magic that happens at Google behind the scenes on this and a lot of different factors go into how frequently they will crawl and/or index new content. I'm a little worried about saying "X is going to happen by Y date" because what Google does with the sitemaps is obviously out of our control.

    Bottom line is: Pending status is good. That means Google was able to pull the sitemaps and is sorting out what to do next. When they come out of Pending status if you see some yellow triangles with "!" marks on them for some URLs, do not be too alarmed - that is the URL escaping issue I mentioned.

    - Greg


    They are in "!" status now - thanks again for your help so far.
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
  • Twoofy Registered Users Posts: 171 Major grins
    edited December 13, 2010
    psenior1 wrote: »
    They are in "!" status now - thanks again for your help so far.

    Just FYI, the "!" status (if you click it) probably just means that there are some URLs in the sitemap that were not correctly formatted - so Google could not fetch them. If you can verify that is what you are seeing I'd really appreciate it :)

    - Greg
  • psenior1 Registered Users Posts: 125 Major grins
    edited December 13, 2010
    Twoofy wrote: »
    Just FYI, the "!" status (if you click it) probably just means that there are some URLs in the sitemap that were not correctly formatted - so Google could not fetch them. If you can verify that is what you are seeing I'd really appreciate it :)

    - Greg

    The index is still pending, the base has a tick, and galleries/images has "!". Going into galleries/images, the error message is:

    "URLs unreachable
    When we tested a sample of the URLs from your Sitemap, we found that some of the URLs were unreachable. Please check your webserver for possible misconfiguration, as these errors may be caused by a server error (such as a 5xx error) or a network error between Googlebot and your server. All reachable URLs will still be submitted."
    website - http://www.snrmac.com
    facebook - my facebook page please LIKE me!
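
    Google's wording ("we tested a sample of the URLs") can be roughly reproduced by pulling the <loc> entries out of a sitemap and spot-checking a few of them. A minimal sketch, assuming Python 3, a plain (non-gzipped) URL sitemap, and a hypothetical sitemap filename:

    ```python
    # Pull the <loc> URLs out of a sitemap and spot-check a small sample,
    # roughly mirroring Google's "tested a sample of the URLs" message.
    import random
    import urllib.request
    import urllib.error
    import xml.etree.ElementTree as ET

    SITEMAP = "http://www.snrmac.com/sitemap-base.xml"  # hypothetical filename
    NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

    with urllib.request.urlopen(SITEMAP, timeout=10) as resp:
        root = ET.fromstring(resp.read())

    urls = [loc.text for loc in root.iter(f"{NS}loc")]
    for url in random.sample(urls, min(5, len(urls))):
        try:
            with urllib.request.urlopen(url, timeout=10) as r:
                print(r.status, url)
        except urllib.error.HTTPError as e:
            print(e.code, url)  # a 5xx here matches the "URLs unreachable" warning
        except urllib.error.URLError as e:
            print("ERR", url, e.reason)
    ```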
  • rashbrook Registered Users Posts: 92 Big grins
    edited December 13, 2010
    That's strange - I just thought I'd have a look at my Google Webmaster Tools, and voilà - green checkmark, and robots.txt is working again.

    Now - when will my sitemap be updated to reflect my site as it is today? It was last updated (with now-incorrect information) on 10/04/10.

    It's December 13th.

    Will my sitemap update at some point to have the correct pages indexed? If so, how long?
  • Twoofy Registered Users Posts: 171 Major grins
    edited December 13, 2010
    rashbrook wrote: »
    That's strange - I just thought I'd have a look at my Google Webmaster Tools, and voilà - green checkmark, and robots.txt is working again.

    Now - when will my sitemap be updated to reflect my site as it is today? It was last updated (with now-incorrect information) on 10/04/10.

    It's December 13th.

    Will my sitemap update at some point to have the correct pages indexed? If so, how long?

    Hi Robert,

    Once Google pulls your sitemap file, it goes into kind of a "black box" from our perspective. And for some reason there seems to be a habitual problem with the data in Webmaster Tools not reflecting reality. For example, one of my sites has roughly 8,000 pages indexed in Google - but Webmaster Tools says there are only 136 and that it last fetched the sitemap files months ago (even though I can clearly see in my logs that it fetches them several times a day).

    That being said, I'd give it a few days - if not a week. What I think you want to be seeing here is that everything is working (no red X's, no "!" warnings) and that will tell you that Google was able to pull the data into their systems where they are being processed.

    I'm not sure how many pages webmaster tools is saying have been indexed - but according to Google's web search interface you have about 1,200: http://www.google.com/search?client=...UTF-8&oe=UTF-8

    Hope this helps a little - at least in explaining where the boundary line is between Smugmug's systems and Google's.

    - Greg
  • rashbrook Registered Users Posts: 92 Big grins
    edited December 13, 2010
    Hi Greg,

    Thanks for your message and all of the info. Maybe the forum is best - since it helps others.

    What I meant to say is that when you go to http://www.ashbrook-photography.com/sitemap-index.xml and look at the file itself, you can see that the last update to the file (by SmugMug) was 10/04, and that none of my current pages are indexed in the sitemap itself.

    So I guess that's more the question: is it reasonable that, on the SmugMug side, my sitemap.xml file hasn't been updated in more than two months? If so, how long does it take to update that file with current info?

    Thanks again - Robert
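
    One way to answer that without waiting on GWT is to read the <lastmod> stamps straight out of the sitemap index, which show when each sitemap file was last regenerated on the server side. A minimal sketch, assuming Python 3 and the sitemap-index.xml URL from the post above:

    ```python
    # List each sitemap named in a sitemap index together with its
    # <lastmod> stamp, to see when the files were last regenerated.
    import urllib.request
    import xml.etree.ElementTree as ET

    INDEX = "http://www.ashbrook-photography.com/sitemap-index.xml"  # from the post above
    NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

    with urllib.request.urlopen(INDEX, timeout=10) as resp:
        root = ET.fromstring(resp.read())

    for sm in root.iter(f"{NS}sitemap"):
        print(sm.findtext(f"{NS}lastmod", default="(no lastmod)"),
              sm.findtext(f"{NS}loc"))
    ```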
  • Twoofy Registered Users Posts: 171 Major grins
    edited December 14, 2010
    OK, today I didn't wake up with the red Xs.

    I did, however, wake up with the "!" yellow triangles.
    From the previous posts, I assume this is just items being released.

    [screenshot: 1126099193_Y9bgD-L.jpg]

    What concerns me is that the images sitemap shows "0" URLs in the web index, and has since day one.

    I have found very few images of mine on Google with my smug site address; I have, however, found many from forums such as Dgrin and P.O.T.N.
    I suspect it has something to do with the fact that "0" images appear in the web index.

    Am I wrong in this assumption???

    You are not wrong in your assumptions, but Google does take some time to crawl and process a site. I do see there are 1,870 pages in Google's main index (http://www.google.com/search?client=safari&rls=en&q=site:ShawnKrausPhoto.com&ie=UTF-8&oe=UTF-8) - how many images do you have, or would you expect to show up in that query? This exemplifies the issue I was alluding to earlier, where the Webmaster Tools statistics are chronically out of date.

    Once the "!" error goes away, we should give Google some time to sort itself out and see how the issue evolves. This type of thing is less like fixing a car and more like baking a cake :)

    - Greg
This discussion has been closed.