robots.txt 503 Service unavailable error

rashbrook Registered Users Posts: 92 Big grins
edited May 19, 2011 in SmugMug Support
So today, Google finally began to populate some data in Webmaster Tools for my site, but it was mostly errors.

When I looked at the site configuration section, I saw that it says:

robots.txt file Downloaded Status http://www.ashbrook-photography.com/robots.txt 57 minutes ago 503 (Service unavailable)

Although the file is accessible at the link above, there is nothing in the text box that would normally display the robots.txt data.

The only change I made to the site in the last 24 hours was to change my A record to the proper IP address. For some reason I had set it to a different SmugMug address way back when I did the initial setup. (I do have the CNAME set up, but I had added the A record so the site would work without the www.)

Other than that, nothing has changed. Since robots.txt can inhibit Google's ability to crawl my site, this concerns me. What can I do?

Thanks,

-Robert

Comments

  • Andy Registered Users Posts: 50,016 Major grins
    edited November 30, 2010
    I can see the link just fine, is it working for you? We had no service interruptions today. Let me know?
  • rashbrook Registered Users Posts: 92 Big grins
    edited November 30, 2010
    That's the strange thing: I can see the link when I click on it too, but Google is reporting the 503 service error within Webmaster Tools.
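
One way to narrow down a "works in my browser, fails for Google" report is to request the same URL with different User-Agent headers and compare the status codes. The following is only a minimal sketch using Python's standard library. The function names `fetch_status` and `compare_agents` are made up for illustration, and the check only tests the hypothesis that the server answers differently per User-Agent; a server could also respond differently based on the requesting IP address, which this cannot detect.

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is what we want.
        return err.code


def compare_agents(url, agents, fetch=fetch_status):
    """Map each label in `agents` to the status code seen for its User-Agent.

    `fetch` is injectable so the comparison can be tried without network access.
    """
    return {label: fetch(url, user_agent) for label, user_agent in agents.items()}


# User-Agent strings to compare. The browser string is an arbitrary example.
AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 6.1) ExampleBrowser/1.0",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}
```

Calling `compare_agents("http://www.ashbrook-photography.com/robots.txt", AGENTS)` would return a dictionary of status codes per label. If both entries come back as 200, the User-Agent is probably not what triggers the 503.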
  • candron Registered Users Posts: 6 Beginner grinner
    edited December 1, 2010
    I am a new SmugMug subscriber. I am getting exactly the same problem with Google Webmaster Tools, 503 (Service unavailable), although I can get my robots.txt file at http://christosandronis.smugmug.com/robots.txt. Any ideas why this is happening?

    Thanks,

    - Christos
  • rainforest1155 Registered Users Posts: 4,566 Major grins
    edited December 1, 2010
    Rashbrook and Christos,

    That's nothing to worry about. It's likely just a temporary glitch that the robots file wasn't available at the time the Google crawler came by. Google will simply retry at a later time and then the message will disappear. You can safely ignore this one.

    Sebastian
    SmugMug Support Hero
  • rashbrook Registered Users Posts: 92 Big grins
    edited December 1, 2010
    Well, I'm still getting 503 (Service unavailable) after 3 days for robots.txt, and now, I'm getting a big red X on my sitemaps too.

    I sent an email to support, but I just got back a message saying they could reach the file. I'll give it more time, but I hope someone will be willing to take it a little further if this continues and I need to ask for support in a few days.
  • markymark Registered Users Posts: 34 Big grins
    edited May 18, 2011
    rashbrook wrote: »
    Well, I'm still getting 503 (Service unavailable) after 3 days for robots.txt, and now, I'm getting a big red X on my sitemaps too.

    I sent an email to support, but I just got back a message saying they could reach the file. I'll give it more time, but I hope someone will be willing to take it a little further if this continues and I need to ask for support in a few days.

    I'm getting the same issue you're describing.

    Is there a way I can edit the robots.txt file that seems to be automatically generated by SmugMug?

    When I try submitting my RSS feed as a sitemap, Google tells me that it is being restricted by robots.txt.
  • Andy Registered Users Posts: 50,016 Major grins
    edited May 18, 2011
    markymark wrote: »
    I'm getting the same issue you're describing.

    Is there a way I can edit the robots.txt file that seems to be automatically generated by SmugMug?

    When I try submitting my RSS feed as a sitemap, Google tells me that it is being restricted by robots.txt.

    You don't want or need to submit an RSS feed as a sitemap :)
  • cman Registered Users Posts: 75 Big grins
    edited May 19, 2011
    I kindly ask you to show me at least one reputable site where indexing of a feed (RSS) is prohibited.
  • Twoofy Registered Users Posts: 171 Major grins
    edited May 19, 2011
    rashbrook wrote: »
    Well, I'm still getting 503 (Service unavailable) after 3 days for robots.txt, and now, I'm getting a big red X on my sitemaps too.

    I sent an email to support, but I just got back a message saying they could reach the file. I'll give it more time, but I hope someone will be willing to take it a little further if this continues and I need to ask for support in a few days.

    The red X's are expected. The names of the sitemap files were changed, so the old sitemap files are going away (those are the ones showing red X's) and have been replaced by files with new names, which should not show them.

    As long as you have added sitemap-index.xml.gz to Webmaster Tools, you will be fine; this will clear itself up.

    - Greg
  • Twoofy Registered Users Posts: 171 Major grins
    edited May 19, 2011
    markymark wrote: »
    I'm getting the same issue you're describing.

    Is there a way I can edit the robots.txt file that seems to be automatically generated by SmugMug?

    When I try submitting my RSS feed as a sitemap, Google tells me that it is being restricted by robots.txt.

    RSS is not the best format to submit to Webmaster Tools. Submit sitemap-index.xml.gz instead, and Google will automatically pick up all of the sitemap files that we provide.

    - Greg
  • Twoofy Registered Users Posts: 171 Major grins
    edited May 19, 2011
    cman wrote: »
    I kindly ask you to show me at least one reputable site where indexing of a feed (RSS) is prohibited.

    http://www.amazon.com/robots.txt

    Disallow: /rss/people/*/reviews
    Disallow: /gp/pdp/rss/*/reviews

    ... And they were the first one I picked...

    - Greg
  • cabbey Registered Users Posts: 1,053 Major grins
    edited May 19, 2011
    cman wrote: »
    I kindly ask you to show me at least one reputable site where indexing of a feed (RSS) is prohibited.

    http://facebook.com/robots.txt

    Disallow: /feeds/
    SmugMug Sorcerer - Engineering Team Champion for Commerce, Finance, Security, and Data Support
    http://wall-art.smugmug.com/
  • cman Registered Users Posts: 75 Big grins
    edited May 19, 2011
    cabbey wrote: »

    Click on http://www.facebook.com/feeds/ - 404!
  • cman Registered Users Posts: 75 Big grins
    edited May 19, 2011
    Twoofy wrote: »
    http://www.amazon.com/robots.txt

    Disallow: /rss/people/*/reviews
    Disallow: /gp/pdp/rss/*/reviews

    ... And they were the first one I picked...

    - Greg

    Please:

    1. Show the sites related to these feeds.
    2. At the same time, look at these lines in robots.txt:

    Sitemap: http://www.amazon.com/sitemap-manual-index.xml
    Sitemap: http://www.amazon.com/sitemap_dp_index.xml
    Sitemap: http://www.amazon.com/sitemap_vendor_videos_us.xml
    Sitemap: http://www.amazon.com/sitemap_vod_index.xml

    These are just normal sitemaps.
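
For anyone who wants to see which sitemaps a robots.txt advertises without reading the file by hand, here is a small sketch using Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The sample text only reuses the four `Sitemap:` lines quoted above; it is not a copy of the full Amazon robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Only the Sitemap lines quoted in this thread; the real file has many more rules.
ROBOTS_TXT = """\
Sitemap: http://www.amazon.com/sitemap-manual-index.xml
Sitemap: http://www.amazon.com/sitemap_dp_index.xml
Sitemap: http://www.amazon.com/sitemap_vendor_videos_us.xml
Sitemap: http://www.amazon.com/sitemap_vod_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed URLs in file order, or None if there are none.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```

Sitemap lines are independent of any User-agent group, so they are picked up no matter where they appear in the file.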
  • cabbey Registered Users Posts: 1,053 Major grins
    edited May 19, 2011
    cman wrote: »

    That just tells me they don't have an index handler on that URL. What does that have to do with them disallowing robots from crawling their feeds?
    SmugMug Sorcerer - Engineering Team Champion for Commerce, Finance, Security, and Data Support
    http://wall-art.smugmug.com/
  • cman Registered Users Posts: 75 Big grins
    edited May 19, 2011
    cabbey wrote: »
    That just tells me they don't have an index handler on that URL. What does that have to do with them disallowing robots from crawling their feeds?

    Sorry, but I did not understand what you said.
  • Twoofy Registered Users Posts: 171 Major grins
    edited May 19, 2011
    cman wrote: »

    Facebook feeds sit underneath the /feeds/ URL, and they block them all. Find a Facebook feed URL (I do not know one off the top of my head) and it will be underneath that URL. Maybe something like http://www.facebook.com/feeds/somefeed.rss (not a valid one, I'm sure), and any bot that crawls it will be violating the rules in robots.txt.

    - Greg
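
The point that a `Disallow: /feeds/` rule covers every URL underneath that path, whether or not /feeds/ itself returns a page, can be checked with Python's standard `urllib.robotparser`. This is a sketch under stated assumptions: the `User-agent: *` line is added here so the rule has a group to belong to (the thread only quotes the `Disallow` line), `ExampleBot` is a made-up crawler name, and the feed URL is the hypothetical one from the post above. Python's parser uses plain prefix matching, so it would not interpret the `*` wildcards in the Amazon rules the way Google does, which is why the Facebook rule is used.

```python
from urllib.robotparser import RobotFileParser

# Assumed minimal robots.txt: only the Disallow line is quoted in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /feeds/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical feed URL underneath /feeds/, as in the post above.
feed_allowed = parser.can_fetch("ExampleBot", "http://www.facebook.com/feeds/somefeed.rss")
# A URL outside /feeds/ is not affected by the rule.
page_allowed = parser.can_fetch("ExampleBot", "http://www.facebook.com/somepage")

print("feed allowed:", feed_allowed)
print("page allowed:", page_allowed)
```

The 404 on /feeds/ itself says nothing about the rule: robots.txt matching is done on the URL path alone, before any request is made.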
  • Twoofy Registered Users Posts: 171 Major grins
    edited May 19, 2011
    cman wrote: »
    Please:

    1. Show the sites related to these feeds.
    2. At the same time, look at these lines in robots.txt:

    Sitemap: http://www.amazon.com/sitemap-manual-index.xml
    Sitemap: http://www.amazon.com/sitemap_dp_index.xml
    Sitemap: http://www.amazon.com/sitemap_vendor_videos_us.xml
    Sitemap: http://www.amazon.com/sitemap_vod_index.xml

    These are just normal sitemaps.


    You are making my point. These are sitemap indexes. We do not compress these either (though we could).

    Let's look at http://www.amazon.com/sitemap_dp_index.xml for a minute:
    <sitemap>
    <loc>http://www.amazon.com/sitemap_dp_00001.xml.gz</loc>
    <lastmod>2010-03-29</lastmod>
    </sitemap>
    <sitemap>
    <loc>http://www.amazon.com/sitemap_dp_00002.xml.gz</loc>
    <lastmod>2010-03-29</lastmod>
    </sitemap>
    .. (etc)...
    

    Satisfied? :)

    - Greg
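
To make the index-versus-sitemap distinction concrete, here is a minimal Python sketch that reads a sitemap index and lists the child sitemap files it points to. The two entries are the ones quoted above; the `<sitemapindex>` wrapper and its namespace follow the sitemaps.org protocol and are assumed here, since the excerpt in the post does not show them. The function name `child_sitemaps` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample index built from the two entries quoted in the post (wrapper assumed).
SAMPLE_INDEX = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.amazon.com/sitemap_dp_00001.xml.gz</loc>
    <lastmod>2010-03-29</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.amazon.com/sitemap_dp_00002.xml.gz</loc>
    <lastmod>2010-03-29</lastmod>
  </sitemap>
</sitemapindex>
"""


def child_sitemaps(index_xml):
    """Return (loc, lastmod) pairs for every <sitemap> entry in an index."""
    root = ET.fromstring(index_xml)
    return [
        (
            entry.findtext("sm:loc", namespaces=NS),
            entry.findtext("sm:lastmod", namespaces=NS),
        )
        for entry in root.findall("sm:sitemap", NS)
    ]


for loc, lastmod in child_sitemaps(SAMPLE_INDEX):
    print(loc, lastmod)
```

Submitting the single index file is enough because a crawler can walk the `<loc>` entries in this way to find each compressed sitemap.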