robots.txt 503 Service unavailable error

rashbrook · November 30, 2010

So today, Google finally began to populate some data in Webmaster Tools for my site, but it was mostly errors.

When I looked at the site configuration section, I saw this:

robots.txt file: http://www.ashbrook-photography.com/robots.txt
Downloaded: 57 minutes ago
Status: 503 (Service unavailable)

Although the file is accessible at the link above, the text box that would normally display the robots.txt contents is empty.

The only change I made to the site in the last 24 hours was to point my A record at the proper IP address. For some reason I had set it to a different SmugMug address back when I did the initial setup. (I do have the CNAME set up, but I added the A record so the site would work without the www.)

Other than that, nothing has changed. Since robots.txt can inhibit Google's ability to crawl my site, this concerns me. What can I do?

Thanks,

-Robert

Andy · November 30, 2010

I can see the link just fine, is it working for you? We had no service interruptions today. Let me know?

rashbrook · November 30, 2010

That's the strange thing: I can see the link when I click on it too, but Google is reporting the 503 service error within Webmaster Tools.
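One way to narrow down a mismatch like this is to compare the status code the server returns for different User-Agent strings. The sketch below is only an illustration: the function name `fetch_status` and the User-Agent strings are assumptions, and nothing in this thread confirms that the server treats crawlers differently. A request sent from your own machine also does not come from Google's network, so a 200 here does not prove that Googlebot receives a 200.

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent, opener=urllib.request.urlopen):
    """Return the HTTP status code the server sends for `url`.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without touching the network.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is still what we want.
        return err.code
```

Calling `fetch_status("http://www.ashbrook-photography.com/robots.txt", "Mozilla/5.0")` and then repeating the call with a crawler-style User-Agent string would show whether the two requests get the same answer.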

candron · December 1, 2010

I am a new SmugMug subscriber. I am getting exactly the same problem with Google Webmaster Tools, 503 (Service unavailable), although I can reach my robots.txt file at http://christosandronis.smugmug.com/robots.txt. Any ideas why this is happening?

Thanks,

- Christos

rainforest1155 · December 1, 2010

Rashbrook and Christos,

That's nothing to worry about. It's likely just a temporary glitch: the robots.txt file wasn't available at the moment the Google crawler came by. Google will simply retry later, and then the message will disappear. You can safely ignore this one.

Sebastian

rashbrook · December 1, 2010

Well, I'm still getting 503 (Service unavailable) after 3 days for robots.txt, and now, I'm getting a big red X on my sitemaps too.

I sent an email to support, but I just got back a message saying they could reach the file. I'll give it more time, but I hope someone will be willing to take it a little further if this continues and I need to ask for support in a few days.

markymark · May 18, 2011

rashbrook wrote: »

Well, I'm still getting 503 (Service unavailable) after 3 days for robots.txt, and now, I'm getting a big red X on my sitemaps too.

I sent an email to support, but I just got back a message saying they could reach the file. I'll give it more time, but I hope someone will be willing to take it a little further if this continues and I need to ask for support in a few days.

I'm getting the same issue you're describing.

Is there a way I can edit the robots.txt file that seems to be automatically generated by SmugMug?

When I try submitting my RSS feed as a sitemap, Google tells me that it is restricted by robots.txt.

Andy · May 18, 2011

markymark wrote: »

I'm getting the same issue you're describing.

Is there a way I can edit the robots.txt file that seems to be automatically generated by SmugMug?

When I try submitting my RSS feed as a sitemap, Google tells me that it is restricted by robots.txt.

You don't want or need to submit an RSS feed as a sitemap.

cman · May 19, 2011

Could you kindly show me at least one reputable site that prohibits indexing of its RSS feed?

Twoofy · May 19, 2011

rashbrook wrote: »

Well, I'm still getting 503 (Service unavailable) after 3 days for robots.txt, and now, I'm getting a big red X on my sitemaps too.

I sent an email to support, but I just got back a message saying they could reach the file. I'll give it more time, but I hope someone will be willing to take it a little further if this continues and I need to ask for support in a few days.

The red X's are expected. The names of the sitemap files were changed, so the old sitemap files are going away and have been replaced by files with new names. The entries showing red X's should be replaced by the new files, which will not have them.

As long as you have added sitemap-index.xml.gz to Webmaster Tools you will be fine; this will clear itself out.

- Greg

Twoofy · May 19, 2011

markymark wrote: »

I'm getting the same issue you're describing.

Is there a way I can edit the robots.txt file that seems to be automatically generated by SmugMug?

When I try submitting my RSS feed as a sitemap, Google tells me that it is restricted by robots.txt.

RSS is not the best format to submit to Webmaster Tools. Submit sitemap-index.xml.gz instead, and Google will automatically pick up all of the sitemap files that we provide.

- Greg

Twoofy · May 19, 2011

cman wrote: »

Could you kindly show me at least one reputable site that prohibits indexing of its RSS feed?

http://www.amazon.com/robots.txt

Disallow: /rss/people/*/reviews
Disallow: /gp/pdp/rss/*/reviews

... And they were the first one I picked...

- Greg
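To see how `Disallow` lines with a `*` in them apply to a concrete URL path, here is a minimal sketch of wildcard matching. It assumes the semantics Google documents for robots.txt patterns: `*` matches any run of characters, a trailing `$` anchors the end of the path, and anything else is a prefix match. The function name `rule_matches` and the example paths in the usage note are hypothetical, not real Amazon URLs.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches a URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives prefix-match behaviour.
    return re.match(regex, path) is not None
```

With this sketch, `rule_matches("/rss/people/*/reviews", "/rss/people/SOMEID/reviews")` is true, so a crawler honouring the rule would skip that feed path, while a path outside `/rss/` would not match.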

cabbey · May 19, 2011

cman wrote: »

Could you kindly show me at least one reputable site that prohibits indexing of its RSS feed?

http://facebook.com/robots.txt

Disallow: /feeds/

cman · May 19, 2011

cabbey wrote: »

http://facebook.com/robots.txt

Disallow: /feeds/

Click on http://www.facebook.com/feeds/ - 404!

cman · May 19, 2011

Twoofy wrote: »

http://www.amazon.com/robots.txt

Disallow: /rss/people/*/reviews
Disallow: /gp/pdp/rss/*/reviews

... And they were the first one I picked...

- Greg

Please:

1. Show the sites related to these feeds.
2. At the same time look at these lines in robots.txt:

Sitemap: http://www.amazon.com/sitemap-manual-index.xml
Sitemap: http://www.amazon.com/sitemap_dp_index.xml
Sitemap: http://www.amazon.com/sitemap_vendor_videos_us.xml
Sitemap: http://www.amazon.com/sitemap_vod_index.xml

These are just normal sitemaps.

cabbey · May 19, 2011

cman wrote: »

Click on http://www.facebook.com/feeds/ - 404!

That just tells me they don't have an index handler on that URL... What does that have to do with them disallowing robots from crawling their feeds?

cman · May 19, 2011

cabbey wrote: »

That just tells me they don't have an index handler on that URL... What does that have to do with them disallowing robots from crawling their feeds?

Sorry, but I did not understand what you said.

Twoofy · May 19, 2011

cman wrote: »

Click on http://www.facebook.com/feeds/ - 404!

Facebook feeds sit underneath the /feeds URL, and they block them all. Find a Facebook feed URL (I do not know one off the top of my head) and it will be underneath that URL. It might look something like http://www.facebook.com/feeds/somefeed.rss (not a valid one, I'm sure), and any bot that crawls it will be violating the rules of robots.txt.

- Greg
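The prefix behaviour described above can be checked with Python's standard `urllib.robotparser`. This is a small sketch, not Facebook's actual file: the thread only quotes the `Disallow: /feeds/` line, so the `User-agent: *` line is an assumption added to make the rule apply to every crawler, and `is_allowed` is a hypothetical helper name. The feed URL is the made-up one from the post above.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set: only the Disallow line is quoted in this thread.
RULES = """\
User-agent: *
Disallow: /feeds/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(url, agent="*"):
    """Return True if `agent` may fetch `url` under RULES."""
    return parser.can_fetch(agent, url)
```

Whether http://www.facebook.com/feeds/ itself returns a 404 does not matter to the parser. The rule is applied to the URL path, so anything that starts with /feeds/ is disallowed regardless of what the server would return for it.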

Twoofy · May 19, 2011

cman wrote: »

Please:

1. Show the sites related to these feeds.
2. At the same time look at these lines in robots.txt:

Sitemap: http://www.amazon.com/sitemap-manual-index.xml
Sitemap: http://www.amazon.com/sitemap_dp_index.xml
Sitemap: http://www.amazon.com/sitemap_vendor_videos_us.xml
Sitemap: http://www.amazon.com/sitemap_vod_index.xml

These are just normal sitemaps.

You are making my point. These are sitemap indexes. We do not compress these either (though we could).

Let's look at http://www.amazon.com/sitemap_dp_index.xml for a minute:

<sitemap>
<loc>http://www.amazon.com/sitemap_dp_00001.xml.gz</loc>
<lastmod>2010-03-29</lastmod>
</sitemap>
<sitemap>
<loc>http://www.amazon.com/sitemap_dp_00002.xml.gz</loc>
<lastmod>2010-03-29</lastmod>
</sitemap>
.. (etc)...

Satisfied?

- Greg
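To make the index-versus-sitemap distinction concrete, here is a minimal sketch that lists the child sitemaps named in a sitemap index, using only the standard library. The `SAMPLE_INDEX` string wraps the two entries quoted above in a `sitemapindex` root with the sitemaps.org namespace. That wrapper is an assumption based on the sitemap protocol, not a copy of Amazon's file, and `child_sitemaps` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SAMPLE_INDEX = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://www.amazon.com/sitemap_dp_00001.xml.gz</loc>
<lastmod>2010-03-29</lastmod>
</sitemap>
<sitemap>
<loc>http://www.amazon.com/sitemap_dp_00002.xml.gz</loc>
<lastmod>2010-03-29</lastmod>
</sitemap>
</sitemapindex>
"""


def child_sitemaps(index_xml):
    """Return (loc, lastmod) pairs for each <sitemap> entry in an index."""
    root = ET.fromstring(index_xml)
    return [
        (
            entry.findtext("sm:loc", namespaces=NS),
            entry.findtext("sm:lastmod", namespaces=NS),
        )
        for entry in root.findall("sm:sitemap", NS)
    ]
```

A crawler given the index follows each `loc` to a compressed sitemap file, which is why submitting the single index file is enough.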
