Got a sitemap question? Look here.



  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 19, 2011
    cman wrote: »
    1. In this case, SM is the only site that serves its sitemap in the unusual xml.gz format, and it assumes that ALL robots will crawl it.

    2. To see how many errors occur when the above sitemap is submitted to Webmaster Tools, you can read here:

    or here:

    That is just not true. Sitemaps are *designed* to be compressed - it is part of the spec. There is nothing extraordinary about this; it has been done this way for many, many years. If folks writing search engines do not want to follow the specification of the protocol, then they can crawl the site the old-fashioned way.

    The reason for any relatively recent problems with Google is that we renamed some of the sitemap files themselves - not any compressed vs uncompressed thing. The old files are getting red X's because they no longer exist, and the new ones are being processed just fine. If you scroll back one page on this thread (page 8) you can see the screenshots from Bob_A that show this. If Google were not able to process compressed sitemap files, they would not show check marks.

    If you can find a search engine that supports sitemaps but not compressed ones, let me know, I will be happy to contact them.

    - Greg
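    To illustrate the point about compression, here is a minimal sketch of how a crawler might read a sitemap whether or not it is gzipped. The function name and the example.com URLs are made up for illustration; only the gzip magic bytes and the sitemaps.org namespace come from the respective formats.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(payload: bytes) -> list:
    """Return the <loc> values of a sitemap, gzipped or not (hypothetical helper)."""
    # Gzip streams start with the magic bytes 0x1f 0x8b.
    if payload[:2] == b"\x1f\x8b":
        payload = gzip.decompress(payload)
    root = ET.fromstring(payload)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sample data, compressed the way a .xml.gz file would be.
sample_xml = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>http://example.com/gallery/1</loc></url>"
    b"<url><loc>http://example.com/gallery/2</loc></url>"
    b"</urlset>"
)
sample_gz = gzip.compress(sample_xml)

print(urls_from_sitemap(sample_gz))
```

    The same function accepts the uncompressed bytes too, which is roughly why the compressed vs uncompressed distinction should not matter to a crawler that follows the protocol.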
  • WinsomeWorksWinsomeWorks Registered Users Posts: 1,935 Major grins
    edited May 20, 2011
    cman wrote: »
    1. In this case, SM is the only site that serves its sitemap in the unusual xml.gz format, and it assumes that ALL robots will crawl it.

    2. To see how many errors occur when the above sitemap is submitted to Webmaster Tools, you can read here:

    or here:
    Your first link simply goes right back to this thread, so we don't know which post you want us to see. Your second link does not support your argument at all. You evidently didn't read enough of that thread. Even from reading a few of the posts, I could see that your complaint about the compressed sitemap format is not supported by anything it says. In that example, the OP's problem was not the compression or xml.gz at all; it was something else. That person even said their site was being crawled (or whatever) just fine. Quote: "...When I go to my reports though it shows success...." Anyway, did you not see Greg's (Twoofy's) post #151 here in this thread? If you look at the sitemaps info, it completely supports what he's telling you. I would think that source just might be able to be trusted on this, right? If you're not going to trust SmugMug, at least trust that. :D
    Anna Lisa Yoder's Images - ... Handmade Photo Notecards: ... Framed/Matted work: ... Scribbles:
    DayBreak, my Folk Music Group (some free mp3s!)
  • Bob_ABob_A Registered Users Posts: 92 Big grins
    edited May 20, 2011
    Twoofy wrote: »
    Hi Bob,

    The sitemap-index has links to the following sitemaps:
    I checked all 3 of those and they appear to be okay from all the tests I have run. You should have these three showing up (hopefully with no red X's) and anything else should be giving you a 404 error. The 404 error is basically our server saying that the files do not exist anymore, which they do not. It just takes Google a while to realize that it is intentional.

    Make sure you have submitted only the sitemap-index.xml.gz file and it should pick these up and eventually the old ones should go away.

    - Greg

    Hopefully this will be my last (dumb) question for a while :)

    I deleted all of my sitemaps that I submitted then resubmitted sitemap-index.xml.gz and it gives me a red X. However, I noticed that while there was 1 submission "by me" there is a button that says "all 2". When I click that, sitemap-index.xml shows up. If I then click sitemap-index.xml the three sitemaps you referenced above are displayed.

    Does this mean that Google found my sitemap(s) and I shouldn't be submitting manually? Just wondering if the red X for the sitemap-index.xml.gz file is because the resulting sitemaps are already being used.
  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 20, 2011
    Bob_A wrote: »
    Hopefully this will be my last (dumb) question for a while :)

    I deleted all of my sitemaps that I submitted then resubmitted sitemap-index.xml.gz and it gives me a red X. However, I noticed that while there was 1 submission "by me" there is a button that says "all 2". When I click that, sitemap-index.xml shows up. If I then click sitemap-index.xml the three sitemaps you referenced above are displayed.

    Does this mean that Google found my sitemap(s) and I shouldn't be submitting manually? Just wondering if the red X for the sitemap-index.xml.gz file is because the resulting sitemaps are already being used.

    Yeah, I'd actually pull out sitemap-index.xml.gz and leave it at sitemap-index.xml, because that is the one we put in the robots.txt. It's not hurting anything, but it's not helping either.

    - Greg

    P.S. It is a delight to answer your questions, keep them coming if you have any more :)
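    For anyone curious how a crawler finds sitemap-index.xml without a manual submission, here is a rough sketch of pulling the Sitemap lines out of a robots.txt body. The sample file contents, including the Disallow rule and the example.com URL, are assumptions for illustration and not a copy of the real robots.txt.

```python
def sitemap_urls_from_robots(robots_txt: str) -> list:
    """Collect the URLs named on 'Sitemap:' lines of a robots.txt body."""
    found = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split at the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


# Hypothetical robots.txt contents.
sample_robots = """\
User-agent: *
Disallow: /include/
Sitemap: http://example.com/sitemap-index.xml
"""

print(sitemap_urls_from_robots(sample_robots))
```

    Because the sitemap is discovered this way, a second, manually submitted copy of the same index adds nothing.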
  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 21, 2011
    pgman wrote: »

    I'm not complaining but just trying to figure out what's happening:
    <lastmod> 2011-05-13T07:19:52Z
    <lastmod> 2011-05-13T07:19:51Z

    This was as of 18-May-2011 12:08pm PDT.

    In the last 5 days, I've uploaded a dozen photos, made gallery changes, changed descriptions... What's the expected turnaround after making changes?


    Just FYI, I believe that your sitemap has been rebuilt - I just checked the queue and do not see it there. Let me know if you are still seeing a problem with it.

    - Greg
  • OnFirePhotographyOnFirePhotography Registered Users Posts: 71 Big grins
    edited May 21, 2011
    I've posted this screenshot again as I didn't get any response last time.
    Do I need the 2 sitemaps with crosses? What is the reason for the crosses? Does it mean the sitemaps are wrong or haven't been read yet?

  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 21, 2011
    I've posted this screenshot again as I didn't get any response last time.
    Do I need the 2 sitemaps with crosses? What is the reason for the crosses? Does it mean the sitemaps are wrong or haven't been read yet?


    Oops, sorry I did not see your original message. If you click sitemap-index you should see several sitemaps listed underneath it. The ones with the red X's are okay, those do not exist any more as they have been renamed.

    I do see that it says "May 10th" as the last date they were downloaded. You may want to change something on your site (upload a new photo, change a keyword, etc) to trigger a sitemap update if you haven't done that recently. Otherwise it looks like everything is okay - the red X's are expected on those two files and will soon go away.

    - Greg
  • Bob_ABob_A Registered Users Posts: 92 Big grins
    edited May 25, 2011
    Another question (and I apologize if this is the wrong thread).

    Webmaster Tools is reporting the following crawl errors:

    HTTP (1)
    403 error, May 7, 2011

    In Sitemaps (2)
    503 error (unavailable), Apr 19, 2011
    truck 503 error (unavailable), Apr 18, 2011

    Not followed (1)
    Redirect error, May 10, 2011

    Not found (1)
    404 (Not found), 19 pages, May 22, 2011

    Unreachable (5)
    503 error, Apr 19, 2011
    503 error, Apr 19, 2011
    truck 503 error, Apr 18, 2011
    robots.txt unreachable, Apr 15, 2011
    robots.txt unreachable, Apr 15, 2011

    I'm also getting a bunch for "Restricted by Robots.txt (131)", but these I'd expect.

    Do I need to be concerned about any of the above, and if so how do I correct the errors?
  • OnFirePhotographyOnFirePhotography Registered Users Posts: 71 Big grins
    edited May 25, 2011
    I have the /include one too, from 24th April. I don't know what it means though.
  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 26, 2011
    I am going to try and answer the last 3 posts in one reply. Hopefully this is not too confusing.

    The /include one that y'all see is interesting. It is not in any of the sitemaps that we are generating - my guess is that there is something linking to it and Google is crawling it from there - but it's not sitemap related. 403 is the right error code for this (aka "permission denied").

    The 404 on sitemap-index.xml.gz - remove that one if you can. We include sitemap-index.xml from robots.txt. Feel free to add that in if you want, but it's not required - especially for Google. Either one works, but for consistency, if you want to add one manually, using the one that we specify in robots.txt probably makes the most sense.

    503 on images is a temporary error; Google will retry later.

    The 503's on robots.txt would be concerning if they were more recent. Google refetches robots.txt pretty much daily, and it shows these errors for quite a while before they drop off. These appear to be from almost a month ago.

    I tried the URL with the redirect error and I'm not seeing the same thing. My guess is that it was transient as well.

    Hopefully I covered everything - in summary this all looks reasonably okay, with the exception of the /include thing. We need to figure out what URL is linking to that, but it's not anything in the sitemaps as far as I can tell.

    - Greg
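    As a quick reference for the status codes discussed above, here is a small sketch that maps a crawl error to the reading given in this thread. The function and the wording of the notes are illustrative only; they are not output from Webmaster Tools.

```python
# Notes paraphrase the explanations given in this thread.
CRAWL_ERROR_NOTES = {
    403: "permission denied; the server refuses to serve this URL",
    404: "not found; expected for files that were renamed or removed",
    503: "service unavailable; usually temporary, crawlers retry later",
}


def triage_crawl_error(status: int) -> str:
    """Return a short human-readable note for an HTTP status code."""
    if status in CRAWL_ERROR_NOTES:
        return CRAWL_ERROR_NOTES[status]
    if 500 <= status < 600:
        return "server error; check again after the next crawl"
    if 400 <= status < 500:
        return "client error; check the URL that was requested"
    return "not an error status"


for code in (403, 404, 503):
    print(code, "->", triage_crawl_error(code))
```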
  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 26, 2011
    Thanks Greg,
    I guess we will just have to wait and see if the 403 /include error goes away.

    I am going to try and figure out what's linking to it - it is not right. I am guessing that there is a log somewhere showing this and the HTTP referrer that sent it there. It isn't hurting anything on the crawling/indexing side, but whatever is linking to it shouldn't be.

    - Greg
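    A rough sketch of the kind of log search described above: count which referrers led to 403 responses under /include. It assumes the server writes the common "combined" access log format, which is a guess since the thread does not say what the logs look like; the sample lines, addresses, and URLs are invented.

```python
import re
from collections import Counter

# Matches the request, status, size, and referrer fields of a
# combined-format access log line (an assumed format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)"'
)


def referrers_for(log_lines, path_prefix, status):
    """Count referrers of requests under path_prefix that got the given status."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        if (match.group("path").startswith(path_prefix)
                and int(match.group("status")) == status):
            counts[match.group("referrer")] += 1
    return counts


# Invented sample lines for illustration.
sample_log = [
    '192.0.2.1 - - [07/May/2011:10:00:00 -0700] "GET /include/foo.js HTTP/1.1" 403 199 "http://example.com/some-page" "Googlebot/2.1"',
    '192.0.2.1 - - [07/May/2011:10:00:05 -0700] "GET /gallery/1 HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.2 - - [08/May/2011:11:30:00 -0700] "GET /include/bar.js HTTP/1.1" 403 199 "http://example.com/some-page" "Googlebot/2.1"',
]

print(referrers_for(sample_log, "/include", 403).most_common())
```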
  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 29, 2011
    Twoofy wrote: »
    [...]but let me ask you: if you had to trade the number of results indexed in Google's primary index for images, would that be preferable? I haven't done any research to see if this is even what might be going on, but I'm curious what your thoughts would be on that.

    - Greg
    Greg, I'm just getting caught up on this thread (sitemaps). I've got quite a bit of experience with sitemaps and SEO, etc... My experience is that indexed images can generate much more traffic to a web site than an indexed page does. My vote is to allow both! Let the Google Image Bot index as much as they want! :D
  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 29, 2011
    Twoofy wrote: »
    I would not say it is "the" issue, but if right-click protection is turned on there is no way that Google will be able to grab your images. I was surprised to discover this too, to be honest :)

    - Greg
    I'm quite sure that Google does not grab images. What they do is index the URL to the image itself and then they display those images in the search engine results page.

    For example, do a search for "miracle on ice team" (without quotes) and look at the urls displayed for the images. These images are not on a google server. It might look like they are because the link that goes to each image begins with "", but this is most likely done so that Google can record the click on the image before they take users to the location where the image lives.

    If you click on an image in a Google search page, you'll be taken to the web site where the image lives (the address bar will still start with ""), but if you right-click on the image itself, you'll see that it is located on the domain where it is hosted.

    There's really no reason for Google to grab images unless they were placing them on their servers. I personally don't think that enabling the right click setting has any impact on Google's ability to index an image url...
  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 29, 2011
    Andy wrote: »
    Site maps are automatically done by SmugMug, you need do nothing. There is more info on Google Image Search here: Read from here down

    Andy, honestly there is one more important piece for everyone to remember. If you really want Google to keep up with changes you make to your site (adding images, etc.), be sure to re-submit your sitemap after you've made major updates (added several images, new galleries, etc.), once a day at most.

    I don't think SmugMug is doing this automatically, (although they really should... for sites where changes like these are made).

    Not only will this help Google locate new images, URLs, etc., it also lets them know that you are constantly updating your web site. This can have a positive impact on search rankings.
  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 29, 2011
    jfriend wrote: »
    You put the appropriate meta tag in the head section of the advanced customization. Smugmug will then include that on every page on your site and Google will then believe that you own that domain.

    Yes, and everyone should do the same for Yahoo and Bing as well. Just do a search for Yahoo Webmaster Tools or Bing Webmaster Tools and you'll find the right place to do this.
  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 29, 2011
    Seems like my question was missed, about renaming images having no effect on the sitemap or the name of the image on the site... does anyone have any thoughts?

    When you change an image name, it breaks the link that Google has indexed for the original image URL. This is bad. Ideally, what should be done is a 301 redirect to the new URL for the image that was renamed. I don't think SmugMug does this, so the best option right now is to NOT rename your images.

    In Google's eyes, an image or page that has been renamed is a broken link. Ideally, what should be done is to create a 301 redirect to the new URL. Google takes this information and updates their search index with it. Without a 301 redirect, Google sees the old URL as a broken link. They won't know you've renamed an image or page URL, so your broken link will sit there for a long time until Google removes it from their index.

    Your new image name will likely get indexed before the old one is removed by Google...
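    To make the 301 idea concrete, here is a minimal sketch of the lookup a server could do when an image has been renamed. This is not how SmugMug behaves (as said above, it does not appear to do this); the paths and the rename table are hypothetical.

```python
# Hypothetical record of renames: old path -> new path.
RENAMED = {"/photos/old-name.jpg": "/photos/new-name.jpg"}

# Hypothetical set of paths that currently exist.
EXISTING = {"/photos/new-name.jpg"}


def respond(path: str):
    """Return (status, redirect_target) for a requested path."""
    if path in RENAMED:
        # 301 Moved Permanently tells a crawler to replace the old URL
        # in its index with the new one.
        return 301, RENAMED[path]
    if path in EXISTING:
        return 200, None
    # Without a redirect, a renamed file just looks like a broken link.
    return 404, None


print(respond("/photos/old-name.jpg"))
print(respond("/photos/never-existed.jpg"))
```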
  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 29, 2011
    Twoofy wrote: »
    Hi there,

    I see the same data you are and checked our queue - I see your site in the queue to be processed. Right now there are a quite a few entries in the queue. [...]
    - Greg

    Does SmugMug re-submit those updated sitemaps to Google, Bing and Yahoo? Or are you only updating the sitemap for their next visit?
  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 29, 2011
    Bob_A wrote: »
    This is the error for both of them. Note that they've been working fine for almost a year.

    General HTTP error: 404 not found
    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.

    HTTP Error: 404
    Problem detected on: May 18, 2011
    Bob, also... occasionally Google has their own issues as well. I would resubmit your sitemap, wait a few days (preferably a week) and see if these clear up before taking any other action.
  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 29, 2011
    anim8tr wrote: »

    Does SmugMug re-submit those updated sitemaps to Google, Bing and Yahoo? Or are you only updating the sitemap for their next visit?

    We do not resubmit them automatically, but the system is designed to work reasonably without it (though I do agree it is sometimes helpful to resubmit if things are taking too long or if you've made lots of changes). Google re-crawls the robots.txt daily (though that file is unlikely to change). Referenced therein is the link to sitemap-index.xml, which it probably crawls a little less frequently; I do not have any specific metrics on that, but it seems nearly as frequent as robots.txt. I imagine Google has pretty sophisticated algorithms to measure how frequently things change and may back off on that rate for sites that change rarely.

    Once Google has crawled the sitemap-index.xml everything is fine, because in there we list each individual sitemap file with the last date something changed within it (which is probably why you hear us saying so often to add/change/delete something in a gallery). If it's changed, Google will crawl it nearly immediately, and usually the pages within an hour or two. But then it does take some time for that to work its way into Google's index.

    - Greg
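    Here is a small sketch of the lastmod mechanism described above: given a sitemap index, pick out the sitemap files whose lastmod is newer than the previous crawl. How Google actually schedules its crawls is not public, so this only shows the idea; the file names, URLs, and crawl date are invented.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(value: str) -> datetime:
    """Parse a W3C datetime such as 2011-05-13T07:19:52Z into an aware datetime."""
    # Older Python versions reject a trailing "Z", so spell out the offset.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # date-only values
    return parsed


def sitemaps_changed_since(index_xml: bytes, last_crawl: datetime) -> list:
    """Return <loc> values of index entries modified after last_crawl."""
    root = ET.fromstring(index_xml)
    changed = []
    for entry in root.findall("sm:sitemap", NS):
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = entry.findtext("sm:lastmod", default="", namespaces=NS).strip()
        # Entries without a lastmod are treated as possibly changed.
        if loc and (not lastmod or parse_lastmod(lastmod) > last_crawl):
            changed.append(loc)
    return changed


# Invented sample index for illustration.
sample_index = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://example.com/sitemap-galleries.xml.gz</loc><lastmod>2011-05-13T07:19:52Z</lastmod></sitemap>
  <sitemap><loc>http://example.com/sitemap-images.xml.gz</loc><lastmod>2011-05-20T09:00:00Z</lastmod></sitemap>
</sitemapindex>"""

last_crawl = datetime(2011, 5, 18, tzinfo=timezone.utc)
print(sitemaps_changed_since(sample_index, last_crawl))
```

    This is why changing something in a gallery matters: it bumps the lastmod of that sitemap file, which is the signal a crawler can use to fetch it again.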
  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 29, 2011
    anim8tr wrote: »
    I'm quite sure that Google does not grab images. What they do is index the URL to the image itself and then they display those images in the search engine results page.

    Naah, I just took one of my servers down to test it. I was still able to right-click save the image from Google.

    Since I see you are browsing the forums I will leave my server down for the next 20 or so minutes so you can see the difference. The "background page" (the one containing the image) does not show up - but that image is being served up from Google.

    Edit: I kinda want to bring my server back up, so here is the link to the image Google is serving up; it is on a Google-owned domain name.

    Oh, I forgot one more thing that makes this problematic. Say you have a website that shows an image with a link pointing to a specific image in one of your galleries, a visitor to SmugMug would not be able to right-click save it (if you have enabled that option), but on your other site they would be able to. Even for cases where Google is showing an image linked to from a remote site it is still possible to right-click save it. Or at least it was, I was pleasantly surprised to see that Google has made that a bit more difficult with a recent update.

    Hope this makes sense, if not let me know and I'll elaborate more.

    - Greg
  • carolinecaroline Registered Users Posts: 1,302 Major grins
    edited May 30, 2011
    I keep returning to this thread to try and understand more about what is going on with sitemaps, but I'm not getting any clearer about it.

    I have some questions:-

    Where do I find my site map?
    How can I or you check that everything is OK with it?
    How do I actually re-submit it?
    If there is something wrong what should I do?

    That's cutting it right down to the basics. I do understand that Smugmug periodically resubmits but gather it is desirable to do this more frequently.

    Mendip Blog - Blog from The Fog, life on the Mendips - Follow me on G+

  • richpepprichpepp Registered Users Posts: 360 Major grins
    edited May 30, 2011
    caroline wrote: »
    I have some questions:-

    I can't answer all of them but I'll tell you what I know (or believe that I know). Maybe a Smuggie can chip in with the rest

    >Where do I find my site map?

    It's at the very top of your site and is called sitemap-index.xml. So for you it is at

    If you open that up you will then see a few other files with a .gz extension. You can download those and open them with an unzipping program. However that is the difficult way to do it :)

    What most people do is to sign up with Google Webmaster Tools. They will then give you a special 'code' that they want you to add to your site e.g. UA-851248-9. You put that in your Control Panel, Settings, Google Analytics box. Google will then know that you own the site. Then, in Google Webmaster Tools, on the left hand menu you can click on 'Site Configuration' then 'Sitemaps' to see your sitemap. If you double click on the sitemap you will see the 'sub files' that I mentioned earlier. That page will also tell you how many of those images it has in its index. Be careful here though as I believe that number refers to web search and not image search but I may be wrong.

    >How can I or you check that everything is OK with it?

    In Webmaster Tools there will be a happy green tick mark beside the file if Google is happy with it and a red X if it isn't.

    >How do I actually re-submit it?

    In Webmaster Tools there is a 'resubmit' button beside each file. It also tells you the date that Google last read the file. The last time ours was read was the 29th of May. That doesn't mean that SmugMug resubmitted it then - just that the 29th was the last date that Google read that file, as it checks them regularly.

    >If there is something wrong what should I do?

    I would check here first. The file is generated by Smugmug so if there is a problem then someone has probably already spotted it and it is being worked on. That is certainly what I have found

    Good luck

  • chipjchipj Registered Users Posts: 149 Major grins
    edited May 30, 2011
    caroline wrote: »
    I keep returning to this thread to try and understand more about what is going on with sitemaps, but I'm not getting any clearer about it.

    I have some questions:-

    Where do I find my site map?
    How can I or you check that everything is OK with it?
    How do I actually re-submit it?
    If there is something wrong what should I do?

    That's cutting it right down to the basics. I do understand that Smugmug periodically resubmits but gather it is desirable to do this more frequently.



    I've seen this question asked before without an answer, so I'll just go ahead and help. Your sitemap file is located at:

    So if your smugmug domain name is, then your sitemap would be located at

    In order to check that it is working okay, you'd need to set up a webmaster account with Google, Yahoo, and Bing. You can do this at the following locations:
    For Google:
    For Yahoo:
    For Bing:

    Once you set up your logins and accounts, you can go to these three sites to see if there are issues with your sitemap. They may show issues immediately, but you have to give it a day or two to make sure.

    You can resubmit your sitemap to each search engine by logging into each site and then go to their respective sitemap submit tool.

    If there is something wrong I would imagine that you should contact SmugMug directly, although occasionally you will find that Google (for example) will display an error one day, but then it goes away within the next day or two. Google does not have a perfect system, but if an error shows up, you should really give it a day or two to see if it goes away before contacting SmugMug.
  • carolinecaroline Registered Users Posts: 1,302 Major grins
    edited May 30, 2011
    Hi Rich and Anim8tr,

    Looks like I already had a code in my control panel and I have a Google Webmaster account - had forgotten about this completely.

    However I have a big fat red X and the following comments:-

    Line Status Details

    General HTTP error: 404 not found
    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.
    HTTP Error: 404
    Problem detected on: May 30, 2011

    Empty Sitemap
    Your Sitemap does not contain any URLs. Please validate and resubmit your Sitemap.
    Problem detected on: Jul 25, 2010

    What next guys?
    Thanks very much for your help:)


  • richpepprichpepp Registered Users Posts: 360 Major grins
    edited May 30, 2011
    caroline wrote: »
    However I have a big fat red X...

    Hi Caroline,

    Could you check something first please. When you click on Sitemaps on the left, the right hand half of the window will show your sitemaps. At the far right hand side of that window it will say 'Show Submissions: By me(_) - All(_)'. Can you click on 'All' and see what you get?

    The reason I ask is that the SmugMug file normally appears under the 'All' link because they submit the sitemap using the robots.txt file. Anything that you submitted last year will appear under the 'by me' link. I think something happened with the sitemaps at some point last year so the file that you submitted may no longer be available.

    If that was the problem then just delete the file in the 'By me' section - you don't need it. The one in the 'All' section is what you need.

    If it isn't that then I don't know - sorry

  • OnFirePhotographyOnFirePhotography Registered Users Posts: 71 Big grins
    edited May 30, 2011
    So does having right-click disabled affect Google finding pics? Despite my hundreds of photos, Google only finds a couple when searching for my site.

    It also only returns links to my site to associated keywords and not the proper galleries? If i search for Onfirephotography in google it returns results such as this
  • carolinecaroline Registered Users Posts: 1,302 Major grins
    edited May 30, 2011
    Hi Rich,
    This is what I see
    richpepp wrote: »

    Thanks for looking, hope to hear from you.


  • TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited May 30, 2011
    caroline wrote: »
    Hi Rich,
    This is what I see
    You can (should) delete the sitemap.xml one.

    As for sitemap-index.xml, we need to go one level deeper: what do you see when you click sitemap-index.xml? You can either paste a screenshot or just post what the error and the date are.

    - Greg
  • carolinecaroline Registered Users Posts: 1,302 Major grins
    edited May 30, 2011
    You can (should) delete the sitemap.xml one - Done

    As for sitemap-index.xml, we need to go one level deeper: what do you see when you click sitemap-index.xml? You can either paste a screenshot or just post what the error and the date are.

    - Greg


    Thanks for taking a look Greg.

