Got a sitemap question? Look here.

Andy Registered Users Posts: 50,016 Major grins
edited November 11, 2016 in SmugMug Support
Hi folks. While we're working on adjustments to how Google (and other search engines) crawl SmugMug sites, it's entirely possible you will see crawl errors in your Google Webmaster Tools. We are removing some URLs from the sitemaps that were not getting indexed or really should never have been in them in the first place. The last thing we want to do is drive traffic to irrelevant pages that do not showcase everyone's beautiful photography. These errors should go away in the coming weeks.

Everyone's patience is appreciated while we do this.

NORMAL (awesome) indexing of your SmugMug sites (and great SEO) is NOT AFFECTED by this. We continue to place an extremely high value and importance on your sites and galleries getting found by Google and other search engines. Please be sure to do everything we suggest here, if you've not already done so: http://smugmug.com/help/search-engines

NOTE ON GOOGLE IMAGE SEARCH: Google Image Search relies heavily on finding images that are linked elsewhere (your blog, other websites, Dgrin, etc.). We are constantly talking to and working with Google to get better results for SmugMug images in GIS.

Got questions? Ask 'em here.

Comments

  • jachang Registered Users Posts: 183 Major grins
    edited February 9, 2011
    Andy,

    What about the red X's on the Webmaster Tools site page? I still have the big red X on there.

    Jean
  • Andy Registered Users Posts: 50,016 Major grins
    edited February 9, 2011
    jachang wrote: »
    Andy,

    What about the red X's on the Webmaster Tools site page? I still have the big red X on there.

    Jean

    Can I see what you're talking about please - make a grab and post it?
  • jachang Registered Users Posts: 183 Major grins
    edited February 9, 2011
    Andy wrote: »
    Can I see what you're talking about please - make a grab and post it?

    Right after I had sent my message here, I resubmitted my sitemaps, so one still shows pending, but it had the same red X as the images sitemap. Here is the screenshot of the sitemap page. I'll put the error screenshot in the next message. I can only attach one at a time, I guess.

    Thanks,

    Jean
  • jachang Registered Users Posts: 183 Major grins
    edited February 9, 2011
    And here is the error message
    Screenshot of the error message:

    Also, I noticed that others here said that their crawl stats dropped down to zero on January 31. That's when mine fell to zero also.
  • Andy Registered Users Posts: 50,016 Major grins
    edited February 9, 2011
    Stay tuned, Jean, while we make changes to the sitemaps. These errors will go away.
  • OffTopic Registered Users Posts: 521 Major grins
    edited February 10, 2011
    Andy wrote: »

    NORMAL (awesome) indexing of your SmugMug sites (and great SEO) is NOT AFFECTED by this.

    I love you guys and have been a happy pro member for several years. I'm not complaining, just concerned, and I want to make sure you are aware of how bad the problem is getting on our end, that's all.

    As of this morning I now have 100,000 URLs restricted by robots.txt. I didn't worry about that at first, as I keep seeing the posts saying that these are URLs that we don't want crawled...until I realized the other day that my site is barely being crawled again!

    1183349221_8G47H-L.png

    How can I look at that and not be concerned? It's almost as bad as it was in December. I know I should have more pages than that being crawled.

    I learned years ago that 99% of my "people who found me via search engines" traffic comes through my blog, so I make sure to blog any images/galleries that I want found, and it works very well. But I still want my website crawled.
  • Twoofy Registered Users Posts: 171 Major grins
    edited February 10, 2011
    OffTopic,

    To be honest, your posts here are some of my favorites - as an engineer you always provide very detailed information that I can actually investigate. I did a cursory look at your sitemaps and they are correct and as expected, scary as those numbers seem.

    On another post, someone had a very interesting way of looking at this: what we want to have is a single URL that Google indexes for each of your photos, not 5,000 links all pointing to the same place. This can be seen as "spam" by many search engines and actually lower your rank.

    The red X's in Webmaster Tools are just Google being out of date with the changes. I use Webmaster Tools for all of my sites and have yet to see it report very accurate information.
    One of the changes that you will probably see is that the crawl rate will go down, because there are actually fewer spam pages for Google to crawl - and we are better able to inform Google when the pages we want indexed have actually changed. I am a bit surprised that it has dropped for you so suddenly though - I will look into it.

    Someone (and I apologize, I'm not sure if it was you, as I am typing this on my Wii) mentioned January 31st. That is a known blip where we had to throttle back the bots and temporarily give them 503 (try again later) results on some pages - they were heavily impacting site usability that day.
    Many of these changes will bring dramatic improvements for much of the rest of the site too, as bot traffic is very unpredictable and can randomly spike through the roof.

    Thanks again for your thoughtful posts! I will post here again in a day or two after I've kept an eye on your site to make sure everything is happening the way we expect it to.

    - Greg
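
For readers curious what "throttle back the bots and temporarily give them 503" can look like, here is a minimal sketch. It is only an illustration, not SmugMug's actual code: the function name `crawler_response`, the `BOT_MARKERS` list, and the one-hour `Retry-After` value are all assumptions made up for this example.

```python
# Hypothetical sketch: decide what HTTP status a request should get
# while the site is under heavy load. All names here are illustrative.

# Substrings used to recognise crawler User-Agent strings (assumed list).
BOT_MARKERS = ("googlebot", "bingbot", "slurp")


def crawler_response(user_agent, site_overloaded, retry_after_seconds=3600):
    """Return (status_code, headers) for a request.

    Crawlers get a temporary 503 with a Retry-After hint while the site
    is overloaded; everyone else, and crawlers at normal load, get 200.
    """
    is_bot = any(marker in user_agent.lower() for marker in BOT_MARKERS)
    if is_bot and site_overloaded:
        # 503 signals a temporary condition, so the crawler should retry later.
        return 503, {"Retry-After": str(retry_after_seconds)}
    return 200, {}


if __name__ == "__main__":
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1)"
    print(crawler_response(googlebot, site_overloaded=True))
    print(crawler_response(googlebot, site_overloaded=False))
    print(crawler_response("Mozilla/5.0 (Windows NT 6.1)", site_overloaded=True))
```

A 503 tells the crawler the condition is temporary, which is why a short throttle like the one on January 31st would show up as a dip in crawl stats.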
  • jachang Registered Users Posts: 183 Major grins
    edited February 10, 2011
    Not crawled since January 31st
    Greg,

    I think I'm the person you were referring to who mentioned January 31st. See my post above. Exact same problem. My crawls have not increased since January 31st. They're staying close to zero!

    I know web search and image search are two different animals, but it seems like before it was only image search that was broken--now both web and image search have problems! We keep being told to "link our images to our blogs in order to get found by image search engines." I've been told that none of my images come up in Google image search because "Google doesn't find them interesting," or some other excuse. Well, if they're not interesting to Google, why would they be found on our blogs any better than on SmugMug?? In fact, the images on my blog were ALL found within a matter of days, and if I do an image search, I find them. Same thing with Flickr images. It's only SmugMug that isn't working, and the really sad part is--the others are free! I'm paying not to be found on SmugMug. I've been a loyal SmugMug Pro user for about three years, and nothing has changed since day 1 as far as being found by image search.

    Thankfully, web search works great--I come up on page one. But now that everything seems to have died on January 31st I'm wondering if I'll continue to be found at all.

    Okay, I got my rant off my chest. I love SmugMug and all the folks on Digital Grin--really I do--it's just so frustrating to be told "Wait...wait...wait" "We're working on it..." I know you ARE working on it, but after three years of hearing this, it's starting to get a little old.

    Thanks for listening, and if you can check out why I haven't been crawled since January 31st, please let me know.

    Love (yes, I still love SmugMug),

    Jean
  • Andy Registered Users Posts: 50,016 Major grins
    edited February 10, 2011
    OffTopic wrote: »

    How can I look at that and not be concerned?

    Because I specifically told you and all customers not to be.
    We're working on having Google crawl only the most important stuff. Please re-read my first post, thanks!
  • cman Registered Users Posts: 75 Big grins
    edited February 10, 2011
    File robots.txt:
    Disallow: /date/
    Disallow: /hack/
    Disallow: /keyword/
    Disallow: /popular/

    Please give users the opportunity (through the advanced customizer) to decide what to index and what not to index.
  • Twoofy Registered Users Posts: 171 Major grins
    edited February 10, 2011
    Hello Jean,

    Can you PM me some examples of the image search issue? Pretend I'm a 6-year-old and walk me through, step by step, what you are expecting to see and are not seeing.

    - Greg
  • Andy Registered Users Posts: 50,016 Major grins
    edited February 10, 2011
    cman wrote: »
    File robots.txt:


    Please give users the opportunity (through the advanced customizer) to decide what to index and what not to index.

    I'm sorry, we cannot do that :( I wish I had a better, different answer for you!
  • cman Registered Users Posts: 75 Big grins
    edited February 10, 2011
    Rather than allowing indexing, you banned everything that can be indexed.
  • Twoofy Registered Users Posts: 171 Major grins
    edited February 10, 2011
    cman wrote: »
    Rather than allowing indexing, you banned everything that can be indexed.

    cman,

    What we are blocking is:


    Disallow: /admin/
    Disallow: /cart/
    Disallow: /date/
    Disallow: /hack/
    Disallow: /help/
    Disallow: /keyword/
    Disallow: /popular/
    Disallow: /search/
    Disallow: /test/
    Disallow: /VIP/
    Disallow: /vip/

    Everything else is allowed.

    - Greg
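
Anyone who wants to check which paths a rule list like the one above actually blocks can do so with Python's standard `urllib.robotparser`. This is a minimal sketch under stated assumptions: the `User-agent: *` line is assumed (the post lists only the `Disallow` lines), and the helper `is_crawlable` and the sample gallery path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The Disallow lines come from the post above; the "User-agent: *" line
# is an assumption added so the rules form a complete robots.txt record.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
Disallow: /date/
Disallow: /hack/
Disallow: /help/
Disallow: /keyword/
Disallow: /popular/
Disallow: /search/
Disallow: /test/
Disallow: /VIP/
Disallow: /vip/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # "/Travel/Yosemite/" is a made-up gallery path used for illustration.
    for path in ("/keyword/sunset", "/popular/", "/Travel/Yosemite/"):
        print(path, "->", "allowed" if is_crawlable(path) else "blocked")
```

Only paths that begin with one of the listed prefixes are blocked; gallery and photo pages elsewhere on the site remain crawlable.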
  • denisegoldberg Administrators Posts: 14,383 moderator
    edited February 10, 2011
    Twoofy wrote: »
    What we are blocking is:
    ....
    Disallow: /keyword/
    I suspect I may be missing something obvious, but I'm a bit confused by keyword being blocked. Doesn't that imply that the keywords I have placed on my images won't be seen in search? Or does that mean that the keyword pages themselves are blocked but that the keywords on my images are not?

    --- Denise
  • Twoofy Registered Users Posts: 171 Major grins
    edited February 11, 2011
    I suspect I may be missing something obvious, but I'm a bit confused by keyword being blocked. Doesn't that imply that the keywords I have placed on my images won't be seen in search? Or does that mean that the keyword pages themselves are blocked but that the keywords on my images are not?

    --- Denise

    Your keywords appear in the <meta> tags for the image, as well as on the page where the photo is displayed. The /keyword/ path is an alternative way to navigate your images, but it does not help with SEO. In fact, it creates confusion: many of those pages end up being redirects to other pages, or they send crawlers on a nearly endless expedition through a ton of "virtual pages". So, while it may be a nice way for a human user to browse through your content, it is not at all ideal for a web crawler to take that path.

    Here is the literal URL that we are blocking: http://www.denisegoldberg.com/keyword/

    As you can see there are thousands of links on that page and clicking any of them will take you to another page where additional keywords can be added or deleted in nearly endless combinations. That means the bots are spending all this time crawling these essentially spam links and not focusing on the primary photography pages.

    Hope this helps!

    - Greg
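
To get a feel for why combinable keyword pages become "nearly endless" for a crawler, here is a rough back-of-the-envelope sketch. The numbers are hypothetical (a site with 1,000 keywords), the function name `keyword_page_count` is made up, and it assumes that the order in which keywords are combined does not create a different page.

```python
from math import comb


def keyword_page_count(num_keywords, max_combined):
    """Count distinct keyword-combination pages.

    Counts every unordered combination of 1..max_combined keywords
    chosen from num_keywords (order is assumed not to matter).
    """
    return sum(comb(num_keywords, k) for k in range(1, max_combined + 1))


if __name__ == "__main__":
    # Hypothetical site with 1,000 distinct keywords.
    for depth in (1, 2, 3):
        print(depth, "keyword(s) combined:", keyword_page_count(1000, depth))
```

Under these assumptions, allowing just two keywords per page already gives 500,500 possible pages, and three gives 166,667,500, far more than the number of actual photos a crawler needs to see.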
  • cman Registered Users Posts: 75 Big grins
    edited February 11, 2011
    The topic title should be "Serious problems with site indexing", not "Got a sitemap question ..." (too simple and uninteresting). As it stands, you have artificially narrowed the scope of the existing indexing problem.
    ... Or does that mean that the keyword pages themselves are blocked but that the keywords on my images are not? ...

    Now all keywords on all pages are blocked.
  • denisegoldberg Administrators Posts: 14,383 moderator
    edited February 11, 2011
    Twoofy wrote: »
    Here is the literal URL that we are blocking: http://www.denisegoldberg.com/keyword/

    As you can see there are thousands of links on that page and clicking any of them will take you to another page where additional keywords can be added or deleted in nearly endless combinations....

    Hope this helps!
    Thanks Greg - that helps a lot. It may be silly, but I like to understand...

    --- Denise
  • Twoofy Registered Users Posts: 171 Major grins
    edited February 12, 2011
    Thanks Greg - that helps a lot. It may be silly, but I like to understand...

    --- Denise

    It would be incredibly hypocritical of me to use the word "silly" in describing anyone else's personality traits :)

    I hope this has helped you get a better understanding of the problems and what we are trying to fix.

    - Greg
  • OffTopic Registered Users Posts: 521 Major grins
    edited February 12, 2011
    Twoofy wrote: »
    OffTopic,

    To be honest, your posts here are some of my favorites - as an engineer you always provide very detailed information that I can actually investigate.

    Awww, thanks Greg!
    I do try...since I spent most of my life being the one who has to fix problems, I know how important it is to have accurate and specific info.
    Andy wrote: »
    Because I specifically told you and all customers not to be.
    :giggle We need a smiley sticking its tongue out! ...love for you too, Andy.

    I have a green check mark on my site map today, thank you! :D
  • djkraan Registered Users Posts: 45 Big grins
    edited February 20, 2011
    /sitemap-galleries.xml.gz
    /sitemap-index.xml
    /sitemap-base.xml.gz
    /sitemap-images.xml.gz

    All of them have green checkmarks now. Thank you, SmugMug, for your support with this. :)
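
For anyone wondering how the four files relate: a sitemap index is a small XML file that points to the individual sitemaps. The sketch below parses a hand-written sample index with Python's standard library. The sample content and the `www.example.com` host are assumptions for illustration (the thread does not show the real file), and `child_sitemaps` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hand-written sample; the real sitemap-index.xml may differ.
SAMPLE_INDEX = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.example.com/sitemap-base.xml.gz</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-galleries.xml.gz</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-images.xml.gz</loc></sitemap>
</sitemapindex>
"""


def child_sitemaps(index_xml):
    """Return the sitemap URLs listed in a sitemap index document."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)]


if __name__ == "__main__":
    for url in child_sitemaps(SAMPLE_INDEX):
        print(url)
```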
  • CFPhotography Registered Users Posts: 83 Big grins
    edited February 23, 2011
    So good news and bad news...First the good news:
    All my sitemaps are green!

    /sitemap-galleries.xml.gz
    /sitemap-index.xml
    /sitemap-base.xml.gz
    /sitemap-images.xml.gz

    The bad news: ever since you guys started this whole sitemap change a couple of weeks ago, my Google search impressions have dropped by about 80%! Not increased but dropped, and yes, I said by 80%!! This is not a good thing!
  • Luc De Jaeger Registered Users Posts: 139 Major grins
    edited February 24, 2011
    I'm no SmugMug employee, nor an affiliate or whatever... I'm just responding unsolicited to your concern because of my involvement in other e-commerce sites, e-marketing, etc.

    Most people don't understand the difference between sitemaps and SEO. We are all photographers, and not web site managers/analysts or programmers and thus quite ICT illiterate. Most of us can't know it all and will have to rely on other ICT experts which SmugMug should (and will) have.

    Sitemaps have nothing to do with SEO at all! Some time ago, Google changed its SE algorithms, resulting in NUMEROUS webmasters screaming out loud that Google had dropped their sites etc.

    We're NOT living in a static world. Millions of new web sites are spidered daily and algorithms have to be modified now and then to counter-act spam, make sure new web sites are incorporated in the search databases and reviewed on their content etc...

    The internet is not only a minefield or a battlezone but also a wave of ever changing and/or updated SE algorithms...

    Sitemaps have nothing to do with it at all. When the site management of one of my e-commerce web sites introduced sitemaps automatically, NUMEROUS people reported a drop in SE ranking and being dropped from the SE. They also thought (erroneously) that being dropped had to do with the sitemaps.

    Do know that every Search Engine (SE) changes its algorithms quite frequently (and at times very profoundly which Google has done again recently) to make sure all the daily new content is weighted and included in the databases so that the most relevant information gets on top (ok, that's the theory).

    While this is a fully automated process, failures DO happen!

    The golden rule that the site management of one of my e-commerce web sites propagates is to BE REAL and stick to the basic principles (SE is still about text so, do make sure you have captions, keywords etc.... below your photos) and your ranking might be in balance in some time again.

    Luc
  • denisegoldberg Administrators Posts: 14,383 moderator
    edited March 3, 2011
    /sitemap-images.xml.gz has flipped from green to red again on my site:
    General HTTP error: 404 not found
    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.
    I resubmitted the sitemap, no change.

    Andy's note above telling us to expect errors was a month ago - is this still expected? Or is this broken?

    --- Denise
  • Andy Registered Users Posts: 50,016 Major grins
    edited March 3, 2011
    /sitemap-images.xml.gz has flipped from green to red again on my site:
    I resubmitted the sitemap, no change.

    Andy's note above telling us to expect errors was a month ago - is this still expected? Or is this broken?

    --- Denise
    mine are all green, with fresh dates (yesterday and today). try again a bit later today?
  • denisegoldberg Administrators Posts: 14,383 moderator
    edited March 3, 2011
    Andy wrote: »
    mine are all green, with fresh dates (yesterday and today). try again a bit later today?
    The red started yesterday, and it's only on the images section. I resubmitted this morning, no change.

    I'll check again later and post an update.

    --- Denise
  • Hikin' Mike Registered Users Posts: 5,467 Major grins
    edited March 3, 2011
    Mine are all green as of this post, but I still have NO images (/sitemap-images.xml.gz) in the web index. :cry
  • denisegoldberg Administrators Posts: 14,383 moderator
    edited March 4, 2011
    Andy wrote: »
    mine are all green, with fresh dates (yesterday and today). try again a bit later today?
    Still showing red on /sitemap-images.xml.gz.

    As the previous poster indicated, this entry also shows with 0 URLs in web index, which I find quite odd.

    --- Denise
  • Andy Registered Users Posts: 50,016 Major grins
    edited March 4, 2011
    Greg will look, stay tuned.
  • denisegoldberg Administrators Posts: 14,383 moderator
    edited March 4, 2011
    Andy wrote: »
    Greg will look, stay tuned.
    Thanks Andy.

    --- Denise