Got a sitemap question? Look here.

Comments

  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 4, 2011
    Still showing red on /sitemap-images.xml.gz.

    As the previous poster indicated, this entry also shows with 0 URLs in web index, which I find quite odd.

    --- Denise

    When you click it what is the error and date that it is giving you?

    Thanks!

    - Greg
  • Options
    denisegoldbergdenisegoldberg Administrators Posts: 14,240 moderator
    edited March 4, 2011
    Twoofy wrote: »
    When you click it what is the error and date that it is giving you?
    General HTTP error: 404 not found
    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.

    HTTP Error: 404
    Problem detected on: Mar 3, 2011

    I did resubmit, but the error remains.

    --- Denise
  • Options
    Hikin' MikeHikin' Mike Registered Users Posts: 5,455 Major grins
    edited March 4, 2011
    Same here Mike, all green but no image urls listed for (/sitemap-images.xml.gz).
    Does this mean that google is not seeing our images?
    I still notice none from my site show up on an image search unless they were internally linked.
    Many show from the forums where I link my images.
    It seems like this would be a priority for an image hosting site!



    I'm going to assume that Smugmug did their job. I see 210 images that were submitted to Google. Now I have to wait on Google to actually index them.
  • Options
    Hikin' MikeHikin' Mike Registered Users Posts: 5,455 Major grins
    edited March 4, 2011
    It's been a few months for me. Maybe it takes longer to index images than regular pages? Just guessing.

    Any comment from Smugmug?
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 4, 2011
    General HTTP error: 404 not found
    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.

    HTTP Error: 404
    Problem detected on: Mar 3, 2011

    I did resubmit, but the error remains.

    --- Denise

    Hmm.. What URL is it? (Sorry, should have asked that first.) I just tried all of the sitemap files specified in your sitemap-index and they are fetching just fine.

    Also, just to be sure: you only added sitemap-index to Google, right? That is the only one that should be added, Google will find the other ones from there.

    - Greg
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 4, 2011
    I see 302 images submitted, but the zero on the web index side has been there since day one for me.

    I am curious as to how long it should take google to place the submitted images into the web index.
    We are talking months, at least 4 for me.

    NOTE:
    I just went and looked at the old robot.txt thread about sitemaps and apparently everyone has a zero listed in the web index column for the images sitemap. Can anyone please explain why?

    One of my websites that has been around for about 4 years has this:

    Submitted URLs 13,190,414
    0 URLs in web index

    .. I don't think those numbers are very accurate, to say the least.

    When I do an actual Google search for your domain it shows somewhere around 4,200 pages. http://www.google.com/search?client=safari&rls=en&q=site:shawnkrausphoto.com&ie=UTF-8&oe=UTF-8

    - Greg
  • Options
    denisegoldbergdenisegoldberg Administrators Posts: 14,240 moderator
    edited March 5, 2011
    Twoofy wrote: »
    Hmm.. What URL is it? (Sorry, should have asked that first.) I just tried all of the sitemap files specified in your sitemap-index and they are fetching just fine.

    Also, just to be sure: you only added sitemap-index to Google, right? That is the only one that should be added, Google will find the other ones from there.

    - Greg
    Yes, I only loaded sitemap-index. The other entries were added by Google. All entries were showing green for weeks, then the "images" entry changed from a green checkmark to a red x this week.

    The url is http://www.denisegoldberg.com.
    The entry showing an error is /sitemap-images.xml.gz.

    --- Denise
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 5, 2011
    I realize when I do a site search I end up with 4210 domain hits, what bothers me is that when I then click the image search on the left only a handful of my images show up and generally they are from a forum or blog posts, not SmugMug.

    I assume that is because the image urls are not getting placed in the web index.

    It appears that they are submitted, but for some reason it looks as though Google is just ignoring them.

    When I was on Pbase, my images would show up in a Google image search within the week, and not just linked images, actual gallery images. That is why I am so frustrated with the search results for Smugmug images.

    My apologies, I misunderstood what you meant. I definitely know that this is a source of great pain. I'm in the process of wrapping up a major project, and as soon as I'm done with this I'm turning my attention to feeds and sitemaps. Although why they are not showing up in Google Images is not sitemap related, it's still part of the broader SEO-related subject.

    Hopefully it won't come to this, but let me ask you: if you had to trade the number of results indexed in Google's primary index for images, would that be preferable? I haven't done any research to see if this is even what might be going on, but I'm curious what your thoughts would be on that.

    - Greg
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 5, 2011
    Yes, I only loaded sitemap-index. The other entries were added by Google. All entries were showing green for weeks, then the "images" entry changed from a green checkmark to a red x this week.

    The url is http://www.denisegoldberg.com.
    The entry showing an error is /sitemap-images.xml.gz.

    --- Denise

    Ok, I see it now. Let me look into it and I'll get back to you.

    - Greg
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 5, 2011
    First off, let me thank you for all the attention you have given sitemaps so far.
    I am still trying to gain a better understanding of them myself.
    I have always assumed that a sitemap just tells Google what to look at on a site and that Google then uses that information to publish the results of what it finds to its listings.
    It is obviously a lot more complex than that.

    Now to answer your question. (A tough but interesting question, by the way.)

    I honestly would like the best of both worlds.
    If I ultimately had to choose, I would choose the primary index over the image index. Without a direct link or the primary index it would be almost impossible for people to find me, and therefore I would choose the primary index.

    I assume you have far more insight on this matter, and that might be why you pose that particular question.

    Thanks so much for the answer and I am almost sure I will have more questions in the future, but I am content for the moment knowing SmugMug intends to look into it more!

    Many Thanks !

    It is truly my pleasure to help out where I can - only wish there were more hours in the day. But knowing that it is appreciated helps a lot and is very motivating, let me tell ya' :)

    Sitemaps are easy to get confused about, but all they really do is provide a set of links for search engines to find pages. If a site's URLs are organized the right way a search engine can easily crawl through the pages without them. I would definitely classify SmugMug as one of those sites that are "easy to crawl" - they were around before the sitemap protocol was created in 2005. So they had to do things the hard way and make sure that their URLs were efficient and easy for robots to crawl and index.

    With large sites, like SmugMug, sitemaps offer another advantage in that we can include in the sitemap details about when a specific page has been modified, deleted, or added. The benefit to this is that a properly designed crawler only has to re-crawl those pages and can just "ping" the other pages to check if they are still active.
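
    To make the modification-date point concrete, here is a minimal sketch of how a crawler could use the `<lastmod>` field of a sitemap to decide what to re-fetch. The sample URLs and dates and the `urls_to_recrawl` helper are made up for illustration; this is not a description of how Google or SmugMug actually do it.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content; the URLs and dates are invented.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/gallery/1</loc><lastmod>2011-03-20</lastmod></url>
  <url><loc>http://www.example.com/gallery/2</loc><lastmod>2010-11-23</lastmod></url>
  <url><loc>http://www.example.com/gallery/3</loc></url>
</urlset>"""


def urls_to_recrawl(sitemap_xml, last_crawl):
    """Return URLs modified on or after last_crawl, or with no lastmod at all."""
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        # Without a lastmod we cannot tell, so re-crawl to be safe.
        if lastmod is None or date.fromisoformat(lastmod[:10]) >= last_crawl:
            stale.append(loc)
    return stale


print(urls_to_recrawl(SAMPLE_SITEMAP, date(2011, 3, 1)))
```

    With a last crawl date of March 1, 2011, only the first and third sample URLs are selected; the second one has not changed since November and could just be pinged.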

    At the end of it though, for SmugMug (especially when using custom domains like you are doing), sitemaps are advantageous from the perspective of being able to give robots better hints about how to more efficiently crawl our pages. But they do not impact SEO or when/how/if a search engine will index a page. Those decisions are made the same regardless of how the crawler came across the page.

    For other sites that do not have URLs that are so easy to navigate and crawl, sitemaps can offer some tremendous advantages because, believe it or not, some people actually design websites in a way that there are pages that cannot be navigated to without going through a form or something like that. For those sites the sitemaps can provide a link to those pages so that robots can find the content. This is just a non-issue for SmugMug sites.

    .. Add on top of this that Webmaster Tools is so chronically out of date with the data it presents, and it's amazing to me that there isn't even more confusion...

    As for my question, I'm hoping that no trade-off will have to be made here of course. I asked the question because I was very curious about which one was more important so that we are working on improving things in the right prioritized order. Good news is, it seems like we are all in alignment on this.

    - Greg
  • Options
    juliankjuliank Registered Users Posts: 43 Big grins
    edited March 6, 2011
    I am just about to add the sitemap (sitemap-index.xml - I hope I have this correct) and I need to go through the ownership verification steps. Which one is the recommended method? Is it the first one, add DNS record, or other like adding the meta tag on the homepage?

    ---
    Update: added the head meta tag using the code from Google.
  • Options
    juliankjuliank Registered Users Posts: 43 Big grins
    edited March 6, 2011
    Do we need to add all of these? I think sitemap-index.xml covers everything. Right?

    /sitemap-galleries.xml.gz
    /sitemap-index.xml
    /sitemap-base.xml.gz
    /sitemap-images.xml.gz
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 6, 2011
    juliank wrote: »
    Do we need to add all of these? I think sitemap-index.xml covers everything. Right?

    /sitemap-galleries.xml.gz
    /sitemap-index.xml
    /sitemap-base.xml.gz
    /sitemap-images.xml.gz

    Just add the sitemap-index. As for which verification method to use, any of the available ones will work. I think most people use the meta tag method, but I could be wrong about that.
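
    The reason only the index is needed is that it simply lists the other files. As a rough sketch, assuming an index in the standard sitemaps.org format (the domain is a placeholder and `child_sitemaps` is a hypothetical helper), a crawler can discover the rest like this:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical index; the file names mirror the ones listed in this thread,
# but the domain is a placeholder.
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.example.com/sitemap-base.xml.gz</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-galleries.xml.gz</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-images.xml.gz</loc></sitemap>
</sitemapindex>"""


def child_sitemaps(index_xml):
    """List the sitemap files that a sitemap index points to."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]


for url in child_sitemaps(SAMPLE_INDEX):
    print(url)
```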

    - Greg
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 16, 2011
    Yes, I only loaded sitemap-index. The other entries were added by Google. All entries were showing green for weeks, then the "images" entry changed from a green checkmark to a red x this week.

    The url is http://www.denisegoldberg.com.
    The entry showing an error is /sitemap-images.xml.gz.

    --- Denise

    Hi Denise,

    I've been monitoring this over the past week or so and I *think* what happened is that we changed the number of files in sitemap-images. In your case we went from a single sitemap-images.xml.gz to multiple ones:

    http://www.denisegoldberg.com/sitemap-images-0.xml.gz
    http://www.denisegoldberg.com/sitemap-images-1.xml.gz
    http://www.denisegoldberg.com/sitemap-images-2.xml.gz

    If I'm right, then what happened is temporary and you should soon (if you aren't already) be showing 3 files instead of 1 and these errors should disappear.
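
    For anyone wondering what the split looks like mechanically, here is a small sketch. The sitemaps.org protocol caps a single sitemap file at 50,000 URLs; the threshold SmugMug actually uses is not stated here, so `max_per_file`, the sample URLs, and the `split_into_files` helper are assumptions for illustration only.

```python
# The sitemaps.org protocol allows at most 50,000 URLs per sitemap file.
PROTOCOL_MAX_URLS = 50_000


def split_into_files(urls, max_per_file=PROTOCOL_MAX_URLS, stem="sitemap-images"):
    """Map numbered sitemap file names to the chunk of URLs each would hold."""
    if max_per_file < 1:
        raise ValueError("max_per_file must be at least 1")
    files = {}
    for start in range(0, len(urls), max_per_file):
        name = f"{stem}-{start // max_per_file}.xml.gz"
        files[name] = urls[start:start + max_per_file]
    return files


# Hypothetical data: 250 image URLs at 100 per file gives three files.
urls = [f"http://www.example.com/photo/{n}.jpg" for n in range(250)]
for name, chunk in split_into_files(urls, max_per_file=100).items():
    print(name, len(chunk))
```

    Once the single file is replaced by numbered ones, the old un-numbered name no longer exists, which would be consistent with a 404 on /sitemap-images.xml.gz.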

    Does this seem consistent with what you are seeing in webmaster tools now?

    - Greg
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 16, 2011
    I notice that my webmaster tools have not updated my site performance since February 20th.

    I found the following statement from the webmaster site:

    "Page Speed suggestions as shown in Site Performance are based on the version of your page as seen by Googlebot, Google's crawler. For various reasons—for example, if your robots.txt file blocks Googlebot from crawling CSS or other embedded content— these may differ slightly from the suggestions you get when you run the Page Speed extension for Firefox."
    -Google Webmaster Tools

    I was just wondering if the sitemap is blocking the googlebot from crawling the CSS and preventing the site performance stats from updating or if this is totally unrelated?

    Thanks in advance,
    --Shawn

    I honestly do not have an exact answer to this one. As far as I know we've not changed how we block css files from being crawled, certainly nothing around Feb 20th that comes to mind. Let's give it a few more weeks, it may just be slow to update the stats.
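
    One thing that can be checked locally: blocking is done by robots.txt rules, not by the sitemap itself. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content is hypothetical (the real rules are not shown in this thread), and `is_crawlable` is just a helper name chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the real rules are not shown in this thread.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /css/
Sitemap: http://www.example.com/sitemap-index.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS.splitlines())


def is_crawlable(url, agent="Googlebot"):
    """True if the parsed robots.txt lets the given user agent fetch url."""
    return parser.can_fetch(agent, url)


# The Disallow line blocks the stylesheet; the Sitemap line blocks nothing.
print(is_crawlable("http://www.example.com/css/site.css"))
print(is_crawlable("http://www.example.com/gallery/1"))
```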

    Sorry I don't have a better answer for you on this one at the moment.

    - Greg
  • Options
    denisegoldbergdenisegoldberg Administrators Posts: 14,240 moderator
    edited March 16, 2011
    Twoofy wrote: »
    Hi Denise,

    I've been monitoring this over the past week or so and I *think* what happened is that we changed the number of files in sitemap-images. In your case we went from a single sitemap-images.xml.gz to multiple ones:

    http://www.denisegoldberg.com/sitemap-images-0.xml.gz
    http://www.denisegoldberg.com/sitemap-images-1.xml.gz
    http://www.denisegoldberg.com/sitemap-images-2.xml.gz

    If I'm right, then what happened is temporary and you should soon (if you aren't already) be showing 3 files instead of 1 and these errors should disappear.

    Does this seem consistent with what you are seeing in webmaster tools now?

    - Greg
    At the top level I'm still seeing /sitemap-images.xml.gz with a red x next to it.
    If I drill down on the top-level entry /sitemap-index.xml I do see the three entries that you list above.

    So are you saying that the "bad" entry at the top level is essentially replaced with the 3 entries on the lower level, and that you expect the red-x'ed entry to go away?

    I'll keep an eye on it to keep watching for a change.

    Thanks.

    --- Denise
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 17, 2011
    At the top level I'm still seeing /sitemap-images.xml.gz with a red x next to it.
    If I drill down on the top-level entry /sitemap-index.xml I do see the three entries that you list above.

    So are you saying that the "bad" entry at the top level is essentially replaced with the 3 entries on the lower level, and that you expect the red-x'ed entry to go away?

    I'll keep an eye on it to keep watching for a change.

    Thanks.

    --- Denise

    Yep, that's exactly what is happening, and I would expect the red X to disappear because that 1 file is now broken up into 3 smaller ones.

    - Greg
  • Options
    EntropicTendenciesEntropicTendencies Registered Users Posts: 84 Big grins
    edited March 20, 2011
    what should we expect webmaster tools to report for the sitemaps? I've got 3279 urls submitted, 220 in web index which just doesn't appear optimal...

    Thanks,

    Barrie
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 21, 2011
    what should we expect webmaster tools to report for the sitemaps? I've got 3279 urls submitted, 220 in web index which just doesn't appear optimal...

    Thanks,

    Barrie

    Hi Barrie,

    I've asked that question to Google before and never really got an answer. I can tell you though that I've never seen that figure reflect an accurate number. In your case, your site has 3,300 pages indexed: http://www.google.com/search?client=safari&rls=en&q=site:entropictendencies.org&ie=UTF-8&oe=UTF-8

    I've got other sites with 200,000+ pages and Webmaster Tools says there are a few thousand (and has for 3 years).

    - Greg
  • Options
    eyeforimageseyeforimages Registered Users Posts: 29 Big grins
    edited March 21, 2011
    Improving Google Indexing - Sitemap Issues and Lightroom Publishing
    Hi All,

    I recently started giving my images much better titles in the hope that google indexing is going to correctly index them as google images seems to completely ignore the majority of my work. I publish my images using Lightroom and went through renaming files from a date format to equal the title I have given the file. I resynched the catalog to smugmug, all appeared good.

    On checking webmaster tools this morning I got a whole load of 301 redirect errors from the crawl. Will the sitemap correct itself or am I going to have to drop the gallery and republish all of the images?

    URLs not followed
    When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL.
    HTTP Error: 301
    URL:
    http://www.eyeforimages.com/Portfolio/Scapes/20101107-111049-RELEASE/1220254564_AY7Vc-S.jpg
    Problem detected on: Mar 20, 2011
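
    For reference, this kind of warning can be reproduced outside Webmaster Tools. The sketch below sends a HEAD request per URL without following redirects and reports where each redirecting URL points. `head_status` and `find_redirects` are hypothetical helper names, and the demo at the bottom uses a fake fetcher with placeholder URLs so it runs offline.

```python
import http.client
from urllib.parse import urlsplit, urljoin


def head_status(url):
    """Send one HEAD request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def find_redirects(urls, fetch=head_status):
    """Map each redirecting sitemap URL to the target it redirects to."""
    redirects = {}
    for url in urls:
        status, location = fetch(url)
        if status in (301, 302, 303, 307, 308) and location:
            # Location may be relative, so resolve it against the original URL.
            redirects[url] = urljoin(url, location)
    return redirects


# Offline demo with a fake fetcher; the URLs are placeholders.
def fake_fetch(url):
    if url.endswith("/old.jpg"):
        return 301, "/new.jpg"
    return 200, None


sample = ["http://www.example.com/old.jpg", "http://www.example.com/ok.jpg"]
print(find_redirects(sample, fetch=fake_fetch))
```

    Any URL that shows up in the result is one the sitemap should list by its redirect target instead.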

    Thanks for Your help.
    Paul Stoakes
    Eye For Images
    Site: http://www.eyeforimages.com
    Blog: http://blog.eyeforimages.com
  • Options
    eyeforimageseyeforimages Registered Users Posts: 29 Big grins
    edited March 21, 2011
    I've done some more digging, and after checking the sitemap for images I've got a bit of a tag-on question.

    I dropped an image from the gallery and added it again (Sunset on the Gower.jpeg), left it for 40 minutes (random amount of time). When I check the sitemap for images I get the following data returned:

    http://www.eyeforimages.com/Portfolio/Scapes/20060710-211011-Sunset-on-The/1101801011_63K7F-S-14.jpg
    2011-03-20
    image/jpeg
    EN

    http://www.eyeforimages.com/Portfolio/Scapes/14776548_Sjqe9/7/1101801011_63K7F
    yes
    eye%20for%20images
    eye%20for%20images%20photography
    paul%20stoakes
    sunset
    the%20gower
    wales
    catflickr
    catportfolio
    catscapes
    eyeforimages
    misty
    tree
    www.eyeforimages.com
    Sunset on The Gower - Eye For Images Photography - Paul Stoakes
    2010-11-23
    400x267




    I guess the data loc will be updated in time, but what amount of time is it?

    Also, the naming scheme for images, one of the things documented to improve the search ability of your images is to always use a file name with words instead of a random selection of keywords. Even though I set a filename, the actual name is used in the folder structure and not for the filename. How does google indexing treat this data as it is the filename that is important to google?
    Paul Stoakes
    Eye For Images
    Site: http://www.eyeforimages.com
    Blog: http://blog.eyeforimages.com
  • Options
    eyeforimageseyeforimages Registered Users Posts: 29 Big grins
    edited March 21, 2011
    So here's another question... just to add to the existing ones... :)

    If ever any of my images actually get indexed... Is it possible to customise the sitemap so that it uses the L or XL versions of the images? From what I have read, google actually favours larger images and knowing the way I search images, larger images always appeal to me in google search.

    In addition, the landing page link, would it not be best to link through to the image in the light box rather than through to a page where numerous images appear? Especially in a large gallery it makes more sense to show the image immediately rather than have the user trawl through images to find the one they were coming in for?

    So, in the example above, rather than going to:
    http://www.eyeforimages.com/Portfolio/Scapes/14776548_Sjqe9/7/1101801011_63K7F

    go to this instead?: http://www.eyeforimages.com/Portfolio/Scapes/14776548_Sjqe9/7/1101801011_63K7F#1101801011_63K7F-A-LB
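
    To spell out what I mean, the lightbox link in my example looks like the landing page URL with the last path segment repeated as a fragment plus "-A-LB". This is only inferred from that one URL pair, so treat the pattern and the `lightbox_url` helper as assumptions, not a documented SmugMug scheme:

```python
from urllib.parse import urlsplit, urlunsplit


def lightbox_url(landing_url):
    """Append a '#<image key>-A-LB' fragment, the pattern seen in the example URLs."""
    parts = urlsplit(landing_url)
    # The image key is taken to be the last path segment (an assumption).
    image_key = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return urlunsplit(parts._replace(fragment=f"{image_key}-A-LB"))


print(lightbox_url(
    "http://www.eyeforimages.com/Portfolio/Scapes/14776548_Sjqe9/7/1101801011_63K7F"
))
```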
    Paul Stoakes
    Eye For Images
    Site: http://www.eyeforimages.com
    Blog: http://blog.eyeforimages.com
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited March 21, 2011
    So here's another question... just to add to the existing ones... :)

    If ever any of my images actually get indexed... Is it possible to customise the sitemap so that it uses the L or XL versions of the images? From what I have read, google actually favours larger images and knowing the way I search images, larger images always appeal to me in google search.

    In addition, the landing page link, would it not be best to link through to the image in the light box rather than through to a page where numerous images appear? Especially in a large gallery it makes more sense to show the image immediately rather than have the user trawl through images to find the one they were coming in for?

    So, in the example above, rather than going to:
    http://www.eyeforimages.com/Portfolio/Scapes/14776548_Sjqe9/7/1101801011_63K7F

    go to this instead?: http://www.eyeforimages.com/Portfolio/Scapes/14776548_Sjqe9/7/1101801011_63K7F#1101801011_63K7F-A-LB

    Did you rename this gallery or something? The redirects should go away and we will start pointing to the original image on the next update. Speaking of which, updates to sitemap-images are expensive and there are a LOT of them to do in a given day, so they get queued up. How long it takes can really vary quite a bit.

    The system does not currently allow us to set the uploaded filename in the URL, and if it did, it would probably end up being a trade-off between that and URL size. I'd be inclined to think that shorter URLs with more meta-data would index better than the corresponding longer ones. But it's a good suggestion to look at, and when we start doing more work in this code I'll make sure it's something we take a closer look at - thanks!

    I agree with you on image sizes, in fact I'm inclined to want to include all of the sizes, because in image search a user can select which size images they want to see.

    Hopefully I answered your questions, though I will admit that I'm a bit pressed for time at the moment. Let me know if I need to follow-up and clarify anything - and my apologies ahead of time if I missed anything. :)

    - Greg
  • Options
    eyeforimageseyeforimages Registered Users Posts: 29 Big grins
    edited March 22, 2011
    Cheers Jon, really appreciate you getting back to me on this. :)

    You did miss one question, around the landing_page_loc: the current location only works for some SmugMug templates where there are index thumbs and an image to view. When you have a sheet of thumbs for the page like I do, you end up being dumped on a page which could have a stack of images on it. Having the light box pop up with the image would be more impressive for people hopping onto your site, and it's then independent of the template you are using (I think).

    A general show of hands if anyone is viewing this thread? I have a lot of pages indexed from the site, but those are pages and don't get hits on the images side, but none of the images from my images sitemap are indexed, I have images from my blogger indexed, images from flickr, but as of yet, nothing from Smugmug...
    On google webmaster tools, what is the percentage of images that are in the index for you?
    If there was a recipe that you followed to get your images in the index. What was it?
    Paul Stoakes
    Eye For Images
    Site: http://www.eyeforimages.com
    Blog: http://blog.eyeforimages.com
  • Options
    Bob_ABob_A Registered Users Posts: 92 Big grins
    edited March 22, 2011
    Same boat here,
    I have zero or close to zero images indexed from my Smug site and almost all from my blog.
    I have yet to find the magic formula!
    I like having my site found in a google search, but the thing is ... my site is for hosting IMAGES.
    If our images are not found as well and indexed by Google, it is a big disadvantage to sites like Flickr or Pbase that have both.

    I know there is a lot that must be involved and I have read through a lot of the SEO stuff.
    I am to the point where I have just given up with the images being indexed here at SmugMug.

    Andy keeps writing that if we do all the stuff in the SEO tips thread of the forum, that we will see our SEO improve.

    I worked rather hard at implementing all the things mentioned in the thread and still my images do not appear.

    I have just given up!

    It is in the hands of SmugMug and I will wait patiently for things to get better as far as images being indexed.

    The sitemaps work and I appreciate that, I just need to figure out how to get my images indexed and I will be happy!

    Same for me. Sitemap-images shows 1195 urls submitted and zero indexed for my site. I have a few of my images show up under Google images, only because they've been picked up by Fwix.

    Kinda strange that it's so hard to get images indexed when they appear on an image sharing site! :D
  • Options
    BaldyBaldy Registered Users, Super Moderators Posts: 2,853 moderator
    edited April 1, 2011
    Apologies if I missed this, but it appears we do something that's non-obvious to most people. It sounds like we need to do a much better job of making it known, or we need to change it.

    And it is that when you turn on right click protection, we don't let Google index your images.

    The reason is it's a security back door. Google will index your image, but when people view them on Google's image search results, they'd be able to right-click and download from there.

    Another thing, which we're changing, is how the URL is formed without the original-image-name.jpg at the end. That change is in test right now and should help (the change will not be applied to hide owner galleries).
  • Options
    jfriendjfriend Registered Users Posts: 8,097 Major grins
    edited April 1, 2011
    Baldy wrote: »
    Apologies if I missed this, but it appears we do something that's non-obvious to most people. It sounds like we need to do a much better job of making it known, or we need to change it..
    That's an interesting tradeoff. I'm quite sure that this is not generally known since I didn't even know it and I hang around here as much as anyone.

    My first thought is to wonder if shutting out Google is really the right choice as a silent default when right-click protection is turned on. Right-click protection is such a weak protection in the first place, yet having no Google image search results can really mess with a business. And then there is nothing in the UI that explains this to you when you turn right-click protection on. It doesn't sound like most people who turn on right-click protection are actually making an informed choice that's best for their business. It seems like it would actually be better if right-click protection and Google indexing were redone in the UI to be connected/related such that there's no way someone could turn on right-click protection and not get the result they intended for image search. Ideally, they could also control what happens with Google image search themselves (rather than Smugmug forcing it on them) and understand the consequences of what they were doing.
    --John
  • Options
    denisegoldbergdenisegoldberg Administrators Posts: 14,240 moderator
    edited April 1, 2011
    Baldy wrote: »
    And it is that when you turn on right click protection, we don't let Google index your images.
    I didn't have a clue that right-click protection would turn off the ability for Google to index the images. It's weak protection anyway, but I like the reminder popping up.

    I'm well aware that the images can be grabbed - and since they can be grabbed with right click protection on or off I wish you wouldn't use this as an indication that the images shouldn't be indexed. While images can be easily saved from a Google search, shouldn't it be my choice to leave the mild warning on a right-click save from smug but not worry about the ability for images to be saved when I choose to make my images available elsewhere, whether that elsewhere is on my blog or as the result of a google search?

    --- Denise
  • Options
    AllenAllen Registered Users Posts: 10,011 Major grins
    edited April 1, 2011
    Baldy wrote: »
    ..
    And it is that when you turn on right click protection, we don't let Google index your images.
    ..).
    Holy $&^# Batman, everyone has been complaining that none of their
    Smugmug photos show up in Google image search, only from blogs etc. So
    this is the problem, right-click protection on?
    Al - Just a volunteer here having fun
    My Website index | My Blog
  • Options
    TwoofyTwoofy Registered Users Posts: 171 Major grins
    edited April 3, 2011
    I would not say it is "the" issue, but if right-click protection is turned on there is no way that Google will be able to grab your images. I was surprised to discover this too, to be honest :)


    - Greg