Robots.txt blocking 90% of Google Indexing

Coffeehound Registered Users Posts: 3 Beginner grinner
edited April 13, 2014 in SmugMug Support
Without any intervention on my part, Google has deleted about 90 percent of my site's pages from its index. This appears to coincide with a sudden change in my robots.txt file, which declared most of our pages as blocked URLs. The robots.txt file seems to be back to normal, but we continue to lose pages. I've seen similar reports, but no indication of whether the cause has been figured out or when it will be fixed. Please let me know what's happening.
Thanks

Comments

  • Teach Registered Users Posts: 321 Major grins
    edited March 27, 2014
    We are looking into this issue with robots.txt and Google indexing and do not have a resolution for it. Thank you for your continued patience.
    Heather
    SmugMug Support Hero
  • tomoscott Registered Users Posts: 92 Big grins
    edited March 27, 2014
    Teach wrote: »
    We are looking into this issue with robots.txt and Google indexing and do not have a resolution for it. Thank you for your continued patience.


    I went to a free sitemap generator site - http://www.freesitemapgenerator.com/quick-sitemap.html - just to see what would happen. I have over 15 galleries on my site, a blog, and other pages. Here is the sitemap generated by that site. It found my home page, plus one other page. That means every other page on my site isn't being indexed, right?

    I view this as an extremely serious problem. I'm glad it appears that SM has 3 engineers on it. I hope you guys resolve this quickly!
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Google Site Map File Generated by http://www.freesitemapgenerator.com/ at Thu, 27 Mar 2014 20:56:54 +0100 -->
    <urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
      <url>
        <loc>http://www.tomoscott.com/</loc>
        <lastmod>2014-03-27T20:56:54+00:00</lastmod>
      </url>
      <url>
        <loc>http://www.tomoscott.com/Galleries/Weeping-Walls-of-Torrey-Pines</loc>
        <lastmod>2014-03-27T20:56:54+00:00</lastmod>
      </url>
    </urlset>
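
For anyone who wants to check what a generated sitemap actually contains, here is a minimal Python sketch that lists the `<loc>` entries in a sitemap file. It uses the sitemap pasted above as sample input; the function name `list_sitemap_urls` is made up for illustration. Keep in mind that a sitemap only lists URLs a tool discovered. It does not by itself show what Google has or has not indexed.

```python
import xml.etree.ElementTree as ET

# Sample input: the sitemap pasted in the post above.
SITEMAP_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.tomoscott.com/</loc>
    <lastmod>2014-03-27T20:56:54+00:00</lastmod>
  </url>
  <url>
    <loc>http://www.tomoscott.com/Galleries/Weeping-Walls-of-Torrey-Pines</loc>
    <lastmod>2014-03-27T20:56:54+00:00</lastmod>
  </url>
</urlset>
"""


def list_sitemap_urls(xml_bytes):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_bytes)
    # ElementTree writes namespaced tags as "{namespace}name", so reuse
    # whatever namespace the root element declares (if any).
    prefix = ""
    if root.tag.startswith("{"):
        prefix = root.tag[: root.tag.index("}") + 1]
    return [loc.text.strip() for loc in root.iter(prefix + "loc") if loc.text]


if __name__ == "__main__":
    urls = list_sitemap_urls(SITEMAP_XML)
    print(len(urls), "URLs listed")
    for url in urls:
        print(url)
```

Reading the namespace from the root element, rather than hard-coding it, lets the same sketch work on sitemaps that declare a different schema URL.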
    
    
  • Hikin' Mike Registered Users Posts: 5,467 Major grins
    edited March 27, 2014
    tomoscott wrote: »
    I went to a free sitemap generator site - http://www.freesitemapgenerator.com/quick-sitemap.html - just to see what would happen. I have over 15 galleries on my site, a blog, and other pages. Here is the sitemap generated by that site. It found my home page, plus one other page. That means every other page on my site isn't being indexed, right?

    I view this as an extremely serious problem. I'm glad it appears that SM has 3 engineers on it. I hope you guys resolve this quickly!

    Did you upload your own sitemap? If so how?
  • tomoscott Registered Users Posts: 92 Big grins
    edited March 27, 2014
    Did you upload your own sitemap? If so how?

    That site doesn't require that you upload your own. It simply crawls your home page and creates a sitemap from that.
  • Hikin' Mike Registered Users Posts: 5,467 Major grins
    edited March 27, 2014
    tomoscott wrote: »
    That site doesn't require that you upload your own. It simply crawls your home page and creates a sitemap from that.

    I understand that it crawls your home page and creates a sitemap. I thought you actually uploaded that sitemap.

    Got me thinking, though. I can't FTP my own sitemap using FileZilla Client, but I can upload a sitemap via my cPanel.
  • Hikin' Mike Registered Users Posts: 5,467 Major grins
    edited March 27, 2014
    Just to add, I didn't do that, though. I'm just going to see what happens. Besides, my WordPress site does better as far as indexing goes anyway.
  • Coffeehound Registered Users Posts: 3 Beginner grinner
    edited March 27, 2014
    The robots.txt started blocking SmugMug customers' pages from being indexed the last week of June? Sitewide? And no one informed us? And it still isn't resolved... 9 months later?
    ...I'm speechless
  • photosbygerry Registered Users Posts: 36 Big grins
    edited March 27, 2014
    The freesitemapgenerator site states: "Note: this quick tool only uses two of your website's pages." That seems to explain why tomoscott's full site was not mapped. To get the full site, you need to have an account there.
    Gerry

    photosbygerry.smugmug.com
  • tomoscott Registered Users Posts: 92 Big grins
    edited March 27, 2014
    The freesitemapgenerator site states: "Note: this quick tool only uses two of your website's pages." That seems to explain why tomoscott's full site was not mapped. To get the full site, you need to have an account there.

    Right you are. I found another one here -- http://www.web-site-map.com/xml_sitemap.php -- and it created a sitemap with 35 pages, which is what it should have.
  • Coffeehound Registered Users Posts: 3 Beginner grinner
    edited March 27, 2014
    The sitemaps issue seemed to resolve in October (the gold line in my graph). Unfortunately, we are still losing pages.
  • AperturePlus Registered Users Posts: 374 Major grins
    edited March 28, 2014
    @Coffeehound: I noticed a weird discrepancy in my stats and put up a post this morning: http://www.dgrin.com/showthread.php?t=246165.

    Could this explain my issue do you think?
  • Star Path Images Registered Users Posts: 14 Big grins
    edited April 3, 2014
    The robots.txt started blocking SmugMug customers' pages from being indexed the last week of June? Sitewide? And no one informed us? And it still isn't resolved... 9 months later?
    ...I'm speechless

    I have to agree. This is simply not acceptable. I can tell you firsthand that this is costing me business. Hello, SmugMug?
  • rainforest1155 Registered Users Posts: 4,566 Major grins
    edited April 3, 2014
    There's an extensive thread on SEO issues along with replies from Baldy and our engineers. Check out the most recent post from giberti for the current details.

    Tomoscott, I don't see any issue on your site. It has a very current sitemap with lots of details.

    Coffeehound, the graph you included in your first post shows almost zero robots.txt issues for many months. While there have been recent indexing issues, they weren't related to robots.txt. Please read through the thread I linked to above for details and for what we've done to address the matter. In fact, first reports on that thread suggest that those measures have already taken effect for some.

    Star Path Images, what you quoted from Coffeehound is not actually the case. For details, please see the post I linked to above (and if you want more, you can read the whole thread as well).
    Sebastian
    SmugMug Support Hero
  • ablichter Registered Users Posts: 294 Major grins
    edited April 13, 2014
    You guys should check out the native sitemaps generated by SmugMug for your sites.

    Other sitemap generators might be blocked by robots.txt unless they identify themselves as "googlebot" or another allowed user agent.
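
To see how a robots.txt file can treat crawlers differently by user agent, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The rules in `EXAMPLE_ROBOTS_TXT` are hypothetical and are not SmugMug's actual robots.txt; the agent name `ExampleSitemapBot`, the example.com URL, and the helper `is_allowed` are likewise made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only: Googlebot may crawl everything,
# every other user agent is blocked from the whole site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/Galleries/Some-Gallery"
    for agent in ("Googlebot", "ExampleSitemapBot"):
        print(agent, "allowed:", is_allowed(EXAMPLE_ROBOTS_TXT, agent, page))
```

To test a live site instead of a pasted string, `RobotFileParser` also offers `set_url()` and `read()`, which download the robots.txt before you call `can_fetch()`.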

    Append
    /sitemap-base.xml
    /sitemap-galleryimages.xml
    to your URL, or download them zipped:
    /sitemap-base.xml.gz
    /sitemap-galleryimages.xml.gz
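
The paths above can be turned into full URLs with a few lines of Python. This is only a sketch: `http://www.example.com` stands in for your own site address, and the helper names `sitemap_urls` and `read_sitemap_bytes` are made up for illustration. The second helper shows how a downloaded `.gz` file could be unpacked with the standard `gzip` module.

```python
import gzip
from urllib.parse import urljoin

# The sitemap paths listed in the post above.
SITEMAP_PATHS = ["/sitemap-base.xml", "/sitemap-galleryimages.xml"]


def sitemap_urls(site_url, zipped=False):
    """Build full sitemap URLs for a site, optionally the .gz variants."""
    suffix = ".gz" if zipped else ""
    return [urljoin(site_url, path + suffix) for path in SITEMAP_PATHS]


def read_sitemap_bytes(data, zipped=False):
    """Decode downloaded sitemap bytes, decompressing them first if zipped."""
    raw = gzip.decompress(data) if zipped else data
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Placeholder address; substitute your own site URL.
    for url in sitemap_urls("http://www.example.com"):
        print(url)
    for url in sitemap_urls("http://www.example.com", zipped=True):
        print(url)
```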