This is not correct. Googlebot etc. get a version of the page with the javascript content already rendered into the HTML. I don't know what user agent that webconfs tool is using, but if you change the user agent in Chrome's developer tools to the Googlebot user agent, you can see what Google sees.
Well, I beg to differ. Have you looked at the actual source code being served up (ctrl+u in Chrome)? It's javascript, not HTML, and that's what the bots are seeing. Chrome is translating it into HTML, but bots are rather dumb and don't do that type of translation.
Basically yes. Google is getting better at crawling non-html links (links that are embedded in javascript, a flash object & other non-html code), but there's no guarantee. Crawlable links allow Google (and other bots) to create a relationship from one page to the next. These links also expose new URLs that might have been missed in the sitemap or on previous bot visits.
So if you use a standard, non-dropdown, link oriented menu you're good to go? The same if you leave off the slideshow & keyword cloud?
Well, it doesn't sound like you changed the user-agent to googlebot like I said.
Make sure you change the user-agent to googlebot, then view source. It's HTML. If you just view a page in Chrome, you're going to get the pure javascript version.
here is your site after changing the user-agent to googlebot:
you'll notice we identify googlebot, and all your meta info is there:
here are your menus:
and your keyword cloud:
"View Source" shows you exactly what Googlebot will see when it views the code. Like I said earlier, Googlebot has gotten better at crawling javascript and other non-html objects (Flash, etc..), but I haven't seen any proof that Googlebot can crawl and then process javascript without any issues. Maybe you have other proof that it can? If so, I'm all ears...
"View Source" shows you exactly what Googlebot will see when it views the code. Like I said earlier, Googlebot has gotten better at crawling javascript and other non-html objects (Flash, etc..), but I haven't seen any proof that Googlebot can crawl and then process javascript without any issues. Maybe you have other proof that it can? If so, I'm all ears...
I'm not sure how else to explain this to you. I'm clearly not doing a very good job of it.
When googlebot visits your site, it gets the HTML rendered version (see the screenshots I posted), not the javascript version that you get when visiting from a browser.
No, I totally understand what you're saying. What I'm talking about is the fact that a bot does not render content, it merely captures source code. The browser is the element that has a rendering engine, which allows it to render javascript code. Bots do not have that ability; they only capture code. Apparently Google now has the ability to take the javascript code that the bot has captured and use its own rendering engine to try to find links, etc. The jury is still out as to how well they do this.
A sample of the link code that the bot sees is this (from my site):
"Url":"http:\/\/www.chipjonesphotography.com\/FineArt\/Abstract"
This is what the bot sees. It's not HTML, and the bot doesn't translate the code into HTML; it merely captures it. If Google can determine that this is a link (a backend process), then the bot will crawl that link to capture the source code there. If Google can't determine that this is a link they need to follow, then it's ignored. This is what I'm talking about.
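As an illustration of the distinction being drawn here, the sketch below (plain Python, reusing the URL from the post above purely as an example) shows the difference between a URL buried in a JSON/javascript blob, which only something that parses or executes that data can recover, and a URL sitting in an ordinary HTML anchor, which any crawler can read straight out of the markup:

    import json
    from html.parser import HTMLParser

    # URL embedded in a JSON blob, as it appears in the javascript-driven page source
    js_blob = '{"Url":"http:\\/\\/www.chipjonesphotography.com\\/FineArt\\/Abstract"}'
    print(json.loads(js_blob)["Url"])  # only recoverable by parsing the script data

    # The same URL in a plain HTML anchor, readable without executing anything
    class AnchorFinder(HTMLParser):
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                print(dict(attrs).get("href"))

    AnchorFinder().feed('<a href="http://www.chipjonesphotography.com/FineArt/Abstract">Abstract</a>')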
I'm afraid you don't understand, and that might be because the screenshots aren't showing. Here are direct links to what I was referring to before.
Where we identify googlebot and render the proper meta tags:
https://www.evernote.com/shard/s212/sh/9dd96350-d71e-43bd-a217-7cce7b73d5fa/64d13488a0d1d039ebd4cc7f1f03d98c
How googlebot sees your dropdown menu:
https://www.evernote.com/shard/s212/sh/6ee9512d-b0cb-491d-8e19-532ddcb9a25e/e0dcd076d5d5436de1bb03b21707af3e
And how googlebot sees your keywords:
https://www.evernote.com/shard/s212/sh/5ade2fa6-0e6e-4243-888a-11f66414ab9c/4c8cda0adcbb5eb3d3e6ba16cc041673
All html, as seen by view-source.
Does this make sense now?
Again, I know this technique very well, but Googlebot is not a browser, nor does it render code. I guess I'll just leave it there. I do not share your feeling that New SmugMug content is being indexed as well as it could be, and I am confident that the New SmugMug implementation, with its javascript-based widget solution, is the primary culprit. If you're happy with SmugMug, then great. I am not and will likely move on (for a number of reasons)...
Again, I know this technique very well, but Googlebot is not a browser, nor does it render code.
This I think is where we are not seeing eye to eye. We aren't asking googlebot to render anything. SmugMug servers pre-render the javascript into html and send that html to googlebot. Googlebot sees the html, it doesn't need to process the javascript into html like the browsers do.
The links I shared earlier were of the source of the page sent from the server when I told Chrome to identify itself as Googlebot, not the source after javascript executed. If you compare that to the view-source of the page when just letting Chrome identify itself as Chrome, you will see, as you have pointed out, that the source *is* mostly javascript that needs to be executed (rendered), and you are correct that this is not good food for Googlebot. But, again, this is *not* the same source that gets sent to Googlebot. Googlebot gets html.
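For readers trying to picture what's being described here, this is a minimal sketch of the general "serve pre-rendered HTML to known crawlers" idea, in plain Python. It is not SmugMug's actual code; the bot list and the two helper functions are made-up stand-ins for illustration only.

    KNOWN_BOTS = ("googlebot", "bingbot", "yandexbot")

    def render_html_snapshot(path):
        # Stand-in for server-side rendering: the real thing would run the page's
        # javascript (or use cached output) and return finished HTML with menus,
        # keyword clouds, and meta tags already in the markup.
        return "<html><body><nav><a href='/FineArt/Abstract'>Abstract</a></nav></body></html>"

    def serve_js_app(path):
        # Stand-in for the normal response: a shell page plus the javascript that builds it client-side.
        return "<html><body><script src='/app.js'></script></body></html>"

    def handle_request(path, user_agent):
        if any(bot in user_agent.lower() for bot in KNOWN_BOTS):
            return render_html_snapshot(path)   # crawlers get finished HTML
        return serve_js_app(path)               # browsers get the javascript version

    print(handle_request("/", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))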
I think you guys (ghipj, bwg) are going a bit too far with technicalities that don't help us average SM users understand if and how we could improve SEO performance on our SmugMug websites.
Well, there was speculation going on so in the interest of making sure the community had the proper information, some technical explanation was needed. My apologies for not being as clear as I could have been initially.
Fill in the appropriate SEO boxes in your account settings (settings > discovery > search) and we will do the rest.
Any content blocks you use will be available to search bots (menus, keyword clouds, etc.)
Well, saying SM will "do the rest" is a bit oversimplifying things. There are very basic and large SEO problems with the new SM. This thread is a great example:
http://www.dgrin.com/showthread.php?t=243019
Duplicate meta descriptions, and particularly duplicate title tags, are big negatives for ranking.
Please bear with me if I sounded rude, I didn't mean to; I'm sure everything you were discussing is well worth it. But from the point of view of less SEO-skilled SmugMug users like myself, who I believe are by far the largest part, the discussion had come to a point where it could only add more confusion to our already confused minds about SEO issues.
Any content blocks you use will be available to search bots (menus, keyword clouds, etc.)
What exactly do you mean by this? The thread started with a rebuttal of this very point, while you are saying those content blocks don't hurt SEO; it would be good if SM staff could say a clarifying word on this.
I am SM staff, and clarifying the speculation on content blocks was the reason I joined the thread
bwg, can you tell me what user agent string you use for google image search? For standard google search I use 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' which seems to work fine but I have less success finding something that works reliably with google image search. Often I get time outs retrieving the page.
I should say, however, that it's been a few weeks since I looked at this in detail, as for me Google image and web search seem to work. My only slight confusion is that Google Webmaster Tools says 154 images have been submitted and none indexed, but Google image search is showing 124 of the images.
Thanks for your input on this bwg
Rich
Would also like to know about this. Never understood the correlation.
Google brings up a few images under search while webmaster doesn't say any are indexed.
However, this appears relevant to your questions regarding submitted vs. indexed images:
https://support.google.com/webmasters/answer/178636?hl=en
bwg, I think you and the rest of the SM dev staff should review this and then tell me if what I've said is misleading. Not looking to battle, just trying to help SM get a better product out there that the search engines can more easily index. Here's the documentation: https://developers.google.com/webmasters/ajax-crawling/docs/learn-more
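For anyone following along, the documentation linked above describes Google's (since-retired) AJAX crawling scheme, in which a URL containing "#!" is fetched by the crawler with the fragment moved into an "_escaped_fragment_" query parameter, and the server is expected to answer with an HTML snapshot. A simplified Python sketch of that URL mapping (it skips the character-escaping details in the spec, and the example URL is made up):

    def escaped_fragment_url(url):
        # Map "http://example.com/page#!state" to the form the crawler would request:
        # "http://example.com/page?_escaped_fragment_=state" (simplified; no extra escaping).
        base, sep, fragment = url.partition("#!")
        if not sep:
            return url  # no hash-bang, nothing to rewrite
        joiner = "&" if "?" in base else "?"
        return base + joiner + "_escaped_fragment_=" + fragment

    print(escaped_fragment_url("http://www.example.com/gallery#!/Abstract"))
    # -> http://www.example.com/gallery?_escaped_fragment_=/Abstract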
I've only been addressing the statement you made about content blocks not being visible to googlebot because they are a "javascript based widget solution". That statement is misleading and incorrect.
Borrowing terminology from the documentation you referenced, we *do* create an "HTML Snapshot" for googlebot. In fact, we use a process similar in function to #3 here: https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot
If that doesn't click for you, I'm not sure what else to say.
Well, I guess we're at an impasse then, because there's absolutely no way I can tell if what you say you're doing (programmatically) is actually what you are doing, and I don't see anything associated with an HTML snapshot.
All I have to go on are the multiple issues being reported by tools like Google Webmaster Tools, Xenu Sleuth, Bright Local, and the non-html content being served up in the page source. I guess I'm imagining things...
I've provided incontrovertible evidence of what we serve googlebot. Ignoring the evidence is certainly your prerogative.
And like I've already mentioned, the reason tools like Xenu Sleuth and that webconfs.com one you posted earlier report errors is that they don't identify themselves with a googlebot (or other recognized bot) user-agent. The Xenu Sleuth FAQ (#20) even mentions this.
Try the following to see what googlebot sees…
1. Enter Chrome's dev tools (View > Developer > Developer Tools).
2. Click the settings icon (bottom right on a Mac).
3. Click the "Overrides" tab.
4. Click the "Enable" and "User Agent" checkboxes.
5. Select "Other" for the "User Agent" drop down and enter "googlebot" in the text field.
6. Visit your SmugMug page and witness all of the HTML they are rendering
This worked well for me but it's possible they aren't rendering HTML for sources that aren't reputable (bwg, correct me if I'm wrong). I hope this works for you!
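If you'd rather check this from a script than from Chrome, here is a rough equivalent in Python: fetch the same page twice, once with a generic browser user agent and once with the standard Googlebot user-agent string quoted earlier in the thread, and compare what comes back. The URL is a placeholder; use your own page, and keep in mind that counting anchor tags is only a crude proxy for "pre-rendered HTML vs. javascript shell".

    import urllib.request

    def fetch(url, user_agent):
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read().decode("utf-8", errors="replace")

    url = "http://www.example.com/"  # substitute your own gallery or home page
    browser_ua = "Mozilla/5.0"
    googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

    browser_html = fetch(url, browser_ua)
    bot_html = fetch(url, googlebot_ua)

    # If the bot copy contains far more anchor tags, the server is pre-rendering
    # HTML for crawlers rather than shipping only the javascript version.
    print("anchors as browser:  ", browser_html.count("<a "))
    print("anchors as googlebot:", bot_html.count("<a "))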
While I was browsing the forum in search of tips for improving my website SEO, I stumbled on this thread and... well, what a shock!
I just decided not to watermark my images and rely on RCP, because I've always thought watermarks visually annoy visitors, and bam!, it turns out RCP can sink search results.
Leaving aside the technicalities, which I am not able to fully understand and therefore discuss, I'd like to get a clear understanding of the point of this thread as it emerges from the posts so far: are you guys really saying New SmugMug SEO sucks?
Please bear with me if I sound rude, but SEO is more than important to me (as it is for many others, I figure), since I have to rely a lot on web searches for my business.
I received, as usual, many (interesting and convenient) offers from other platforms before I decided to stay with SmugMug (I had a SM legacy site), and, to be honest, the main reason I felt confident going ahead with them again was that I had been rather happy with how SEO worked on my legacy site.
Now, reading your worries about the actual SEO performance of New SmugMug has obviously made me quite nervous.
I feel your pain about the RCP thing. What's really been ticking me off about that whole thing is that they had decided a couple years ago (wisely, imho) to stop keeping our RCPed images from Google, after talking to we the users!... uh, the people who are paying for this service. BUT then they evidently reversed the decision, told no one, and are now completely refusing to own up to it, to discuss it with us (yes, us, the same users who are now being charged much higher rates for the service but getting many fewer images served up to Google).
Months ago I begged and begged for an official SmugMug response about the decision's reversal. All we got was complete silence. No word as to: 1. whether they consciously reversed the decision; 2. if consciously, why they did that after consulting with people and finding out the current procedure is not wanted; 3. if not consciously, but because some designers of all the new stuff didn't know the history of that decision, where is the explanation/discussion of all this with those of us who are paying for this crappy situation where RCP ruins our SEO? My hunch for a long time has been that SmugMug & Google have some problem/issue between them that isn't or hasn't been resolved in ways that do us any good. What I hear/read between the lines is that some part of this is an embarrassment. There are several odd Google issues that simply are not being explained & have gone unresolved for years. (Maps is only one of them.)
And now this SEO/RCP thing that affects so many of us exists (& lots of SmugMuggers probably have no clue!) & we get no satisfaction. It feels very rude, in ways that would not have happened at SmugMug even a couple years ago. I feel completely disregarded, actually. Here I have this great option of using RCP in some places where I want to and choose to. And yet the official SmugMug policy currently (evidently) is to not talk about any of its effect on my SEO, something that's important to me. It seems they think these questions will just go away. But they won't. I think about it every day & wonder, and it's not going away. It makes me so tempted to leave and find a place where honesty and transparency and real concern for the things we care about are still the norm.
5. Select "Other" for the "User Agent" drop down and enter "googlebot" in the text field
This does indeed work perfectly for web search. For image search I've been using 'Googlebot-Image/1.0' but I often find I get timeouts waiting for the page to load. Do you happen to know if that is the correct UA? (EDIT: no timeouts today, I wonder if there was a problem before)
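On the timeout question: one way to narrow down whether it's the UA string or just a flaky connection is to request the page with that same 'Googlebot-Image/1.0' header, an explicit timeout, and a couple of retries, and see whether the failures are consistent. A small Python sketch (the URL is a placeholder for your own page):

    import socket
    import urllib.error
    import urllib.request

    IMAGE_BOT_UA = "Googlebot-Image/1.0"  # the user-agent string discussed above

    def fetch_as_image_bot(url, timeout=20, retries=3):
        req = urllib.request.Request(url, headers={"User-Agent": IMAGE_BOT_UA})
        for attempt in range(1, retries + 1):
            try:
                with urllib.request.urlopen(req, timeout=timeout) as resp:
                    return resp.getcode(), len(resp.read())  # status code and body size
            except (urllib.error.URLError, socket.timeout) as exc:
                print("attempt", attempt, "failed:", exc)
        return None, 0

    print(fetch_as_image_bot("http://www.example.com/"))  # substitute your own page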
The proof is in the search metrics. I guess it's just time to move on...
Just an FYI about that. I have RCP set to ON in all but one gallery. The other day I saw someone did a Google Image search for "Chinese algae eater and discus." The search turned up a bunch of photos from my site which are RCPed.
So images do indeed appear in image search when using RCP. They appear to be smaller thumbs as found via a keyword search page. "Visit page" opens the lightbox. "View image" opens just the image, with .JPG in the URL. It can then be snagged by a viewer using Right Click. It also doesn't count towards traffic in google stats. It does have my watermark on it.
Also, the interesting thing is if they click "Visit Page" on any thumbnail from the gallery other than the one for the actual Chinese Algae Eater, say one of a discus fish, the Chinese Algae Eater photo is the one that opens in the lightbox.
I just tried a Google images search for "Chinese algae eater" and none of your images turned up, although I only checked the first 50-60.
So images do indeed appear in image search when using RCP.
They do indeed. All my photographs have RCP and always have, and I can find them very quickly when doing an image search in any browser. A lot of the things I photograph are hard to find, so I can usually find them on the first results page, but they are always there somewhere.
The search I mentioned was "Chinese algae eater and discus." The "and discus" must have made my images unique enough for them to appear towards the top. I am not at all surprised that just "Chinese algae eater" doesn't show hits, as that gallery is very new. I only reprocessed the photos and uploaded them in the past month or two.