Last week, the transit provider (Limelight/LLNW) that connects one of our routes with BT Internet reported that they are having problems in the London area. I can't tell if this impacts you directly, but I have added your route to today's status update request to see if it is related. I apologize for not posting this information here. Please feel free to ping on this thread as much as you'd like to get status (I watch the forum daily and check my email constantly).
Your traceroute output may be unreliable. Some of the packets that traceroute uses to calculate time are being blocked (intentionally or not) at a certain segment of your route within the UK. This shouldn't be taking this long to fix, but the last time we had issues with UK-based customers, it was due to a bug in the switch software on one of the links. Your issue might be entirely different; we shall see once I hear back from the folks who run the network closer to you.
Stay tuned, you're back on my radar.
How does one 'ping' on a thread?
Does it mean just replying again, like this?
I'm sorry to hear this is still a problem. I never got a response back from BTL/Limelight/et al. on what was up in London. Naturally, other issues stole my attention, so this slipped from the front of my mind.
Checking back at the IPs indicated in the thread, I'm having difficulties tracerouting to your networks. This may or may not be a problem. However, I would appreciate it if the international folks having issues could post or PM me their current public IP, so I can confirm I'm working with fresh data.
A traceroute TO us wouldn't hurt; however, the important one is our route back to you, which I can provide.
Can you run a traceroute to my machine? You can use my website address below as the site runs on my workstation where I upload photos.
OK, do I have to do anything?
Is there some info you need from me? I'm sorry for sounding dumb, but a lot of the terms you use are over my head.
More timing data
My SM site seems very slow (FF 1.5.0.3 on Linux RH WS 4) so I decided to run some time checks.
Using three runs:
>time wget mpmcleod.smugmug.com/PhotosByDate
real 1m54.692s
>time wget mpmcleod.smugmug.com/PhotosByDate
real 0m14.905s
>time wget mpmcleod.smugmug.com/PhotosByDate
real 0m47.825s
The size of the file is less than 18 KB. This is VERY slow. 15 seconds is slow, but almost 2 minutes to download 18 KB!?!?!?
This is only the HTML (what one would see with "view source"), so it doesn't count image downloads or CSS or JS execution.
I can still download the main page (roughly the same size) in about 1 second. This leads me to believe that the slowdown is on the backend server which is generating these pages.
I hope this helps and we can get SM running faster.
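For anyone who wants to repeat these timings in a more systematic way, here is a minimal Python sketch. It converts the `real` values reported by `time` into seconds, summarizes them, and offers a small helper for timing repeated downloads. The function names (`parse_real`, `summarize`, `time_fetches`) and the 300-second timeout are illustrative assumptions, not something used elsewhere in this thread.

```python
import re
import statistics
import time
import urllib.request


def parse_real(text):
    """Convert a `time` value such as '1m54.692s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(seconds):
    """Return the fastest, median and slowest of a list of timings."""
    return {
        "min": min(seconds),
        "median": statistics.median(seconds),
        "max": max(seconds),
    }


def time_fetches(url, runs=3, fetch=None):
    """Time `runs` downloads of `url`; `fetch` can be replaced for testing."""
    if fetch is None:
        def fetch(target):
            # Assumed timeout; adjust to taste.
            with urllib.request.urlopen(target, timeout=300) as response:
                return response.read()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch(url)
        timings.append(time.perf_counter() - start)
    return timings


# The three wget runs reported in this post.
reported = [parse_real(t) for t in ("1m54.692s", "0m14.905s", "0m47.825s")]
print(summarize(reported))
```

To collect fresh numbers, call something like `summarize(time_fetches("http://mpmcleod.smugmug.com/PhotosByDate"))`. Unlike wget, `urllib` needs the `http://` prefix. Like the wget test, this measures only the HTML download, not images, CSS or JS.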
I appreciate your assistance in this, Mike, but I'd like to not intermix the issues here. We have two distinct ones we're tracking: network, and acknowledged database challenges (http://www.dgrin.com/showthread.php?t=35292). This one is overseas network performance, which is probably centered around suboptimal routing (BGP can be dumb) or remote ISPs having trouble. Yes, they can be related overall, but there are enough variables already at play.
I'll have the title of this thread modified to help make it more distinct.
Oh, and I forgot to answer your last suggestion on getting bigger pipes: we already have. Our current network utilization is less than 10% of our installed capacity (ignoring the redundant links). Before the upgrade we were probably around 30%. There are no network bottlenecks within our control at the moment.
Edit:
Although, I'm sitting in a hotel room in Chicago, and I'm getting between 5 and 6 seconds to load your PhotosByDate page on repeated attempts using a timed wget. So this might be a network-related issue for you as well.
ORD to smugmug:
traceroute to hera.smugmug.com (63.81.134.23), 64 hops max, 40 byte packets
1 2005host1 (12.160.3.1) 1.269 ms 0.500 ms 0.330 ms
2 12.160.1.193 (12.160.1.193) 1.359 ms 1.496 ms 1.599 ms
3 * * *
4 br2-a3120s10.attga.ip.att.net (12.123.20.246) 84.651 ms 85.410 ms 109.327 ms
5 tbr2-cl29.sl9mo.ip.att.net (12.122.10.138) 86.623 ms 86.170 ms 86.212 ms
6 tbr1-cl24.sl9mo.ip.att.net (12.122.9.141) 88.014 ms 88.247 ms 87.898 ms
7 tbr2-cl2.sffca.ip.att.net (12.122.10.42) 87.823 ms 88.248 ms 107.440 ms
8 12.123.12.61 (12.123.12.61) 85.472 ms 85.832 ms 85.468 ms
9 12.126.40.42 (12.126.40.42) 86.179 ms 87.085 ms 87.095 ms
10 pri.r1-ge0-2-eq-sj.smugmug.com (206.223.117.70) 85.155 ms 97.153 ms 85.663 ms
I am now getting between 3-7 seconds. MUCH improved.
OK, so last time there was no comment you told me to ping you, or it, but still nada?
I'd appreciate someone at least saying 'yes, we have no idea what that means', something, anything... Otherwise I don't even know if it has been read.
Sorry for the delay, there just really isn't a smoking gun to clearly identify why you are having delays accessing our site. The traceroute you have posted isn't giving helpful info other than that there's some filtering (or timeouts) of the ICMP responses for each segment. I've stepped it back a few hops and traced to your ISP, but still get results within transatlantic SLA for the major providers.
Essentially, everything looks fine other than the issues we've already identified recently (database, etc).
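As a rough illustration of what "filtering (or timeouts)" looks like in Unix-style traceroute output, here is a minimal Python sketch that finds the last hop that actually answered. The sample lines are copied from the Mac traceroute posted in this thread. The helper names (`parse_hop`, `last_responding_hop`) are made up for this example. Note that a run of `* * *` only shows where replies stop. It does not by itself prove that traffic is being dropped there, since some devices simply don't answer traceroute probes.

```python
import re

# Matches round-trip times such as "9.253 ms" in Unix-style traceroute output.
RTT = re.compile(r"(\d+(?:\.\d+)?) ms\b")


def parse_hop(line):
    """Parse one hop line; return None for headers and other non-hop lines."""
    parts = line.split()
    if not parts or not parts[0].isdigit():
        return None
    return {
        "hop": int(parts[0]),
        "rtts": [float(value) for value in RTT.findall(line)],
        "timeouts": parts[1:].count("*"),
    }


def last_responding_hop(lines):
    """Return the number of the last hop that sent at least one reply."""
    last = None
    for line in lines:
        hop = parse_hop(line)
        if hop and hop["rtts"]:
            last = hop["hop"]
    return last


sample = """\
traceroute to hera.smugmug.com (63.81.134.23), 64 hops max, 40 byte packets
1 192.168.1.1 (192.168.1.1) 0.948 ms 0.414 ms 0.389 ms
2 10.121.56.1 (10.121.56.1) 9.253 ms * *
3 * * *
4 * * *
""".splitlines()

print(last_responding_hop(sample))  # prints 2
```

This sketch only understands the `ms` format shown in the sample. The Windows `tracert` layout and the router output that reports `msec` would need their own patterns.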
Smugmug is dead slow for me from Germany at the moment. It takes more than a minute for a user site to show up. Custom domains like mine and Andy's, and the main page and browse page, are fine though! It's just username.smugmug.com.
I also noticed that my route to smugmug.com looks different. A couple of weeks ago I used to take the route from New York to the west coast. Now it looks like this:
Tracing route to hera.smugmug.com [63.81.134.23] over a maximum of 30 hops:
1 2 ms 1 ms 1 ms PC9 [192.168.0.1]
2 * * * Request timed out.
3 48 ms 75 ms 290 ms 217.0.64.214
4 140 ms 141 ms 139 ms was-e4.WAS.US.net.DTAG.DE [62.154.15.38]
5 184 ms 162 ms 287 ms so0-0-0-2488m.ar1.DCA3.gblx.net [208.51.74.17]
6 225 ms 232 ms 240 ms so5-0-0-2488M.ar1.SJC2.gblx.net [67.17.67.154]
7 222 ms 224 ms 330 ms pri.r1-ge0-2-eq-sj.smugmug.com [206.223.117.70]
8 250 ms 248 ms 258 ms hera.smugmug.com [63.81.134.23]
Trace complete.
Are these changes due to your effort to decentralize the site?
Thanks,
Sebastian
PS: I don't know why the first server of my ISP timed out. Guess that has got to do with my strange problem.
EDIT: Problem is the same on IE6 and FF. I'm using www.congster.de which should be powered by www.telekom.de - the biggest provider in Germany.
Edit: Ok so the current slowness is due to a software update at Smugmug. Anyway, here's a traceroute from Finland where Smugmug is intermittently kind of sluggish...
Tracing route to hera.smugmug.com [63.81.134.23]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 192.168.0.1
2 7 ms 8 ms 7 ms so-erx1.hel.elisa-laajakaista.fi [80.186.0.1]
3 5 ms 5 ms 5 ms jnx-xdsl2-ae0.hel.kolumbus.net [193.229.6.81]
4 5 ms 6 ms 5 ms ge-1-0-0.jnx-ka.hel.kolumbus.net [193.229.255.12]
5 12 ms 12 ms 12 ms so-1-0-0.kista10.sto.kolumbus.net [193.229.1.26]
6 230 ms 215 ms 199 ms kista1.sto.kolumbus.net [62.248.254.35]
7 106 ms 105 ms 105 ms telebw1-pos1.nyc.kolumbus.net [193.229.1.98]
8 116 ms 116 ms 117 ms nyiix.he.net [198.32.160.61]
9 183 ms 184 ms 183 ms pos0-1.gsr12416.sjc2.he.net [216.218.223.133]
10 183 ms 184 ms 183 ms pri.r1-ge0-2-eq-sj.smugmug.com [206.223.117.70]
11 182 ms 181 ms 181 ms hera.smugmug.com [63.81.134.23]
Trace complete.
Comments
The machine that goes bing
I love Monty Python
That's sad.
I find SM quite slow even from the University of British Columbia (Canada) which has a VERY big pipe. I routinely (2-3 times a week) download 10 GB in a few hours from a site in the DC area.
Of course other things can also be a factor.
* content on each page (big images, etc.)
* content caching (both on the browser side and on your systems)
* content complexity (lots of JS and CSS can slow a browser down)
* I assume the pages are generated dynamically(?) and that takes time as well
Perhaps we should ask people to time the SM home page (www.smugmug.com) in addition to their pages, as that will confirm (or fail to confirm) internet issues.
For me it takes about 3 seconds from clicking GO to finish the display on FF under Windows.
Using Linux I was able to determine that the process of loading the HTML portion of the main page takes less than 1 second of wall time (including time to write the results to the disk). This is compared to more than 2 seconds for my home page.
A page far more complex like www.cnn.com takes just over 1 second. My SM site is not nearly as complicated as the cnn.com site.
The sizes of the text files are:
www.smugmug.com: 12.6 KB
mpmcleod.smugmug.com: 30.8 KB
www.cnn.com: 100.2 KB
It could be a pipe issue (SM just needs bigger pipes) or the problem may be on SM's end. It may be as simple as the servers taking too long to generate the dynamic pages. The heavier the load the longer it will take regardless of the size of the internet pipes.
Just thoughts, maybe it will spark something?
smugmug nickname: mpmcleod
http://www.michaelmcleod.com/
Presumably you have heard of Monty Python's Flying Circus?
It's from one of their films :-)
AHA! The thick plottens!:):
Windows: Start > Run > type 'cmd' (without the quotes) in the box, then type 'tracert www.smugmug.com'
Mac: Finder > Applications > Utilities > Terminal, then type (without the quotes) 'traceroute www.smugmug.com'
Paste that here or PM me if you'd like to keep your info private.
Thanks!
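If you would rather not remember which command goes with which system, the choice can be sketched in a few lines of Python. This is only an illustration under the assumption that Python is installed. The function names (`traceroute_command`, `run_traceroute`) and the 300-second timeout are made up for this example.

```python
import platform
import subprocess


def traceroute_command(host, system=None):
    """Build the trace command: 'tracert' on Windows, 'traceroute' elsewhere."""
    system = system or platform.system()
    tool = "tracert" if system == "Windows" else "traceroute"
    return [tool, host]


def run_traceroute(host, timeout_s=300):
    """Run the trace and return its text output, ready to paste into a post."""
    result = subprocess.run(
        traceroute_command(host),
        capture_output=True,
        text=True,
        timeout=timeout_s,  # assumed limit; a trace full of timeouts can be slow
    )
    return result.stdout
```

Calling `print(run_traceroute("www.smugmug.com"))` would then produce the same output as typing the command by hand.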
From smugmug edge routers to ORD:
1 ge2-11.fr1.sjc.llnw.net (206.223.117.33) 0 msec 0 msec 0 msec
2 12.119.139.13 [AS 7018] 0 msec 0 msec 4 msec
3 tbr2-p011402.sffca.ip.att.net (12.123.12.38) [AS 7018] 64 msec 64 msec 64 msec
4 tbr1-cl2.sl9mo.ip.att.net (12.122.10.41) [AS 7018] 64 msec 68 msec 64 msec
5 tbr2-cl24.sl9mo.ip.att.net (12.122.9.142) [AS 7018] 64 msec 64 msec 64 msec
6 tbr2-cl29.attga.ip.att.net (12.122.10.137) [AS 7018] 68 msec 64 msec 64 msec
7 ar5-p3110.attga.ip.att.net (12.123.20.129) [AS 7018] 60 msec 60 msec 64 msec
OK I did as you suggested (on my Mac) and although it means absolutely nothing to me I'm sure it makes sense to you!
Let me know the implications?
Here it is:
Last login: Sat Jun 3 18:57:51 on ttyp1
Welcome to Darwin!
simons-power-mac-g5:~ simon$ traceroute www.smugmug.com
traceroute to hera.smugmug.com (63.81.134.23), 64 hops max, 40 byte packets
1 192.168.1.1 (192.168.1.1) 0.948 ms 0.414 ms 0.389 ms
2 10.121.56.1 (10.121.56.1) 9.253 ms * *
3 * * *
4 * * *
5 * * *
6 * * *
7 * * *
8 * * *
9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
31 * * *
32 * * *
33 * * *
34 * * *
35 * * *
36 * * *
37 * * *
38 * * *
39 * * *
40 * * *
41 * * *
42 * * *
43 * * *
44 * * *
45 * * *
46 * * *
47 * * *
48 * * *
49 * * *
50 * * *
51 * * *
52 * * *
53 * * *
54 * * *
55 * * *
56 * * *
57 * * *
58 * * *
59 * * *
60 * * *
61 * * *
62 * * *
63 * * *
64 * * *
simons-power-mac-g5:~ simon$
a bell's distant ring is carried on the wind......
Tumble weed giving form to the whistling breeze.....