trouble accessing home page and sluggish

Comments

  • Options
    havanesehavanese Registered Users Posts: 197 Major grins
    edited September 5, 2009
    docwalker wrote:
    Randy, wow... take a look at the spike at hop 9 in your traceroute. That is not good.

    Be sure to run that linequality test for us. That will also help figure out what is going on.

    --Doc

    Here is the line test.

    http://www.dslreports.com/linequality/nil/2561903
    Randy P.
    Fuji X shooter
    Thoughts and Images
  • Options
    docwalkerdocwalker Registered Users Posts: 1,867 SmugMug Employee
    edited September 5, 2009
    There is some data loss at your IP that you need to check. See the 2% loss. Check your connections and possibly restart the modem/router. That may help. You want as clean a line as possible when dealing with photos and videos.

    Let me know what you get for the trace to your custom domain.
    SmugMug Support Hero
    http://help.smugmug.com
  • Options
    havanesehavanese Registered Users Posts: 197 Major grins
    edited September 5, 2009
    Had some of my Facebook and Friendfeed folks try and they are able to access our site, so it's beginning to look like an issue with our local ISP.

    Not the first time they have had issues with their routers (fairly new fiber to premises company)
    Randy P.
    Fuji X shooter
    Thoughts and Images
  • Options
    docwalkerdocwalker Registered Users Posts: 1,867 SmugMug Employee
    edited September 5, 2009
    That is entirely possible as it is loading fine for me here from my home in Virginia.
    SmugMug Support Hero
    http://help.smugmug.com
  • Options
    havanesehavanese Registered Users Posts: 197 Major grins
    edited September 5, 2009
    docwalker wrote:
    That is entirely possible as it is loading fine for me here from my home in Virginia.

    Tracing route to a539.b.akamai.net [64.208.176.11]
    over a maximum of 30 hops:

    1 <1 ms <1 ms <1 ms 192.168.1.1
    2 3 ms 2 ms 1 ms 69.196.212.1
    3 1 ms 1 ms 2 ms 69.196.208.1
    4 7 ms 8 ms 9 ms core-01-g4-22-111.lsvokyqy.cinergycom.net
    5 9 ms 278 ms 300 ms lsvokyqyt-irt-01-G0.cinergycom.net
    6 12 ms 12 ms 13 ms cer-edge-14.inet.qwest.net [65.117.139.61]
    7 14 ms 13 ms 13 ms cer-core-02.inet.qwest.net [205.171.139.5]
    8 * * 30 ms chp-brdr-03.inet.qwest.net [67.14.8.190]
    9 113 ms 226 ms 248 ms te8-1-10G.ar2.CHI2.gblx.net
    10 26 ms 25 ms 24 ms te1-1-10G.ar4.CHI2.gblx.net
    11 15 ms 13 ms 14 ms 64-208-176-11.nas1.mon.ny.frontiernet.net
    Trace complete.
    Randy P.
    Fuji X shooter
    Thoughts and Images
  • Options
    docwalkerdocwalker Registered Users Posts: 1,867 SmugMug Employee
    edited September 5, 2009
    Check 5 and 9 now. Those are nasty. The numbers should start at 1 and increase line by line without going up and down like yours are doing. When you see that, it indicates a problem. So some of it is at cinergycom and Global Crossing.
    SmugMug Support Hero
    http://help.smugmug.com
  • Options
    havanesehavanese Registered Users Posts: 197 Major grins
    edited September 5, 2009
    docwalker wrote:
    Check 5 and 9 now. Those are nasty. The numbers should start at 1 and increase line by line without going up and down like yours are doing. When you see that, it indicates a problem. So some of it is at cinergycom and Global Crossing.

    Yeah, working with the local ISP. The clueless guy on the phone said he wasn't supposed to page his supervisor until next Tuesday unless it was an emergency.
    Randy P.
    Fuji X shooter
    Thoughts and Images
  • Options
    docwalkerdocwalker Registered Users Posts: 1,867 SmugMug Employee
    edited September 5, 2009
    Hahahaha, you have got to be kidding me. I don't want to bug Don (our CEO) on a holiday weekend. But I know that one text message or a phone call would have him or any of our ops team on it in a matter of minutes.
    SmugMug Support Hero
    http://help.smugmug.com
  • Options
    havanesehavanese Registered Users Posts: 197 Major grins
    edited September 5, 2009
    docwalker wrote:
    Hahahaha, you have got to be kidding me. I don't want to bug Don (our CEO) on a holiday weekend. But I know that one text message or a phone call would have him or any of our ops team on it in a matter of minutes.

    Last question

    I can seem to access a lot of other SmugMug users' photos, just not my own. Is that because their data is housed at another location (IPs)?
    Randy P.
    Fuji X shooter
    Thoughts and Images
  • Options
    AndyAndy Registered Users Posts: 50,016 Major grins
    edited September 5, 2009
    havanese wrote:
    Last question

    I can seem to access a lot of other SmugMug users' photos, just not my own. Is that because their data is housed at another location (IPs)?
    No.
    Work with Doc, he's emailing you from the desk.
  • Options
    nullroutennullrouten Registered Users Posts: 9 Beginner grinner
    edited September 6, 2009
    routers 101
    docwalker wrote:
    Randy, wow... take a look at the spike at hop 9 in your traceroute. That is not good.

    Be sure to run that linequality test for us. That will also help figure out what is going on.

    --Doc


    Doc, I feel the need to clarify traceroute a bit. While this user *may* have had a connection issue... both the traceroutes looked fine, and I'll explain why....

    Traceroute sends packets to a target IP with an incrementing TTL (time to live), which really isn't time-based at all. TTL is a hop counter, and it prevents packets from looping endlessly. When a router receives a packet for forwarding, it decrements the TTL by 1 and forwards it on. This process happens automatically in hardware. When a packet is received with a TTL of 1, the router sets it to 0 and generates an ICMP TTL-exceeded message back to the sender. When you run traceroute, first you send a packet with a TTL of 1... the first hop decrements it to 0 and generates an ICMP message... and that's how you know what hop 1 is. Then your computer sends a packet with a TTL of 2... and the second hop must respond, etc....
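
    The TTL walk described above can be sketched as a small simulation. This is only a toy model, not a real traceroute: it sends nothing on the wire, and the router names (`r1`, `r2`, `r3`, `dest`) and the functions `send_probe` and `traceroute` are made up for illustration.

```python
from typing import List, Tuple


def send_probe(path: List[str], ttl: int) -> Tuple[str, str]:
    """Simulate one probe travelling along `path` with a starting TTL.

    `path` lists the routers in order, with the destination last.
    Every router decrements the TTL; the one that brings it to 0
    answers with "ttl-exceeded". If the probe survives every router,
    the destination itself answers with "reached".
    """
    for router in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return ("ttl-exceeded", router)
    return ("reached", path[-1])


def traceroute(path: List[str], max_hops: int = 30) -> List[str]:
    """Discover the path one hop at a time by raising the TTL from 1."""
    discovered = []
    for ttl in range(1, max_hops + 1):
        reply, responder = send_probe(path, ttl)
        discovered.append(responder)
        if reply == "reached":
            break
    return discovered


if __name__ == "__main__":
    # Hypothetical path: three routers, then the target host.
    for hop_number, name in enumerate(traceroute(["r1", "r2", "r3", "dest"]), 1):
        print(hop_number, name)
```

    Each answer comes from a different router, which is why one slow answer only tells you about that router's ability to generate the reply.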

    As such, a packet from computer A to computer B is flowing through any number of hops, and if a packet is fast/clean to the end node (or any later node) then clearly the intermediate hops are okay, because otherwise everything past the "bad hop" would be "bad". If one hop appears bad, but the hops after it appear fine... it's unlikely that the appearance is indicative of a real problem. Packets must flow through it to get past it.

    Large routers have *different* components dealing with packets to and through the router. For a computer, it's all the same... packets enter the NIC and hit the CPU. For a router... if a packet is flowing through, there is a matrix of fast hardware doing lookups in ternary CAM (or SRAM); the packet's destination address is effectively wired to the egress queue. If a packet needs to be responded to (which happens when the router needs to expire the TTL), the packet must head off the highway onto a side road, up to the CPU; the CPU thinks about it and responds. For large internet-class routers we're talking per-linecard backplanes of ~40 Gbps, and the lane to the CPU is often 100 Mbps or 1 Gbps (a factor of 40 to 400 times slower). Also, the control planes of all the carrier-class routers are less powerful than my laptop... because they are only involved in control messages and forwarding-table setup.

    What you saw in both of the traceroutes (the two where you said "whoa look at hop X!") was one "bad" hop followed by no problems... which simply means that router was having trouble generating TTL-expired messages, but clearly not having problems transiting packets, as all following hops were fine. Since you're looking at a pipeline, *if* there is a clog... it will affect everything behind it for that given source/dest IP pair.

    Here's a fabricated example of a false positive:

    hop time
    1 10
    2 11
    3 11
    4 567
    5 14
    6 14
    7 end

    Here's a fabricated example of a true problem:
    hop time
    1 10
    2 11
    3 11
    4 567
    5 487
    6 580
    7 end
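
    The rule behind the two fabricated tables can be sketched in a few lines of Python. Treat it as a rough heuristic under stated assumptions, not a diagnostic tool: the function name `classify_hops`, the `factor` threshold of 5 and the 1 ms floor are arbitrary choices of mine, and the input is one round-trip time per hop.

```python
from typing import List


def classify_hops(rtts_ms: List[float], factor: float = 5.0) -> List[str]:
    """Label each hop by comparing it with the hops before and after it.

    A hop counts as "spiked" when its RTT is more than `factor` times the
    fastest earlier hop. A spiked hop followed by a much faster later hop
    is labelled a likely false positive, because traffic evidently gets
    through that router quickly. A spiked hop with no fast hop after it
    is labelled a possible real problem.
    """
    labels = []
    for i, rtt in enumerate(rtts_ms):
        earlier = rtts_ms[:i]
        later = rtts_ms[i + 1:]
        # Floor the baseline at 1 ms so "<1 ms" hops do not flag everything.
        spiked = bool(earlier) and rtt > factor * max(min(earlier), 1.0)
        if not spiked:
            labels.append("ok")
        elif later and factor * min(later) < rtt:
            labels.append("likely false positive")
        else:
            labels.append("possible real problem")
    return labels


if __name__ == "__main__":
    print(classify_hops([10, 11, 11, 567, 14, 14]))     # first table
    print(classify_hops([10, 11, 11, 567, 487, 580]))   # second table
```

    On the first table only hop 4 is flagged, and only as a likely false positive; on the second table hops 4 through 6 all come back as a possible real problem.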


    Now, I won't even get into ECMP (equal-cost multipath) hashing and the QoS (quality of service) tricks providers play to prioritize ICMP while normal TCP and UDP are in the crapper... there are multiple reasons to do it, some good, some bad. The nickel summary is... ICMP traceroute is not always accurate in determining problems, as ICMP packets might be artificially prioritized while your actual packets are getting dropped. Tools such as lft (Layer Four Traceroute) and tcptraceroute are good tools that mimic normal traffic (you can send packets on TCP/80 and the routers think you're talking to a web server).

    One thing is certain: if you see results like my fabricated "true problem" above... under ICMP prioritization or not, it is unlikely that it's a false positive and very likely to be a real problem.

    One other false positive that people find is where the traceroute turns to stars at a given hop, and continues to star out all the way to the end. People claim this is a clear problem. If you are able to hit a webpage at all on IP x.x.x.x and then trace to x.x.x.x and see this... it's not showing that all connectivity is lost... it's just showing that traceroute is filtered at some point past the last hop where you see an IP... because if all connectivity were lost, how could you even load any page (broken or not)?
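
    One way to make that sanity check explicit is to try a plain TCP connection to the same address before reading anything into a row of stars. A minimal sketch, assuming Python's standard `socket` module is available; the function name `tcp_reachable` and the default port and timeout are my own choices.

```python
import socket


def tcp_reachable(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    If this succeeds while traceroute shows only stars toward the same
    address, the stars most likely mean the probes or their replies are
    being filtered, not that connectivity is lost.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

    A False result here does not prove the path is broken either (the port may simply be closed), so it is best read alongside the trace and not in place of it.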

    Hope this helps.

    Sincerely,
    A router geek.
    (nullrouten)
  • Options
    docwalkerdocwalker Registered Users Posts: 1,867 SmugMug Employee
    edited September 6, 2009
    Router Geek,

    Thanks for the great information. I want to clarify a bit. I never claim to be an expert. I just know what I see. You say that one spike is not indicative of a problem. But I know that when I have trouble with my Comcast connection and see that exact symptom, they are related. Repeated test results over months of testing have shown this to be true for me. I don't run just pings and traceroutes. I also run other network monitoring tools. One of our Sorcerers lives a couple hundred miles from me but also uses Comcast and gets similar results. He and I compare trace results when our ISP starts acting up.

    A point that everyone has to remember is that these tests are snapshots in time. A connection problem may happen at random intervals and one set of tests may not capture the entire problem.

    One user reporting a problem is often a connection problem on their end. A problem on our end usually affects thousands at once, or more depending on whether Akamai is involved. When a customer contacts us like this, the support heroes will often ask other heroes to try the site. Because the heroes work from home and use standard internet connections just like our customers, we spot SmugMug connection-related problems very fast. Every connection we use is different, so we get an excellent sample.

    Some folks just assume that when they are connected to the internet, they should be able to get to everything. "I can get to YouTube, you must be down." "Facebook is fast. Why are you slow?" Just trying to explain how routes work is a pain, then you have to also throw in that we use Akamai to help speed up your sites. :-)

    The bigger issue is that unless the problem is at your local router or on SmugMug's end, there is very little either of us can do. When we do see an issue on our end, the Heroes contact our ops guys and they take care of contacting SmugMug's ISPs if needed. If it's on the customer's end, we usually recommend restarting the router/modem, checking the cables, and, if that does not help, contacting the ISP. For uploading we also recommend not using wireless when possible, as interference can be a problem. Other than that, our hands are tied. Usually user ISP problems resolve themselves.

    Not saying that you are wrong at all. What you say makes perfect sense and is very close to what I know from my long and winding computer/network career. But I have to rely on these tests as there is not much else we can ask our customers to do. Asking some folks to download (with a crappy connection), install, and run advanced network monitoring tools would be a nightmare.

    I really do appreciate your info. I am working on some training stuff for the other heroes and it will help.

    --Doc
    SmugMug Support Hero
    http://help.smugmug.com