11:49 ET - Is SmugMug Down? Outage Update

Comments

  • KCBearcat Registered Users Posts: 164 Major grins
    edited July 21, 2008
    Had I gotten into this thread earlier yesterday while the excrement was still striking the oscillating air circulation device *waits while people translate the euphemism*, I would have replied sooner and with more force, but I'll be the calmer, gentler self since things have calmed down now.

    I don't anymore, but I used to work on the escalations team in a call center for a major telecommunications corporation, where people would call to get their phones fixed (I still work for the company, but in a different job). When we had a major outage, I used to do what Andy did here. I would tell people that we'd get it done as soon as we could, but that I didn't have an estimated time. I did it for the same reason Andy did: I didn't want to give an estimate only to have it get missed. I had that happen ALL the time.

    The worst was after Hurricane Katrina. I would tell people in early September that the time they were giving us was early October and most people understood. Mother Nature had just kicked their ass and they realized it would be a while. In some cases, there were people who didn't get service back until February of '06 because the central office where the switching equipment was had been completely destroyed.

    And yes, we did have the unreasonable people. The one I remember most was from Florida. I didn't get the call, but there was a man who called us from a pay phone in Miami the day Hurricane Wilma went by... and demanded that we get their phones on immediately.

    I'm saying all of this as a way of telling Andy and the SmugMug crew, "You guys are doing just fine. I understand that excrement occurs and you do the best you can when it does." I just got my Pro account Saturday when my free trial expired and I'm convinced I made the right choice.

    To the people who got all bent out of shape because Andy wouldn't tell you exactly when the service would be back up, chill out. If you think you're pissed now, think about how pissed we'd all be if he'd said it would be up at a certain time and it wasn't. And then it happened again. And again. Think about it this way. Would you rather they come out every 15 minutes and say, "Should just be a little longer..." or would you rather they just hunker down and "git-r-done"?

    SmugMug rocks and I hope to be around for a long while. :)
    Alan H.
    http://www.fountaincityphotography.com
    Camera Gear: Canon 400D (XTi), 18-55 f/3.5-5.6, 75-300 f/4.0-5.6, 70-200 f/4 L, 50 f/1.8 II
  • suisekiart Registered Users Posts: 5 Big grins
    edited July 21, 2008
    What KCBearcat said...

    Back before my grey hairs I was a programmer at Visa. The beeper went off when the Visa system responsible for moving the money every night wasn't working. When you are a young programmer sitting there at 2 am, the system in metaphorical pieces all around you, with no idea what the problem is, and the operator has just gotten off the phone with the Fed, who are now willing to hold their window for 10 more minutes... well, you develop some sympathy for folks similarly situated.

    So, all you folks at SmugMug who just spent a very long Sunday NOT with your families and under a lot of stress: I have at least a tiny bit of humor for you. On the last "status updated" blog post, the automated "possibly related posts" is showing "Cow Calf: Calving Stages of Labor". Apparently the item referenced discusses the problems caused when calves come out backwards.

    Hope you all are getting some sleep.
  • Montec Registered Users Posts: 823 Major grins
    edited July 22, 2008
  • darryl Registered Users Posts: 997 Major grins
    edited July 22, 2008
    Stupid question, but did you flush your cache before each test?

    Yes. I used the super-handy "Clear Cache Button" for Firefox:

    https://addons.mozilla.org/en-US/firefox/addon/1801
  • xris Registered Users Posts: 546 Major grins
    edited July 22, 2008
    So That's What Happened...
    Montec wrote:
    Good report. Thanks for the link!
    X www.thepicturetaker.ca
  • mikegodwin Registered Users Posts: 4 Beginner grinner
    edited July 22, 2008
    JoeG wrote:
    Yeah, that crossed my mind, and I don't pretend to know how much business you pros get on weekends, since the internet is a 24/7 business.

    If your car broke down, or your furnace exploded, leaving you unable to go to work on a particular day, are you going to demand a day's pay from Ford, or from your furnace manufacturer?

    As unfortunate as it is, it is the way of the world (internet). For us to demand compensation is ludicrous. I'm not going to demand that God pay me for missing work because he snowed me in last December. *yes, I know that's extreme*

    It's annoying that there are people that have an insane sense of entitlement about things. Yes it sucks, but the pros aren't the only ones with lost revenue. SM lost potential new customers. Amazon likely lost ad revenue/new customers. It's not like they made out like bandits and we (we being those that sell photos, me not included) got the shaft.

    I'm sure you were just playing devil's advocate, but I still think my point stands.
    I was just playing devil's advocate, and since my business is so small I doubt anyone even tried to visit my site during the outage. But think about a large business that pays another business for a service it relies on: when the provider has a problem and can't deliver, there is almost always compensation.
    I used to work with mainframe computers in the early 80s for a large payroll company. Guess who had to pay compensation to customers when payrolls were late or taxes weren't paid on time. We couldn't just tell people that we had computer problems and they had to suck it up because things don't work at 100% all the time.
    I'm not asking for any compensation for the outage; I'm just trying to make a point too.

    Have a day!
    Mike
  • Baldy Registered Users, Super Moderators Posts: 2,853 moderator
    edited July 23, 2008
    Hey everyone,

    One of the worst things about an outage is that the people most important for getting the site back up are the ones we all most want to hear from on the forums. But they had to stay immersed in fixes and prevention.

    I haven't had a chance to de-brief with Don since the outage as he's still chasing some issues and so am I, but I can give some initial thoughts. (I also didn't get a chance to read through this thread.)

    First, it's definitely disturbing to be so dependent on something that can and will go down. We've long sought a good backup to Amazon, but so far they are the best in the storage business. We'll see what Microsoft, Google, Sun, etc., can come up with in the future because we know they want a piece of Amazon's business.

    No excuses: an outage that long isn't acceptable and we need to prevent another, but we've long known how hard 100% uptime is. We've seen MySpace go down for a couple of days, eBay go down despite IBM's redundant systems to back them up, Flickr lose their storage for 6-8 hours even though Yahoo provided it, etc. Up to this point, Amazon's S3 uptime had been amazing.

    Nor can we pin all our systems issues on Amazon. I thought our growth was going to slow last spring due to the economy and I was busy managing SmugMug conservatively. None of us had any idea sales would catch fire as they did starting around March.

    It appears that Flickr and Photobucket are weakening and we're seeing an influx of refugees, but I never saw it coming.

    Nor did we have a clue that SmugShot would be listed as the #1 hot app on Apple's App Store during a week when they sold 1 million new phones, and it would cause SmugMug signups to go insane.

    We'd see part of our network saturate, we'd upgrade it and then see a database server run hot. A lot of it happened the week leading up to Amazon's outage, maybe because the iPhone launch drove so much traffic.

    We found bugs and performance issues as the systems were pressed that we hadn't caught before.

    In any case, we're sorry and embarrassed and focused on performance. We're not spending our time and energy on marketing, but on our infrastructure, bugs, and features.

    Thanks,
    Baldy
  • Dogdots Registered Users Posts: 8,795 Major grins
    edited July 23, 2008
    Thank you for posting this information.

    It's great when you keep us informed, no matter what type of information you may need to pass along.
  • xris Registered Users Posts: 546 Major grins
    edited July 23, 2008
    Nice to know, Baldy. Thanks for taking time. It's appreciated.
    X www.thepicturetaker.ca
  • nairb774 Registered Users Posts: 19 Big grins
    edited July 27, 2008
    Amazon's post-mortem...
    Worth a read: http://status.aws.amazon.com/s3-20080720.html
  • Dogdots Registered Users Posts: 8,795 Major grins
    edited July 27, 2008
    nairb774 wrote:

    Thanks for posting this. It was a very interesting read.
  • flyingdutchie Registered Users Posts: 1,286 Major grins
    edited August 8, 2008
    'Cloud' computing has its problems...
    Google had a major issue as well with its cloud-computing system:
    http://www.eweek.com/c/a/Messaging-and-Collaboration/Google-Gmail-Google-Apps-Suffer-Outage-in-The-Cloud/?kc=EWKNLEDP08082008D
    I can't grasp the notion of time.

    When I hear the earth will melt into the sun,
    in two billion years,
    all I can think is:
        "Will that be on a Monday?"
    ==========================
    http://www.streetsofboston.com
    http://blog.antonspaans.com