Had I gotten into this thread earlier yesterday while the excrement was still striking the oscillating air circulation device *waits while people translate the euphemism*, I would have replied sooner and with more force, but I'll be the calmer, gentler self since things have calmed down now.
I don't anymore, but I used to work on the escalations team in a call center for a major telecommunications corp. where people would call to get their phones fixed (still work for the company, but do a diff. job). When we would have a major outage, I used to do what Andy did here. I would tell people that we'd get it done as soon as we could but I didn't have an estimated time. I did it for the same reason Andy did. I didn't want to give an estimate only to have it get missed. I had that happen ALL the time.
The worst was after Hurricane Katrina. I would tell people in early September that the time they were giving us was early October and most people understood. Mother Nature had just kicked their ass and they realized it would be a while. In some cases, there were people who didn't get service back until February of '06 because the central office where the switching equipment was had been completely destroyed.
And yes, we did have the unreasonable people. The one I remember most was from Florida. I didn't get the call, but there was a man who called us from a pay phone in Miami the day Hurricane Wilma went by... and demanded that we get their phones on immediately.
I'm saying all of this as a way of telling Andy and the SmugMug crew, "You guys are doing just fine. I understand that excrement occurs and you do the best you can when it does." I just got my Pro account Saturday when my free trial expired and I'm convinced I made the right choice.
To the people who got all bent out of shape because Andy wouldn't tell you exactly when the service would be back up, chill out. If you think you're pissed now, think about how pissed we'd all be if he'd said it would be up at a certain time and it wasn't. And then it happened again. And again. Think about it this way. Would you rather they come out every 15 minutes and say, "Should just be a little longer..." or would you rather they just hunker down and "git-r-done"?
SmugMug rocks and I hope to be around for a long while.
Yeah, that crossed my mind, and I don't pretend to know how much business you pros get on weekends, since the internet is a 24/7 business.
If your car broke down, or your furnace exploded, leaving you unable to go to work on a particular day, are you going to demand a day's pay from Ford, or from your furnace manufacturer?
As unfortunate as it is, it is the way of the world (internet). For us to demand compensation is ludicrous. I'm not going to demand that God pay me for missing work because he snowed me in last December. *yes, I know that's extreme*
It's annoying that there are people that have an insane sense of entitlement about things. Yes it sucks, but the pros aren't the only ones with lost revenue. SM lost potential new customers. Amazon likely lost ad revenue/new customers. It's not like they made out like bandits and we (we being those that sell photos, me not included) got the shaft.
I'm sure you were just playing devil's advocate, but I still think my point stands.
I was just playing devil's advocate, and since my business is so small I doubt anyone even tried to visit my site during the outage. But think about a large business that pays another business for a service it relies on: when the second one has a problem and can't deliver, there is almost always compensation.
I used to work with mainframe computers in the early 80s for a large payroll company. Guess who had to pay compensation to customers when payrolls were late or taxes weren't paid on time. We couldn't just tell people that we had computer problems and they had to just suck it up because things don't work at 100% all the time.
I'm not asking for any compensation for the outage, I'm just also trying to make a point.
Have a day!
Mike
Baldy | Registered Users, Super Moderators | Posts: 2,853 | moderator
edited July 23, 2008
Hey everyone,
One of the worst things about an outage is that the people most important for getting the site back up are the ones we all most want to hear from on the forums. But they had to stay immersed in fixes and prevention.
I haven't had a chance to debrief with Don since the outage, as he's still chasing some issues and so am I, but I can give some initial thoughts. (I also didn't get a chance to read through this thread.)
First, it's definitely disturbing to be so dependent on something that can and will go down. We've long sought a good backup to Amazon, but so far they are the best in the storage business. We'll see what Microsoft, Google, Sun, etc., can come up with in the future because we know they want a piece of Amazon's business.
No excuses: an outage that long isn't acceptable and we need to prevent another, but we've long known how hard 100% uptime is. We've seen MySpace go down for a couple of days, eBay go down despite IBM's redundant systems to back them up, Flickr lose their storage for 6-8 hours even though Yahoo provided it, etc. Up to this point, Amazon's S3 uptime had been amazing.
Nor can we pin all our systems issues on Amazon. I thought our growth was going to slow last spring due to the economy and I was busy managing SmugMug conservatively. None of us had any idea sales would catch fire as they did starting around March.
It appears that Flickr and Photobucket are weakening and we're seeing an influx of refugees, but I never saw it coming.
Nor did we have a clue that SmugShot would be listed as the #1 hot app on Apple's App Store during a week when they sold 1 million new phones, and it would cause SmugMug signups to go insane.
We'd see part of our network saturate, we'd upgrade it and then see a database server run hot. A lot of it happened the week leading up to Amazon's outage, maybe because the iPhone launch drove so much traffic.
We found bugs and performance issues as the systems were pressed that we hadn't caught before.
In any case, we're sorry and embarrassed and focused on performance. We're not spending our time and energy on marketing, but on our infrastructure, bugs, and features.
Comments
http://www.fountaincityphotography.com
Camera Gear: Canon 400D (XTi), 18-55 f/3.5-5.6, 75-300 f/4.0-5.6, 70-200 f/4 L, 50 f/1.8 II
Back before my grey hairs I was a programmer at Visa. The beeper went off when the Visa system responsible for moving the money every night wasn't working. When you are a young programmer sitting there at 2 am, the system in metaphorical pieces all around you, and no idea what the problem is, and the operator has just gotten off the phone with the FED who are now willing to hold their window for 10 more minutes... well, you develop some sympathy for folks similarly situated.
So - all you folks at SmugMug who just spent a very long Sunday NOT with your families and under a lot of stress, I have at least a tiny bit of humor for you. On the last "status updated" blog post, the automated "possibly related posts" is showing "Cow Calf: Calving Stages of Labor". Apparently the item referenced discusses the problems caused when calves come out backwards.
Hope you all are getting some sleep.
http://news.cnet.com/8301-1001_3-9995937-92.html?part=rss&subj=news&tag=2547-1_3-0-5
Monte
Yes. I used the super-handy "Clear Cache Button" for Firefox:
https://addons.mozilla.org/en-US/firefox/addon/1801
Good report. Thanks for the link!
Thanks,
Baldy
It's great when you keep us informed... no matter what type of information you may need to pass along.
www.Dogdotsphotography.com
Worth a read: http://status.aws.amazon.com/s3-20080720.html
Thanks for posting this... it was a very interesting read.
www.Dogdotsphotography.com
Google had a major issue as well with its cloud-computing system:
http://www.eweek.com/c/a/Messaging-and-Collaboration/Google-Gmail-Google-Apps-Suffer-Outage-in-The-Cloud/?kc=EWKNLEDP08082008D
When I hear the earth will melt into the sun,
in two billion years,
all I can think is:
"Will that be on a Monday?"
==========================
http://www.streetsofboston.com
http://blog.antonspaans.com