Blogger: Eric Maiwald
Recently, I was directed to a blog entry on 10 fallacies of distributed computing. As I read them, I was reminded that we really do not understand the intricacies of things we take for granted. Think of electric power, for example. We plug things into the wall or flip a light switch and we expect the computer to turn on or the light bulb to light up. We never think about where the power comes from or how it gets to the plate in the wall. All we know is that the light comes on. Our only regular contact with the utility that provides the electricity comes once a month: we get a bill. Of course, with today’s Internet billing, many of us no longer get even a physical paper bill. All we get is an email.
We forget about how the electricity gets to us until something happens and the power goes out. Power outages, especially big ones, make the news headlines. Most recently, a storm named Ike brought a severe outage to the Texas coast. Some friends of mine in the region were without power for days and, given the impact of the storm, I would not be surprised if many are still without electricity. Big storms (and some not so big) cause power outages all the time. We all hope that an outage will be short, and usually it is. But when outages are long, they make the news.
To combat outages, utilities have multiple power generation stations and transmission lines. There are agreements between utilities to share power so that the power grid can remain stable. Customers take precautions as well. Data centers have backup power systems to power their servers and equipment. Individual homes also may have backup generators (if you get your water from a well, you understand why this is important!).
I think that we look at the Internet as another utility. We don’t really know how it works or how the packets get to their destinations but we know that we expect the little green light to come on when we plug the network cable into the wall (or listen to the airwaves for the tell-tale SSID). At a higher level, we want our email to reach the recipients and we want Yahoo! and Google to appear on our screens as soon as we fire up the browser.
Big outages on the Internet make news. Over the years, backhoes have made it into the news by accidentally cutting buried cables. More recently, undersea cable breaks have caused connectivity problems for large portions of the world (some parts were down completely while others suffered reduced bandwidth). Of course, the problem is not always at the physical layer. Issues with routing (intentional or not) or DNS can cause problems just as severe.
Running the networks that make up the Internet costs money. The companies that run networks charge their customers fees to cover the cost and (hopefully) to make a profit as well. Redundant, failover-capable devices and connections cost more than a network without redundancy, so it makes sense that a network company will provide only as much redundancy as it needs to meet its obligations to its customers. Of course, customers want to spend as little money as possible to get the network connectivity they need.
As more and more applications are moved off the endpoints and into the Internet cloud, connectivity becomes more and more important. Do we understand the risks that this poses? Does the enterprise have sufficient connectivity? Is it sufficiently redundant? Will the failover plans suffice for a short outage? What about for a long outage?
A long time ago, I learned of a risk assessment that was done for a large company in Chicago. The company relied heavily on connectivity between its data centers (one in Chicago and one outside the city). To manage the risks of network availability, the company leased connections from two different network providers so that it would have a failover if one provider had issues. An assessment of the connectivity provided showed that the redundancy was partial at best. The reason was that the physical wires travelled the same path for much of the way through Chicago (including being in the same conduit across one bridge). So one traffic accident could take out both lines and leave the company without the needed connectivity.
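The kind of check that assessment performed can be sketched in a few lines of Python. This is only an illustration, assuming you have managed to get each provider to describe its physical route as a list of segments. The segment names, the two routes, and the helper function below are all made up for the example and do not come from the assessment itself.

```python
# A minimal sketch: find the physical segments that two "redundant" links share.
# All route data and names here are hypothetical, for illustration only.

def shared_segments(route_a, route_b):
    """Return the segments that appear in both routes, in route_a's order."""
    in_b = set(route_b)
    return [segment for segment in route_a if segment in in_b]

# Hypothetical routes between the two data centers, one per provider.
provider_a = ["dc-city", "conduit-12", "bridge-east", "conduit-40", "dc-suburb"]
provider_b = ["dc-city", "conduit-17", "bridge-east", "conduit-40", "dc-suburb"]

# Both links necessarily start and end at the data centers, so ignore those.
endpoints = {"dc-city", "dc-suburb"}

single_points_of_failure = [
    segment
    for segment in shared_segments(provider_a, provider_b)
    if segment not in endpoints
]

print(single_points_of_failure)  # ['bridge-east', 'conduit-40']
```

Any segment that shows up in the result is a place where a single accident could take out both links at once. The hard part in practice is not the comparison but getting accurate route information from the providers in the first place.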
If you rely heavily on something (like the Internet), make sure that you understand how that thing is provided to you. Don’t just assume that the network is redundant and that it always “just works.” Talk to your providers. Find out what they are offering and how they manage availability and reliability. Go one step further and learn the physical routing of your actual connections into each provider’s “cloud.” If you truly need the network to be a utility, make sure you have a backup plan in case it turns out to be less reliable than you expect.