Blogs

Why the City of Seattle’s Data Center Outage Last Weekend Matters

There are a lot of things that the City of Seattle did right in their management of last weekend’s data center outage to fix a faulty electrical bus bar. They kept their customers – the public – well informed about the outage schedule, the applications affected, and the applications that were still online. Their critical facilities team completed the maintenance earlier than expected, and they kept their critical applications (911 services, traffic signals, etc.) online throughout the maintenance period.

Seattle’s mayor, Mike McGinn, acknowledged in a press conference on the outage last week that the city’s current data center facility “is not the most reliable way for us to support the city’s operations.” Are you looking for a data center provider – ideally one that will never force you to go on record with a statement like that? If so, here are a few takeaways:

A failure to plan is a plan to fail. While the city of Seattle planned to keep their emergency and safety services online, had they truly planned for the worst? I’m sure they had a backup plan if the maintenance took a turn for the worse, but did they consider the following: what if a second equipment fault occurs? Traditionally, the “uptime” of an application is defined as the amount of time that a live power feed is provided to the equipment running that application. I would offer a new definition of “uptime” for mission-critical applications: the time during which both a live power feed and an online, ready-to-failover redundant source of power are available to ensure zero interruptions. “Maintenance window” shouldn’t be part of your mission-critical vocabulary. Which brings me to my next point . . .

Concurrent maintainability and infrastructure redundancy are key. I will go one step further – concurrent maintainability AND fault tolerance are key factors in keeping your IT applications online. The requirement to perform maintenance and withstand an equipment fault at the same time isn’t paranoia – it’s sound planning. Besides, a little paranoia is a good thing when we’re talking about applications like 911 services, payment processing, or other business-critical applications.
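To put rough numbers behind that point, here is a minimal back-of-the-envelope sketch in Python. The 99.9% per-path availability and the assumption that the two power paths fail independently are illustrative assumptions only, not figures from Seattle or any particular facility.

```python
# Back-of-the-envelope availability math (illustrative assumptions only).
HOURS_PER_YEAR = 8760

def exposure_hours(availability: float) -> float:
    """Expected hours per year without a live, protected power path."""
    return (1 - availability) * HOURS_PER_YEAR

single_path = 0.999                    # one feed: every maintenance window is downtime
redundant_2n = 1 - (1 - 0.999) ** 2    # two independent paths: both must fail at once

print(f"Single feed:   {exposure_hours(single_path):.1f} hours/year of exposure")
print(f"2N redundancy: {exposure_hours(redundant_2n):.3f} hours/year of exposure")
# Single feed:   8.8 hours/year of exposure
# 2N redundancy: 0.009 hours/year of exposure
```

Even with generous assumptions, a single power path concedes hours of exposure every year, which is exactly why maintenance on that path means taking applications offline.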

Location. Location. Location. The city of Seattle’s data center is located on the 26th story of an office tower in downtown Seattle. The fact that they had to take down multiple applications in order to perform this maintenance implies that the electrical feed redundancy to their data center is somewhat limited. There are many competing factors in choosing a data center location: electrical feed redundancy, connectivity options, and natural hazard risk profile, to name a few. For mission-critical applications, your location choice has to center on the factors that will keep your systems online 100% of the time.

Flexibility and scalability give your IT department room to breathe. The city of Seattle leased their single-floor data center space before the information economy really took hold. As a result, their solution is relatively inflexible when it comes to the allowable power density of their equipment. They’re quickly outgrowing their space and are already looking for an alternative. Look for a data center provider that plans for rapid increases in rack power draw – do they already have a plan for future cooling capacity? How much power has the facility contracted from the local utility?

Why Send Your Staff to the Data Center to Rack Just One Box?

It’s 2 a.m. Do you know who is working on your servers?

This week RagingWire announced our new Unlimited Remote Hands and Eyes service, and I couldn’t be happier. Prior to joining RagingWire, I was a RagingWire customer for over 10 years, in addition to working with most of the other major Northern California data centers. One of my pet peeves was always the remote hands offerings. They stunk… I could never properly budget for them, and I never received consistent service delivery. At most data centers, including RagingWire, remote hands was a time and materials (T&M) service – every time I called, the clock started ticking. At least RagingWire staffed skilled IT workers 24x7. Too often, the other data centers used security guards to provide their remote hands service.

I believe the combination of T&M billing and inconsistent service causes people to make bad decisions. A hard drive fails and a decision is made to roll the dice and wait until you can send someone to the data center to replace it. Or even worse, companies let the proximity of a facility become a primary criterion in their data center selection, because they know that occasionally someone needs to touch their equipment and they don’t trust the guys on the other end of the phone.

Help is here. RagingWire’s new service helps address these problems, and more. What we’ve done is create a service that allows for an unlimited number of Remote Hands and Eyes support requests for a fixed monthly fee – with a guaranteed response SLA. We fulfill this service with skilled technicians in our California and Virginia NOCs, which are staffed 24x7, not with security guards (I have nothing against security guards; however, they should be providing security services, not IT services). Because our service is unlimited, you don’t need to wait until the morning to get someone to move a cable, cycle a server, or swap a tape. Additionally, the clock doesn’t start ticking when you call – your fee is fixed, which makes it easy for budgeting and planning.

So what else is covered by this service? Visual equipment checks, loading media, and incremental changes such as adding a new server or switch – plus more. Why send your staff to the data center for half a day when all you need is one box racked? Ship it to us with instructions. We’ll rack it, cable it, document it, and let you know when it’s ready for use. This really is a great new service and it’s priced low enough to make the ROI compelling for customers of every size.

If you’d like more information, talk to your account rep. I hear we’re giving away a free iPad to one lucky customer who inquires about this service. That could be you!

A New Era in Data Centers Begins in Ashburn, Virginia!

Why would 400 business and government leaders, including 65 CEOs/presidents and 20 elected officials, come to a data center opening? Because over the last decade, data centers have become the critical links in the internet and information technology supply chain. And on July 31 in Ashburn, Virginia, RagingWire opened the doors on a new era of data centers. 

This 150,000 square foot facility is unique in that it is designed for both form and function. Functionally, it is everything you would expect from an ultra-high-availability data center – high-density power, direct fiber connectivity, advanced power and cooling systems, multiple layers of infrastructure redundancy, end-to-end monitoring, automated failover, defense-in-depth security, and more. In addition, the Ashburn data center introduces a form factor and aesthetic that has never been seen before – it’s a comfortable work environment for IT professionals.

Most systems and application work – testing, upgrades, maintenance – occurs in off-hours so as not to impact ongoing users and operations. That means IT professionals are working evening and weekend shifts at the data center. In RagingWire’s new Ashburn data center campus, we provide customer-dedicated conference rooms, flexible office space, drop-in workstations, complimentary coffee and soft drinks, upscale vending machines, and two client lounges so that you can feel energized and refreshed at work. We even have locker rooms and showers, and the facility is next to the 45-mile-long Washington & Old Dominion Trail.

What was the ribbon cutting reception like? If you were there, you experienced the excitement. If you missed it -- I’m sorry! 

Gov. McDonnell cut the ribbon on RagingWire's Virginia Data Center

The Publisher of the Washington Business Journal, Alex Orfinger, was a wonderful Master of Ceremonies. Virginia Governor Bob McDonnell cut the ribbon along with RagingWire CEO George Macricostas and Vice Chairman Deno Macricostas. In his speech, the Governor recognized RagingWire’s technical innovation and business leadership. There was an exceptional stage show from Cirque Vertigo. We featured the very best local Virginia food and wine.  It was a wonderful evening!

One of the attendees sent us an email after the event saying, “RagingWire has built the sexiest and greenest data center in the industry!”  We agree!

We posted the Governor’s speech on our website.

And we got some good press coverage.

RagingWire is thrilled to be part of the high tech community in Virginia and Loudoun County. A new era in data centers is beginning in Ashburn, VA!

Three More Data Center Myths

SLAs are an "important" predictor of future performance: This is one of the secrets of the entire IT services industry. People hate it when I say this, but it’s true: SLAs are all a scam. Enterprise decision makers especially hate it, because they predicate their reliability decisions on SLAs. Here’s an experiment – ask your provider if they’ll give you a better SLA. They will agree – trust me. Then, ask them what they changed to make that better SLA happen. Guess what? They didn’t change anything.

Don’t despair, however – there is a solution. The best predictor of future performance is not an SLA, it’s PAST PERFORMANCE. In other words, ask for the last five years of data center and network downtime, in minutes, across all facilities. If they say that they don’t record the data, run away – fast.
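If they do hand over the numbers, a quick sketch like the one below turns reported downtime minutes into a measured availability you can hold up next to the paper SLA. The 37 minutes over five years is a made-up placeholder, not any provider’s actual record.

```python
# Convert reported downtime history into measured availability, and show
# what a paper SLA actually permits. The 37-minute figure is a placeholder.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def measured_availability(downtime_minutes: float, years: float) -> float:
    return 1 - downtime_minutes / (MINUTES_PER_YEAR * years)

print(f"Measured: {measured_availability(37, 5):.5%}")  # 99.99859%

for sla in (0.999, 0.9999, 0.99999):
    allowed = (1 - sla) * MINUTES_PER_YEAR
    print(f"A {sla:.3%} SLA still allows {allowed:.1f} minutes of downtime per year")
```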

Design is the "most" significant determiner of reliability: Don’t get me wrong – design is important. However, there are very few data centers with innovative designs, outside of increased efficiency. The move towards PODs, pre-fab, and containerization has undermined efforts towards innovative designs. At this point, there are exactly two providers with an innovative design focused on reliability.

So, what is the most significant determiner of data center reliability, right now? The data center’s critical facilities engineering and operations groups. Wait, your data center DOES have one of those, doesn’t it? You shouldn’t assume – these functions are often outsourced, and often poorly. Assuming your provider performs these functions in house, look for the following yardsticks: degreed engineers in management positions, master electricians, Navy nuclear-trained technicians, and technicians who have worked for the major vendors (GE, Schneider, Emerson, York, etc.) used in that facility. Also, ask for preventative maintenance records – if they can’t produce their PM history and schedule, run away.

"Multiple Power Grids" - Some data centers brag that they are on multiple power grids. The problem is that it is never true. In the US, there are only three power grids - East, West, and Texas. Unless your datacenter is in lovely downtown Texarkana, its on one grid. Sometimes, operators mean that they have feeds from multiple substations, but the same grid. This can be factually accurate, but is rarely effective - 90% of the outages that would bring down one substation would bring down an adjacent substation. Data centers must always assume they will lose utility power and operate accordingly. There is a certain level of hardening that is appropriate for connectivity to your power utility, but additional layers of connectivity beyond that point rarely bring additional uptime or reliability. Look for underground conduits and use of primary transmission voltage (34.5 or 69 kVA).

Data Center Site Selection – Low Risk vs. Close Proximity

Location, Location, Location

When choosing a data center, location counts. Location is often the first criterion discussed when evaluating a new data center partner. But too often, the driver behind location is proximity. IT professionals like to be close to their servers. As much as we like to talk about operating a lights-out facility, there is always the comfort factor of being able to drive to the data center if there’s a problem.

Instead of proximity, risk should be the determining factor in choosing a data center – the risk of natural and manmade disasters that could affect the availability or performance of the systems you are housing. Intuitively we all understand this, but too often we blind ourselves to the risks that are right in front of us.

For example, most people would consider earthquakes to be the number one natural disaster risk in California. The probability of another Loma Prieta earthquake (6.9 on the Richter scale) occurring in the Bay Area in the next 50 years is between 50% and 80%, depending on which city you are looking at. Loma Prieta took down part of the Bay Bridge and the freeway system, and was a huge deal back in 1989. But compare a USGS map of earthquake probabilities against a map of data centers in the Bay Area.

Data Center Selection

Most of the data center space in the Bay Area is in the worst place possible from an earthquake perspective. Not only is the probability of another large quake high, but the land the data centers sit on is subject to liquefaction. Basically, the sand underneath the buildings acts like a fluid during a quake. Even if the data center itself survives, will the surrounding electrical grid, water supply, roads, fuel vendors, etc. continue to function?

Earthquakes are just one of the many risks that data centers must deal with. When choosing a data center, looking at all the risks associated with the facility is key. The Uptime Institute has published a helpful guide to the natural disaster risk profiles for data center locations. A detailed discussion of the variety of risk factors data centers face is available at datacenterlinks.com. RagingWire’s facilities in Sacramento, California and Ashburn, Virginia are both in locations with low composite risk scores according to the Uptime Institute. For example, while the probability of a Loma Prieta-sized quake hitting Bay Area data centers is higher than 50%, the probability of a quake of that size hitting RagingWire in Sacramento is less than 0.03%.
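Using just the two probabilities quoted above, a one-line comparison makes the gap concrete (rough arithmetic on the quoted figures, nothing more):

```python
# Relative earthquake exposure, using only the probabilities quoted above.
bay_area_prob   = 0.50    # at least a 50% chance of a Loma Prieta-sized quake in 50 years
sacramento_prob = 0.0003  # less than a 0.03% chance of a quake that size hitting Sacramento

print(f"Relative exposure: roughly {bay_area_prob / sacramento_prob:,.0f}x")  # roughly 1,667x
```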

Risk, not proximity, should be the driving factor in your data center location criteria. RagingWire has a variety of innovative services, like our unlimited remote hands service, that can lessen the need for proximity to be a determining factor. RagingWire lets you choose the best data center, not just the best data center within driving distance of your office.

Three Data Center Myths

"Raised Floor vs Slab Floor" - This is a religious dispute, masquerading as a serious engineering issue. There are two sorts of folks who obsess over this point - design engineers with a very narrow worldview, and marketing executives with overactive imaginations and a casual relationship with the facts. Raised floor works great. Its more flexible for operators. Slab works great, too - it has other advantages, including reduced cost, and no heat transfer media running under the data center floor. You can build a well designed data center, with either floor configuration.

VESDA Fire Detection is "Superior" - This is taken as revealed writ by many folks, because there is one obvious fact: VESDA does an incredible job of detecting fires earlier than any other technology. However, data centers very rarely catch fire, and there are many things that will generate false positives. The most important aspect of data center fire safety is reducing false positives and unintended fire suppression, both of which result in Bad Things. VESDA works, if you are careful and can minimize false positives. Laser smoke detection works well too, but you also have to carefully limit anything that could cause a false positive.

Dual-action dry-pipe fire suppression, however, is the most important fire safety technology you can have in your data center. It should be really hard to douse your servers with water. Make sure you have thermal imagers for finding hot spots on wiring and a great relationship with your local fire department.

Physical Security "Doesn’t" Matter: Customers spend a lot of effort trying to choose a data center, but sometimes don’t carefully study security issues. That’s a huge mistake. Data centers have been, and are, actively targeted for theft. Certain data centers (by no means all) are very real terrorism targets. Three-factor security with double mantraps (for both data center and raised-floor entry) is absolutely necessary. Also, look at the security staff – can you socially engineer them? At least one major data center theft occurred due to successful social engineering.
