Infrastructure

Why the City of Seattle’s Data Center Outage Last Weekend Matters

There are a lot of things that the City of Seattle did right in their management of last weekend’s data center outage to fix a faulty electrical bus bar. They kept their customers – the public – well informed about the outage schedule, the applications affected, and the applications that were still online. Their critical facilities team completed the maintenance earlier than expected, and they kept their critical applications online (911 services, traffic signals, etc.) throughout the maintenance period.

Seattle’s mayor, Mike McGinn, acknowledged in a press conference last week on the outage that the city’s current data center facility “is not the most reliable way for us to support the city’s operations.” Are you looking for a data center provider, especially one where you’ll never have to go on record with that statement? If so, here are a few takeaways:

A failure to plan is a plan to fail. While the city of Seattle planned to keep their emergency and safety services online, had they truly planned for the worst? I’m sure they had a backup plan in case the maintenance took a turn for the worse, but did they consider what would happen if a second equipment fault occurred? Traditionally, the “uptime” of an application is defined as the amount of time that a live power feed is provided to the equipment running that application. I would offer a new definition of “uptime” for mission critical applications: the time during which both a live power feed and an online, ready-to-failover redundant source of power are available to ensure zero interruptions. “Maintenance window” shouldn’t be part of your mission critical vocabulary. Which brings me to my next point . . .

Concurrent maintainability and infrastructure redundancy are key. I will go one step further – concurrent maintainability AND fault tolerance are key factors in keeping your IT applications online. The requirement to perform maintenance and sustain an equipment fault at the same time isn’t paranoia – it’s sound planning. Besides, a little paranoia is a good thing when we’re talking about 911 services, payment processing, or other business-critical applications.

Location. Location. Location. The city of Seattle’s data center is located on the 26th floor of a high-rise office building in downtown Seattle. The fact that they had to take down multiple applications in order to perform this maintenance implies that the electrical feed redundancy to their data center is somewhat limited. There are many competing factors in choosing a data center location: electrical feed, connectivity options, and natural hazard risk profile, to name a few. For mission critical applications, your location choice has to center on factors that will keep your systems online 100% of the time.

Flexibility and scalability give your IT department room to breathe. The city of Seattle leased their single-floor data center space before the information economy really took hold. As a result, their solution is relatively inflexible when it comes to the allowable power density of their equipment. They’re quickly outgrowing their space and are already looking for an alternate solution. Look for a data center provider that plans for rapid increases in rack power draw – do they already have a plan for future cooling capacity? How much power has the facility contracted from the local utility?

Technology is great, but it’s all about the people.

Often, when we take potential customers through our data centers and show them our patented technology, they remark on what incredible technology we have designed and implemented. My first response is always this: it is a result of the people we hire to design, build, and operate our data centers. My two priorities in anything we do are availability of the customer application and outstanding customer service. These are enabled by technology, but driven by people. As demonstrated by numerous studies in the data center industry and from my previous life in the nuclear industry, people remain the leading cause of downtime in data centers (more on that in follow-on posts).

First, hire the right people and then give them the tools to succeed. One of the best things RagingWire has done is give our employees and our clients a clear definition of our data center design: "Fix one, break one, concurrent with a utility outage." In other words, we are designed for concurrent maintainability and fault tolerance during a utility outage, whether power or water. This philosophy resonates through RagingWire's design, construction, and operations groups, and even in our concurrent engineering sessions with our clients. The philosophy is driven by the people we have at RagingWire.

Many people in the industry have tried to treat the data center as commoditized real estate. It is unequivocally not real estate; it is a product which, at the end of the day, delivers availability of a service and an application to our customers. When providers commoditize data centers and treat them as real estate, they lose focus on availability and product delivery, and they outsource design, construction, and operations, driving down service and quality. The data center product we provide, and the availability of that service, is not a commodity that can easily be whitewashed between providers. There is an amazing amount of technology and innovation being put into our data centers, and the product is backed by incredible people dedicated to its availability and uptime.

RagingWire has made a conscious decision to hire and in-source the life cycle of the data center. We design what we build, we build what we design, and we operate what we design and build. And we provide these resources to our customers to ensure that when they build out, their IT environment is as hardened and redundant as possible and that their hardware, network, and application-level architecture is designed in conjunction with our data center design. The people, enabled by the technology, are the cornerstone of how we accomplish this with our clients and provide 100% availability of their applications and services.

Whenever we search for potential technology vendors, RagingWire always interviews the provider’s team and makes an evaluation of the people behind the product. You can take the greatest technology in the world, place it in the wrong hands and end up with a product that no one wants. Similarly, the right people can make all of the difference, especially when given incredible technology and tools.

The next time you visit your data center provider, evaluate the technology, how they do business, and their availability record. Just as important, evaluate who is behind the product – the people who are ultimately going to ensure your critical application availability.

The Power is Out – For 24,000 Others

On Thursday, September 12, a power substation in Sacramento was ‘attacked’ by a mylar balloon, sending 24,000 homes and businesses into darkness. Who knows where the balloon came from or how long it had been floating in the sky before gracefully landing on a substation and bringing most area businesses to a grinding halt. Most, except RagingWire.

I was in another RagingWire office building two blocks from our Sacramento data center campus when the utility power disruption occurred. Thankfully, with emergency lighting and an excellent iPhone flashlight app, the darkness was minimal. Upon walking to the parking lot in front of my office, I could hear the roar of RagingWire’s diesel generators across the business park. I wanted to experience operations at California’s largest multi-tenant data center during a utility power disruption, so I walked over to RagingWire’s data center facility. I didn’t know what to expect. People running around?  Worry? Stress?

I approached the building from the back, heading straight toward one of three massive generator rooms. I had not seen the generators ‘on’ before, even though we run them every month for testing. Needless to say, it was loud, very loud, but the roar coming from the large Cummins engines was reassuring.

I entered the data center and walked throughout the facility – nothing was different. The hum of servers in customer cages could still be heard over the roar of the generators coming from the back of the building. I wondered if the customers working in their suites even knew the ‘outside world’ was without power, or if they even cared. It was calm on the data floor, calm in the power rooms, and calm in the NOC – a true testament to the architecture of this facility.

Coming a day after the 11th anniversary of the 9/11 attacks, the timing was a reminder that we tend to think about the ‘big events’ that turn the power off or cause great damage to an area’s infrastructure. But no matter how big or small the event, losing power to your company’s IT infrastructure can be devastating. Today it was a $6 mylar balloon that put a lot of businesses in the dark – except those who rely on RagingWire for their data center needs.

The Power of N

If you have been in or around data centers over the last 10 years, you have experienced the power of N. This single letter drives the architectural standards and design philosophies of the entire data center industry. There are a lot of N’s in the data center industry: N, 2N, N+1, N+2, and 2(N+1).

Now RagingWire is introducing a new N called 2N+2. Why are we doing this? Well, the other N’s didn’t measure up to the task of describing our patented critical infrastructure architecture.

What is N?
N is the amount of something you need in order to deliver a service or load. For an IT shop, N could be the number of servers you need to deliver a defined processing capacity. In a data center, N could be the number of UPSs (uninterruptible power supplies), generators, or MSBs (main switchboards) you need to deliver a power load. Of course, in an N configuration you need to hold constant the capacity of each element that makes up the N.

With N as your base, the next step is to identify the number of spare devices and complete backup units in your configuration. For example, let’s say you need 10 servers to run a cloud application. If you have a total of 14 interconnected servers with 10 production devices and four spare units, then you have an N+4 design. If you have two independent configurations of 10 servers each that can back each other up, then you have a 2N design.
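If it helps to see the arithmetic spelled out, here is a minimal Python sketch, purely our illustration, that labels a single pool of identical devices in N+x terms and checks the 2N case, using the server example above:

```python
def describe_redundancy(required: int, total: int) -> str:
    """Label a single pool of identical devices in N+x terms.

    required -- devices needed to carry the full load (the "N")
    total    -- devices actually installed in the pool
    """
    if total < required:
        return "under-provisioned (less than N)"
    spares = total - required
    return "N" if spares == 0 else f"N+{spares}"

# The cloud-application example from the text: 10 needed, 14 installed.
print(describe_redundancy(required=10, total=14))   # -> N+4

# 2N is a different shape: two independent pools, each able to
# carry the full load on its own.
pool_a, pool_b, required = 10, 10, 10
is_2n = pool_a >= required and pool_b >= required
print("2N" if is_2n else "not 2N")                  # -> 2N
```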

N is a useful approach for describing the world of physical devices needed to deliver a certain capacity. The challenge is that information technology and data centers are becoming increasingly virtualized, with pools of capacity that are available and dynamically configurable. Devices still matter, but so do continuous monitoring and dynamic management of the capacity those devices deliver.

2N+2 Delivers 100% Availability
RagingWire’s patented 2N+2 design describes the physical devices, virtualized capacity, and PLCs (programmable logic controls) that enable us to deliver a data center with 100% availability even during maintenance periods and a utility outage.

We call the PLCs and integrated data center infrastructure management (DCIM) system N-Matrix™. With N-Matrix, we can combine our 2N power paths and N+2 critical infrastructure to deliver a 2N+2 data center – the most reliable data center design in the world.

When N+1 just isn’t good enough

2006 was a pivotal year for RagingWire. 2006 was the year RagingWire learned that for data centers, N+1 just isn't good enough. 2006 was the year RagingWire went dark. It started normally enough – a beautiful spring day in April. During normal operations, a 4,000-amp breaker failed. Material failures happen, even with the best maintenance programs in place. Our UPSs took the load while the generators started – then the generators overloaded. The data center went dark.

After bringing the data center back online, we performed a detailed post-mortem review and identified the root causes of the outage to be design flaws and human error. Our management team declared that this could never, ever happen again. We knew that we needed to invest heavily in our people, and that we needed to rethink how data centers operate. We started by investing in our people, because human error can overwhelm even the best infrastructure designs. We focused our recruitment efforts on the nuclear energy industry and the Navy nuclear engineering program – both working environments where downtime is not an option and process control, including operations and maintenance, is second nature. We hired a talented team and asked them to design and operate our data center to run like a nuclear sub.

Our revamped team of engineers determined that the then-current N+1 design did not meet their requirements, so they changed it and implemented the concept of a 2N+2 design. Their work was recognized last week when RagingWire announced the issuance of Patent #8,212,401 for “redundant isolation and bypass of critical power equipment.” This is one of two patents that resulted from RagingWire’s 2006 outage and our efforts to design a system that would never go down again.

RagingWire’s systems are built to a 2N+2 standard. RagingWire exceeds the Uptime Institute Tier IV standard by providing fault tolerance during maintenance. We call this “fix one, break one” or FOBO. This means that any active component – UPS, generator, chiller, pump, fan, switchboard, etc. – can be removed from service for maintenance, any other active component can fail, AND we can experience a utility outage, all without loss of power or cooling to the server rack. Having this extra level of redundancy allows RagingWire to perform more maintenance, and to do so without worrying about a loss in availability. This enables us to provide a 100% uptime SLA, even during maintenance windows.
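To make the FOBO test a little more concrete, here is a rough, back-of-the-envelope sketch in Python. It is our illustration only, not RagingWire’s actual control logic or patented design, and the generator counts, unit sizes, and load figures are hypothetical:

```python
def survives_fobo(units: int, unit_capacity_kw: float, load_kw: float) -> bool:
    """Fix-one, break-one check for a pool of identical devices.

    One unit is out for maintenance and another has faulted, so the
    pool must carry the full load with (units - 2) devices. A utility
    outage is implied: the pool is assumed to be the only source.
    """
    remaining = max(units - 2, 0)
    return remaining * unit_capacity_kw >= load_kw

# Hypothetical numbers: six 2,000 kW generators feeding an
# 8,000 kW critical load form an N+2 pool (N = 4).
print(survives_fobo(units=6, unit_capacity_kw=2000, load_kw=8000))  # True
print(survives_fobo(units=5, unit_capacity_kw=2000, load_kw=8000))  # False (only N+1)
```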

Looking at the last year and a half, it’s clear that many data centers are still providing their customers with an inferior N+1 design. How do you know? Simply look at the number of providers below who have suffered data center outages over the past 18 months. Since 2006, RagingWire has had 100% availability of its power and cooling infrastructure due to its superior 2N+2 design. If your current provider is still offering N+1, maybe it’s time to ask yourself whether N+1 is still good enough for you.
 

October 22, 2012 – Amazon Web Services suffered an outage in one of its data centers that took down multiple customers in its US-East-1 region. The problem was attributed to a “small number” of storage volumes that were degraded or failed.

October 8, 2012 – A cable cut took down Alaska Airlines’ ticketing and reservation system, causing delays across the airline’s operations and preventing customers from checking in for flights.

August 7, 2012 – A fiber cut took nonprofit Wikipedia offline for an hour.

July 28, 2012 – Hosting.com powered off 1,100 customers due to human error during preventive maintenance on a UPS in their Newark, DE data center.

July 10, 2012 – Level3’s East London data center went offline for 5 hours after a UPS bus bar failed.

July 10, 2012 – Salesforce.com suffered a worldwide outage after a power failure in one of Equinix’s Silicon Valley data centers.

June 29, 2012 – Amazon Web Services suffered a power outage in its Northern Virginia data center. Multiple generators failed to start automatically due to synchronization issues and had to be started manually.

June 14, 2012 – Amazon Web Services suffered a power outage in its Northern Virginia data center. The problem was blamed on a defective generator cooling fan and a mis-configured power breaker.

June 13, 2012 – US Airways had a nationwide disruption of their computer systems, affecting reservations, check-in, and flight status, due to a power outage at their AT&T data center in Phoenix.

January 20, 2012 – A power failure in the Equinix SV4 data center took several customers, including Zoho, offline.

October 10, 2011 – Research In Motion cut BlackBerry service to most of Europe for 6 hours due to a power failure in their Slough, UK data center. The outage caused service disruptions worldwide for 3 days.

August 11, 2011 – An automatic transfer switch failed at Colo4 in Dallas, TX, resulting in a 6-hour power outage.

August 7, 2011 – Amazon Web Services’ Dublin, Ireland data center lost power due to a generator phase-synchronization error, disrupting service to the EU West region.

Data Center Emergency Preparedness: When too much is not enough.

One of the most important aspects of data center operations is risk management and mitigation. Data center operators typically operate with a proactive mentality so they can react properly to any given situation, which ultimately reduces the facility’s risk exposure. Training, preventive maintenance, and regular system and equipment testing become second nature, as these facilities are expected to (and, for the most part, do) operate seamlessly 24x7x365.25 days a year; however, it’s the once-in-a-while event that tests the true resiliency of the facility and pushes the operations staff to their limits.

An acute level of attention to detail and complete ownership of the facility are common characteristics demonstrated by our operations staff. The team works tirelessly to ensure that they are ready for any given scenario. Emergency preparedness checklists are created, inventories are taken, and procedures are written for the most common scenarios and for the events most likely to occur within the facility; however, we often find that even with all of our efforts dedicated to preparedness, it’s not enough.

Recent events in the Northeast are bringing to light scenarios that an operations team may not be prepared to handle while still ensuring the continuous uptime of the critical mission. Real-world examples include generator loss of fuel delivery requiring re-priming of the fuel system, emergency redistribution of proprietary electrical feeds at the rack level, unusual roof leaks, flooding, staffing relief plans, and communications challenges. When creating emergency operating procedures or casualty control procedures, emphasis must be placed on scenarios in which the staff must react on their own, with the understanding that external help is not available.

One very real scenario that we should proactively run drills on is generator loss of prime, and how to re-prime engines when fuel flow issues arise or the fuel pre-filters need to be changed. Many of us do not have paralleled pre-filter assemblies, which means that once fuel pressure starts dropping during an extended run, the fuel pre-filters will have to be changed, which increases the chances of a loss of fuel prime. Onsite staff must have the ability to change the fuel filters and must be trained on re-priming the engines, particularly when help is not on the way. The only way to do this is to proactively train each member of the operations team through a hands-on approach. Along the same lines, true drills need to be run within the facility, and critiques of each drill should be held in order to analyze how the staff performed and to continually improve existing processes and procedures.

Remember, data center operation and uptime is just as demanding as continuously operating a nuclear power plant. Here at RagingWire, we employ many former nuclear power operators from the Navy and the civilian sector, who couldn’t be better examples of critical facilities operators. We each bring to the table a dedicated, critical mentality, which is seen not only in our data center infrastructure design and operation but also in the way we work together within the data center community. As we all continue to recover from the devastating conditions experienced over the last several days, RagingWire is providing unwavering support to our colleagues who continue to operate in these adverse conditions.

Impact of Hurricane Sandy on the US East Coast

RagingWire’s Northern Virginia data center campus, located in Ashburn, sustained no damage from the hurricane and remained on utility power for the duration of Hurricane Sandy’s assault on the East Coast. Our thoughts are with those who continue to recover from the storm and subsequent damage.

Data Center Power Availability is More Than a Number

It can be difficult to appreciate the differences between data center power delivery architectures. Every data center provider you talk with has a power story to tell and most of them sound pretty good. The challenge is selecting the power architecture that is right for you.

One way to compare power delivery systems is to look at overall availability. Availability is usually expressed as a percentage of the total time the system is expected to be running. This is where the number of 9’s comes in. You might find availability percentages of three 9’s (99.9%), four 9’s (99.99%), etc.
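To put those percentages in concrete terms, here is a quick back-of-the-envelope calculation, a generic illustration rather than any provider’s SLA math, of how much downtime each availability figure allows per year:

```python
HOURS_PER_YEAR = 24 * 365

def downtime_per_year(availability_pct: float) -> float:
    """Hours of allowable downtime per year at a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.9, 99.99, 99.999, 100.0):
    minutes = downtime_per_year(pct) * 60
    print(f"{pct}% availability -> {minutes:7.1f} minutes of downtime per year")
# 99.9%   ->  ~525.6 minutes (about 8.8 hours)
# 99.99%  ->   ~52.6 minutes
# 99.999% ->    ~5.3 minutes
# 100.0%  ->     0.0 minutes
```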

While a number might make you feel better at night, it won’t necessarily keep the phone from ringing. I suggest you look for three words when evaluating data center power delivery systems: Redundant, Distributed, and Scalable.

Redundant – multiple independent systems

This is the “N” you hear so much about in data centers. Basically, N is the amount of a component that you need to deliver a service. For example, you need one power path from the utility to your server rack. This path could include multiple pieces of equipment: a main switchboard, backup generator, UPS (uninterruptible power supply), and PDU (power distribution unit). A second independent power path with all of those elements to the same server rack would be 2N. If there is a break in the first path, the second path takes over. Ask your data center provider how they design for redundancy in all critical systems. The most fault-tolerant way to keep your system running is to have multiple N’s. The challenge is that too many N’s can be expensive to acquire and complicated to manage.

Distributed – a resource pool + backup(s)

This is the “+1” in an N+1 design or the “+2” in an N+2 design. When the costs of a fully redundant architecture become prohibitive, a distributed approach for critical elements of the infrastructure is a great way to improve overall system reliability. You take a device and set it up as a spare for the required pool of devices. Say you need five UPSs to run the pod and you have one additional UPS that can back up any of the five in the pool – that’s N+1. Two spares for the pool means N+2. The critical element of this configuration is the monitoring and management system, which must recognize a device failure and automatically switch to the backup device.
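As a toy illustration of that monitoring-and-switchover idea, here is a simplified Python sketch; real DCIM and PLC systems are, of course, far more involved, and the device names are hypothetical:

```python
def cover_failures(active: list[str], spares: list[str], failed: set[str]) -> list[str]:
    """Replace any failed active device with an available spare.

    Returns the new active roster, or raises if the pool runs out of
    spares -- the point at which an N+1 design no longer protects you.
    """
    roster = list(active)
    available = [s for s in spares if s not in failed]
    for i, device in enumerate(roster):
        if device in failed:
            if not available:
                raise RuntimeError("no spare left to cover the failure")
            roster[i] = available.pop(0)
    return roster

# Five production UPSs plus one spare (N+1), as in the example above.
active = ["ups1", "ups2", "ups3", "ups4", "ups5"]
spares = ["ups6"]
print(cover_failures(active, spares, failed={"ups3"}))
# -> ['ups1', 'ups2', 'ups6', 'ups4', 'ups5']
```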

Scalable – engineered for growth and change

Data centers are continually growing and changing at the same time. To deliver superior service at an affordable price, the data center should be built out based on usage. The shell may be in place on day one, but the power and cooling should be purchased and installed as customers move in. Otherwise you are paying too much, too soon. Also, within the IT cages, servers, storage devices, and network gear are constantly being added, removed, upgraded, and relocated. Your data center needs to be engineered for both growth and change. Power and cooling systems should accept additional devices as capacity requirements grow. Live IT power load should be dynamically shared or moved across the entire facility. All of this must occur without an outage.

At RagingWire, we’ve coined a name for our redundant, distributed, and scalable power delivery architecture – we call it 2N+2. We have two patents on the technology and offer a 100% Availability SLA (service level agreement) with these configurations.

How can we be so confident that with our 2N+2 architecture your power will not go down? We have 2N redundancy on the power paths to your cage or rack and an N+2 distributed design for the critical elements in the power delivery system. Lastly, one of our patented inventions is a unique cross-facility power switching fabric and a massively scalable topology that allows us to move, share, and scale live IT power load throughout the data center without requiring a maintenance outage.

For data centers, the old adage definitely still applies: “You can’t manage what you don’t measure.” Availability numbers are a great metric to manage your data center power delivery system. However, when choosing the right data center colocation solution, be sure to look for power delivery systems that are redundant, distributed, and scalable.

Why You Should Know About NANOG

I had the pleasure to attend last week’s 57th meeting of the North American Network Operators’ Group (NANOG) in Orlando, FL as part of the RagingWire team, which included folks from our data center and network engineering groups. Not only was the weather a balmy 70 degrees, but it was one of the most informative events I have been to in my career thus far. The NANOG meeting covered some of the most important and controversial topics on the cutting edge of internet security, connectivity, and governance going into 2013. Some highlights:

World Conference on International Telecommunications (WCIT) Update.

Put simply, NANOG is a community of network operators who exchange technical and operational information in support of a single goal: to make the internet as connected and resilient as possible, ensuring free flow of information around the world. With this end-state in mind, internet governance is a topic of much conversation and consternation among the NANOG members who attend this session of the meeting.

The main thrust of the December 2012 WCIT was to update and ratify a new, 21st-century iteration of the International Telecommunication Regulations (ITRs), which were last ratified in 1988. Sally Wentworth, a public policy manager at the Internet Society, presented a “postmortem” on the effects of, and the way forward from, the ITR treaty that was voted on at the conference. Much of the presentation focused on the dangers posed by nations that wish to regulate the internet, or that have the capability to censor it on a nationwide scale, limiting its availability or usefulness in order to quell popular sentiment or anti-government organizing. The presentation was a timely reminder that unfettered, low-cost access to the internet is an ideal that must be protected. Ms. Wentworth also called on the NANOG membership, as an expert knowledge base, to be a contributor and party to making that ideal a reality both now and in the future.

The Infrastructure and Internet Impacts of Hurricane Sandy.

Two sessions during the NANOG meeting were dedicated to the effects of Hurricane Sandy on the internet and the infrastructure that supports it. On the first day of the meeting, several data center providers discussed their responses and lessons learned from their facilities in the NJ and NY areas. This presentation really highlighted the importance of data center location from a risk management point of view, but it isn’t just about location. Protecting data center infrastructure is also about the pre-planning that must be in place before a natural disaster occurs: diesel fuel refueling contracts, reliable hotel arrangements (with reliable backup power), work-from-home arrangements, food storage at the facility, and staffing arrangements, to name a few.

The second day featured a session on the impacts of Hurricane Sandy on the internet and posed the question: What happens if we turn off power to one of the key traffic exchange cities? One of the most interesting presentations ensued, demonstrating the interconnectedness and flexibility at the core of the internet as trace routes changed in real time to pass through Ashburn, VA instead of NYC as Sandy made landfall.

Arbor Networks Infrastructure Security 2012 Report

This meeting session focused on a survey by Arbor Networks that explored the landscape of network threats and attacks (multi-vector, DDoS) over the past year. Top issues for network operators included DDoS attacks (trending towards sustained multi-vector attacks), enterprise data centers being vulnerable even with firewalls, increased concern over “shared risk” when migrating applications to the cloud, and the inability of mobile service providers to gain visibility into their networks in order to detect or combat any kind of attack. Most attack incidents appear to be motivated by ideology, be it politics or revenge. The presentation took a deep dive into the recent, persistent, multi-vector DDoS attacks, nicknamed “Operation Ababil,” that are ongoing against many top Wall Street financial institutions. Overall, it was an eye-opening account of the ever-growing threat to network security, not just at the governmental level, but at the industrial/enterprise level as well.

If you work or play on the internet, you should know about NANOG, the Internet Society, and other groups who discuss topics like the ones presented last week – and who desire to keep the internet accessible and reliable for all of us. Congrats to NANOG members for the community they have built and the collective expertise they have to influence the international internet community.

Where the cloud lives – A look at the evolving distributed computing environment

The idea behind cloud computing has been around for some time. In fact, one of the very first scholarly uses of the term “cloud computing” was in 1997, at the annual INFORMS meeting in Dallas. However, the way the cloud has evolved into what we are capable of using today shows the amount of creativity and technological innovation that is possible. Distributed computing platforms are helping IT professionals conquer distance and deliver massive amounts of information to numerous points all over the world.

This is all made possible by a number of technologies working together to bring cloud computing to life. Oftentimes, however, there is still some confusion around cloud computing – not so much how it works, but where it lives. Many will argue that virtualization gave birth to the cloud. Although server, application, network and other types of virtualization platforms certainly helped shape and mold cloud computing, there are other – very important – pieces to the cloud puzzle.

One IT landscape with many clouds in the sky

The very first concept that needs to be understood is that there isn’t one massive cloud out there controlling or helping to facilitate the delivery of your data. Rather, there are numerous interconnected cloud networks out there which may share infrastructure as they pass each other in cyberspace. Still, at the core of any cloud, there are key components which make the technology function well.

  • The data center. If the cloud has a home, it would have to be a data center – or, more specifically, a neighborhood of data centers. Data centers house the integral parts that make the cloud work. Without servers, storage, networking, and a solid underlying infrastructure, the cloud would not exist today. Furthermore, new advancements in high-density computing are helping to advance the power of the cloud. For example, Tilera recently released its 72-core GX-72 processor, a 64-bit system-on-chip (SoC) equipped with 72 processing cores, 4 DDR memory controllers, and a strong emphasis on I/O. Now cloud architects are able to design and build a truly “hyper-connected” environment with an underlying focus on performance.

    Beyond the computing power, the data center itself acts as a beacon for the cloud. It provides the resources for the massive numbers of concurrent connections that the cloud requires, and it will do so ever more efficiently over the years. Power Usage Effectiveness (PUE) has become a key metric for cloud-ready data centers to manage the energy overhead of running a facility: PUE is total facility power divided by the power delivered to the IT equipment, so a rating approaching 1.0 means nearly all of the energy goes to the computing gear itself. More and more data centers are working to approach that 1.0 rating as they deliver more data, more efficiently. With the increase in data utilization and cloud services, there is no doubt that the data center will continue to play an integral part in the evolution of cloud computing.

  • Globalization. The cloud is spreading – and it’s spreading very fast. Even now, data centers all over the world are creating services and options for cloud hosting and development. No longer confined to a few regions, cloud computing is truly being leveraged on a global scale. Of course, technologies like file sharing were already global solutions. However, more organizations are now capable of deploying their own cloud environments. So, when we say that the cloud lives in a location, we mean exactly that.

    Historically, some parts of the world simply could not host or create a robust cloud environment. Why? Their geographic region could not support the amount of resources that a given cloud deployment may require. Fortunately, this is all changing. At the 2012 Uptime Symposium in Santa Clara, CA, we saw an influx of international data center providers all in one room competing for large amounts of new business. The best part was that these new (or newly renovated) data centers all had truly advanced technologies capable of allowing massive cloud instances to traverse the World Wide Web. This is a clear indication that, geographically, the cloud is expanding and that there is a business need for it to continue to do so.

  • Consumerization. One of the key reasons the cloud is where it is today is the cloud consumer. IT consumerization, BYOD, and the influx of user-based connected devices have generated an enormous amount of data. All of this information now needs to span the Internet and utilize various cloud services. Whether it’s a file-sharing application or a refrigerator that can alert the owner that it’s low on milk, all of these solutions require the cloud. Every day, we see more devices connect to the cloud. To clarify, by devices we’re not just referring to phones, laptops, or tablets. Now we have cars, appliances, and even entire houses connecting to cloud services. The evolution of the cloud revolves around the demands created by the end-user. This, in turn, forces the technology community to become even more innovative and progressive when it comes to cloud computing.

    As more global users connect to the Internet with their devices – the drive to grow the cloud will continue. This is why, in a sense, the cloud will eventually live with the end-user. Even now new technologies are being created to allow the end-user to utilize their own “personal cloud.” This means every user will have their own cloud personality with files, data, and resources all completely unique to them.

  • Cloud Connectors. As mentioned earlier, there isn’t really just one large cloud out there for all of the users trying to access it. The many private, public, hybrid, and community clouds out there comprise one massive interconnected cloud grid. In that sense, the evolution of the cloud created an interesting, and very familiar, challenge: a language barrier. For example, as one part of an enterprise grows its cloud presence, another department might begin a parallel cloud project on a different platform. Now there is a need to connect the two clouds together. But what if these two environments are built on completely different cloud frameworks? It’s in this sense that we deploy the cloud “Babel Fish” in the form of APIs. These APIs literally act as cloud connectors, helping organizations extend, merge, or diversify their cloud solutions.

    It’s not a perfect technology, and even now not all cloud platforms can fully integrate with others. However, the APIs are getting better and more capable of supporting large cloud platforms. New technologies like CloudStack and OpenStack help pave the way for the future of cloud connectivity and APIs. Platforms like these work as open-source cloud software solutions that help organizations create, manage, and deploy infrastructure cloud services. A minimal sketch of what such a connector looks like in code follows this list.
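To make the “Babel Fish” idea concrete, here is a minimal adapter-style sketch in Python. The classes, field names, and response shapes are hypothetical, invented purely to show how a thin translation layer can make two differently shaped cloud APIs look the same to the rest of your tooling; it is not the CloudStack or OpenStack API:

```python
from dataclasses import dataclass

@dataclass
class Instance:
    name: str
    state: str          # normalized: "running" or "stopped"

class CloudAdapter:
    """Common interface the rest of the tooling codes against."""
    def list_instances(self) -> list[Instance]:
        raise NotImplementedError

class CloudA(CloudAdapter):
    # Pretend CloudA's API returns {"vmName": ..., "powerState": "ON"/"OFF"}
    def __init__(self, raw_api_response: list[dict]):
        self.raw = raw_api_response
    def list_instances(self) -> list[Instance]:
        return [Instance(vm["vmName"], "running" if vm["powerState"] == "ON" else "stopped")
                for vm in self.raw]

class CloudB(CloudAdapter):
    # Pretend CloudB's API returns {"id": ..., "status": "ACTIVE"/"SHUTOFF"}
    def __init__(self, raw_api_response: list[dict]):
        self.raw = raw_api_response
    def list_instances(self) -> list[Instance]:
        return [Instance(vm["id"], "running" if vm["status"] == "ACTIVE" else "stopped")
                for vm in self.raw]

# One inventory across both clouds, despite the different dialects.
clouds = [
    CloudA([{"vmName": "web-01", "powerState": "ON"}]),
    CloudB([{"id": "db-01", "status": "SHUTOFF"}]),
]
for cloud in clouds:
    for inst in cloud.list_instances():
        print(inst)
```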

In a cloudy world – bring an umbrella!

Let’s face facts – it’s not always sunny in the cloud. As the technology continues to mature, IT professionals are still learning the best practices and optimal ways to keep the cloud operational. This isn’t proving altogether easy. There are more users, more infrastructure, and more bad guys taking aim at various cloud infrastructures. Just as important as understanding how the cloud functions and where it resides is knowing the “weather forecast” within the cloud computing environment.

  • Attacks. Although this is the darker side of the cloud, it still needs to be analyzed. As more organizations move toward a cloud platform, it’s only logical to assume that these cloud environments will become targets. Even now, attacks against cloud providers are growing. This can be a direct intrusion or a general infrastructure attack; regardless of the type, any malicious intrusion can have devastating results on a cloud environment. Over the past few months, one of the biggest threats against cloud providers has been the influx of DDoS attacks. A recent annual Arbor Networks survey showed that 77% of data center administrator respondents experienced more advanced, application-layer attacks, and that such attacks represented 27% of all attack vectors. The unnerving part is that the ferocity of these attacks continues to grow, with attacks spiking at 100 Gbps as far back as 2010.

    Cloud services aren’t always safe either. On February 28, 2013, Evernote discovered signs of a breach. Passwords, email addresses, and usernames were all accessed, and the provider required its nearly 50 million users to reset their passwords. The damage from these types of attacks isn’t always just the data; Evernote also had to release a public response, and those are always difficult to do. In the responding blog post, Phil Libin, Evernote’s CEO and founder, said the following: “Individual(s) responsible were able to gain access to Evernote user information, which includes usernames, email addresses associated with Evernote accounts and encrypted passwords.” These types of intrusions can only serve as learning points for creating better and more secure cloud environments.

  • Outages. If you place your cloud infrastructure into a single data center – you should know that your cloud environment can and will go down. No one major cloud provider is safe from some type of disaster or outage. Furthermore, cloud computing services are still an emerging field and many data center and cloud providers are still trying to figure out a way to create a truly resilient environment.

    The most important point to take away here is that a cloud outage can happen for almost any reason. For example, a few administrators for a major cloud provider forgot to renew a simple SSL certificate. What happened next is going to be built into future cloud case studies. Not only did this cause an initial failure of the cloud platform, it created a global, cascading event that took down numerous other cloud-reliant systems. Who was the provider? Microsoft Azure. The very same Azure platform that had $15 billion pumped into its design and build-out. Full availability wasn’t restored for 12 hours, and up to 24 hours for many customers. About 52 other Microsoft services relying on the Azure platform experienced issues, including the Xbox Live network. This type of outage will create (and hopefully answer) many new questions about cloud continuity and infrastructure resiliency. It also shows how much pain a simple certificate expiry check, like the sketch after this list, could help avoid.
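That certificate-renewal failure is a reminder that some outages are preventable with very simple monitoring. As a rough illustration, our own sketch rather than any provider’s tooling, a few lines of Python can report how long a public TLS certificate has left before it expires:

```python
import socket
import ssl
from datetime import datetime, timezone

def days_until_cert_expiry(host: str, port: int = 443) -> float:
    """Return the number of days before the host's TLS certificate expires."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(cert["notAfter"]),
                                     tz=timezone.utc)
    return (expires - datetime.now(timezone.utc)).total_seconds() / 86400

# Example: alert well before the deadline instead of finding out the hard way.
remaining = days_until_cert_expiry("www.example.com")
if remaining < 30:
    print(f"WARNING: certificate expires in {remaining:.0f} days")
else:
    print(f"Certificate OK, {remaining:.0f} days remaining")
```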

As with any technological platform, careful planning and due diligence have to go into all designs and deployments. It’s evident from the speed of today’s technology adoption that the world of cloud computing is only going to continue to expand. New conversations around big data further demonstrate the need for the cloud. The ability to transmit, analyze, and quantify massive amounts of data is going to fall on the cloud’s shoulders. Cloud services will be created to churn through massive amounts of information on a distributed plane. Even now, platforms built around open-source technologies are being used to help control and distribute the data. Projects like Hadoop and the Hadoop Distributed File System (HDFS) are already being deployed by large companies to control their data and make it agile.

With more users, more connection points and much more data – cloud computing lives in a growing and global collection of distributed data centers.  It is critical that cloud developers and participants select their data center platforms carefully with an emphasis on 100% reliability, high-density power, energy-efficient cooling, high-performance networking, and continual scalability. Moving forward, the data center will truly be the main component as to where the cloud resides.

Introducing RagingWire Backup Services

I used to be a tape-based data backup manager – a tape labeler – a tape rotator – a tape scheduler – a tape library/spreadsheet manager – a restore-from-tape nightmare wake-upper. My tape-based backup programs generally worked as they should, but they involved a lot of manual labor that could have been devoted to other tasks. Over two years as an IT manager, it became clear to me that there had to be a simpler, more reliable way to back up data.

That’s why I’m excited to introduce RagingWire’s new Backup Services, a disk-based, enterprise-grade data backup platform that automates onsite storage and offsite replication for customers inside RagingWire’s world-class data centers. Best of all – no more tapes!

RagingWire’s New Backup Services – Engineered for the Enterprise

We utilize the same best-of-breed enterprise data protection technology for our Backup Services that many large enterprises use for their own data backups: Symantec NetBackup™. The product supports multiple databases and applications as well as both physical and virtual servers, so you can automate and standardize your backups on one unified platform. Secure encryption at the customer device keeps your data safe, and sophisticated deduplication technology shortens your server backup windows. RagingWire’s 24x7 Network Operations Center is just a phone call or email away when you need to restore.
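Deduplication is a big part of why disk-based backups move far less data than you might expect. As a simplified illustration of the general idea, not of how NetBackup implements it, a backup engine can hash fixed-size chunks and store each unique chunk only once:

```python
import hashlib

def dedup_store(data: bytes, store: dict[str, bytes], chunk_size: int = 4096) -> list[str]:
    """Split data into chunks, store each unique chunk once, return the recipe.

    `store` maps a chunk's SHA-256 digest to its bytes; the returned list of
    digests is enough to reassemble the original data later.
    """
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # already-seen chunks cost nothing extra
        recipe.append(digest)
    return recipe

store: dict[str, bytes] = {}
monday = b"A" * 8192 + b"report v1"
tuesday = b"A" * 8192 + b"report v2"       # mostly the same file as Monday's
dedup_store(monday, store)
dedup_store(tuesday, store)
print(f"unique chunks stored: {len(store)}")  # far fewer than two full copies
```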

Onsite Storage. Offsite Replication.

As a company, RagingWire focuses on providing the nation’s best data center colocation services. We back up our 100% Service Level Agreement (SLA) with the most advanced data center infrastructure, and we have the happiest customers in the industry. So why is RagingWire introducing Backup Services? Simply speaking, our customers asked for disk-based backup services located inside both of our data center campuses in Ashburn and Sacramento. Options include onsite storage in a customer’s current RagingWire data center and offsite replication to a customer’s remote RagingWire data center. We want to give our customers the assurance that, unlike a cloud backup, they’ll always know where their data is: it’s located inside a RagingWire data center, and our customers have 24x7 access to restores via our expert Network Operations Center staff.

Are you a current RagingWire customer? Take a minute to check out the Backup Services data sheet or contact your account manager to see how we can help you get rid of your tapes and simplify your data backup strategy.

Curious as to how you can further optimize your data backup strategy? Head over to Jerry Gilreath’s excellent blog post: 5 Ways to Fool-Proof Your Data Backup Strategy.
