Infrastructure

From Silicon Valley to Data Center Alley

Since the birth of the integrated circuit in the 1950s, Silicon Valley has become the destination for high-tech entrepreneurship. The term "Silicon Valley," which refers to a region of Northern California, was coined in the 1970s and gained popularity in the 1980s with the emergence of the personal computer. In Silicon Valley, capacity and capability came together to create some of the greatest technical innovations in history.

This same dynamic of capacity meeting capability that happened decades ago in Silicon Valley is underway in Loudoun County, Virginia. We’re calling it Data Center Alley, the largest concentration of the best data centers in the world.

Capacity refers to the raw materials needed to create a thriving data center community: ample telecommunications; reliable, cost-effective utility power; and available land.

Hundreds of telecommunications providers include Data Center Alley as a link in their national and global networks. These networks interconnect using vast amounts of fiber installed in redundant loops throughout the area. The result is that 70% of the world’s internet traffic passes through Data Center Alley.

For utility power, we are fortunate to work with Dominion Virginia Power. Dominion recognized early on the potential for data centers in Northern Virginia and implemented a capacity model that ensured that sufficient power would be available to meet the needs of Data Center Alley at affordable prices. They worked closely with data center companies to configure their power delivery system so that power was highly reliable. Finally, Dominion has been a good steward of our energy infrastructure by maintaining an intelligent mix of available and environmentally sound energy sources.

The last raw material is land. Data centers need space in order to realize economies of scale. For example, our data center in Ashburn, Virginia is 150,000 square feet, and we purchased 78 acres of land in Ashburn to build a 1.5 million square foot data center campus. The land also needs to be easily accessible and located near telecommunications and utility supplies.

Capability refers to the people, government policies, and culture that promote building great data centers and growing the data center industry.

The data center industry is a highly specialized field that requires deep expertise in engineering, design, construction, and operations. Much of this expertise comes from on-the-job experience. Data Center Alley has more than 40 data centers which support an outstanding talent pool of data center experts.

Government policies have been instrumental in the development of Data Center Alley. Virginia is one of the most pro-business states in the U.S., and the Loudoun County Board of Supervisors and Department of Economic Development are personally involved in helping data center companies be successful. For example, RagingWire customers can qualify for a Virginia sales tax exemption which could save them millions of dollars on purchases of computer equipment and other related data center infrastructure components.

Lastly, the culture in Data Center Alley is all about building. We put theory into practice and scale it. The result is that there are currently eight million sq. ft. of data center space built or in development in Data Center Alley.

Data Center Alley is starting to get some recognition. If you want to learn more, watch a segment on Data Center Alley from the Sunday morning news program “Government Matters.”

Welcome to Next-Generation DCIM (Data Center Infrastructure Management)

We are in the era of on-demand data delivery. The proliferation of cloud computing and information sharing has created a sort of data center boom. There are more users, more devices and a lot more services being delivered from the modern data center environment.

In fact, entire organizations and applications are being born directly within the cloud. To really put this trend in perspective, consider this: a 2013 Cisco Cloud Index report indicated that cloud traffic crossed the zettabyte (1000^7 bytes) threshold in 2012. Furthermore, by 2016, nearly two-thirds of all data center traffic will be based in the cloud.

Cisco Cloud Index Report - 2013

Efficient computing, converged infrastructures, multi-tenant platforms, and next-generation management are the key design points for the modern data center environment. Because we are placing more services into our data centers, there is a greater need for visibility into multiple aspects of daily operations.

The picture of the next-generation management platform is that of a truly unified management plane

So what does that really mean? A unified data center infrastructure management platform removes “physical” barriers of a typical control system. The reason for this necessary shift can be seen in the job that the modern data center is now tasked with performing and the levels of monitoring that must occur. Next-generation DCIM (Data Center Infrastructure Management) will remove logical and physical walls and unify the entire data center control process.

"Everything-as-a-Service" Controls. The modern data center is now considered to be at the heart of "The Internet of Everything." This means that more services, information, and resources are being delivered through the data center than ever before. As the data center continues to add on new functions and delivery models – administrators will need to have visibility into everything that is being delivered. Whether it’s monitoring SLAs or a specific as-a-service offering – next-generation data center management must have integration into the whole infrastructure.

Cluster-Ready Real-Time Monitoring. Data centers are becoming truly distributed. Their nodes are interconnected, sharing resources, and delivering content to millions of end-users. With advancements in modern data center infrastructure, bandwidth and optimized computing, data center architecture has gone from singular units to truly clustered entities. This level of connectivity also requires new levels of distributed management and control. Not only is this vital for proper resource utilization and load-balancing; cluster-ready monitoring also creates greater data center resiliency. By having complete visibility into all data center operations across the board, administrators are able to make better, proactive decisions.

Big Data Management Engines. The increase of cloud services has created an influx of new types of offerings to the end-users. IT consumerization, BYOD and mobility have all become very hot topics around numerous large organizations. As cloud continues to grow – more users will be utilizing these resources. With an increase of users comes the direct increase of data. This is where next-generation data center management comes in. Big data engines will sit both within the cloud and on the edge of the cloud network – also known as the Fog. There, these data centers will have direct tie-ins into big data analytics engines running on virtual and physical platforms. Because this data and the information gathering process is so crucial – complete visibility into the entire process is vital. This means that large data centers acting as hubs for big data analytics will have control visibility into storage, networking, compute, infrastructure and more.

Logical and Physical Management. The days of bare metal accumulation are over. Modern data centers are highly virtualized and highly efficient. New data centers are being built around optimal power, cooling and resource control mechanisms. On top of that sit highly efficient, high-density servers capable of great levels of multi-tenancy. Although some siloed monitoring operations may still occur, the overall infrastructure must be unified in terms of management and control. This means having granular visibility into core data center operations, which include:

  • Power
  • Cooling
  • Aisle/Control
  • Alerts/Monitoring
  • Advanced Environmental Monitoring
  • Proactive Alerting and Remediation
  • Security (physical, structural, rack, etc.)

On top of that, data center administrators must also be able to see the workloads which are running on top of this modern infrastructure. This means understanding how virtual resources are being utilized and how this is impacting the underlying data center environment.
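
As a rough illustration of what unified visibility could look like in practice, the sketch below runs one check over readings from different subsystems (power and cooling here) and reports anything above its limit. The `Reading` type, the rack names, and the threshold values are all hypothetical and chosen only for this example; a real DCIM platform would pull these from its own sensors and policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    source: str   # hypothetical identifier for a rack, PDU, or cooling unit
    metric: str   # e.g. "inlet_temp_c" or "power_kw"
    value: float


# Hypothetical limits; real limits depend on the facility and equipment.
THRESHOLDS = {
    "inlet_temp_c": 27.0,
    "power_kw": 8.0,
}


def find_alerts(readings: list[Reading]) -> list[str]:
    """Return one alert message for each reading above its threshold."""
    alerts = []
    for r in readings:
        limit = THRESHOLDS.get(r.metric)
        if limit is not None and r.value > limit:
            alerts.append(f"{r.source}: {r.metric}={r.value} exceeds {limit}")
    return alerts


readings = [
    Reading("rack-a1", "inlet_temp_c", 24.5),
    Reading("rack-a2", "inlet_temp_c", 29.1),
    Reading("rack-a2", "power_kw", 8.4),
]
for alert in find_alerts(readings):
    print(alert)
```

Keeping every metric in one stream, rather than one tool per subsystem, is what lets a single rule set cover both the facility layer and the workloads running on it.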

Cloud Orchestration and Automation. The cloud computing boom also created a big need around better workload control and automation. This actually spans both the hardware and software layer. From a next-generation management perspective, there needs to be the ability to create hardware and software profiles which can then be applied to physical and virtual resources for deployment. Finally, when this approach is tied into intelligent load-balancing solutions, you have a truly end-to-end cloud orchestration and automation solution. Now, although we’re not quite at those levels yet, next-generation data center management solutions are directly integrating workload automation options. It is becoming easier to deploy hardware and simpler to provision workloads. All of this means faster content and data delivery to the corporation and the end-user. Next-generation data center management will be able to have plug-ins into the physical and logical layer and facilitate new levels of cloud automation and orchestration. 
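
To make the idea of hardware and software profiles a little more concrete, here is a minimal sketch in which the same pair of profiles is applied to one physical and one virtual resource. The `Profile` and `Resource` types, the profile names, and the settings are assumptions made up for this illustration and do not correspond to any particular orchestration product.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    settings: dict


@dataclass
class Resource:
    name: str
    kind: str                                   # "physical" or "virtual"
    config: dict = field(default_factory=dict)


def apply_profiles(resource: Resource, *profiles: Profile) -> Resource:
    """Merge profile settings into the resource; later profiles override earlier ones."""
    for profile in profiles:
        resource.config.update(profile.settings)
    return resource


# Hypothetical profiles, for illustration only.
hardware = Profile("dense-compute", {"cpu_cores": 16, "memory_gb": 128})
software = Profile("web-tier", {"os": "linux", "packages": ["nginx"]})

targets = [Resource("blade-07", "physical"), Resource("vm-web-01", "virtual")]
for target in targets:
    apply_profiles(target, hardware, software)
    print(target.name, target.kind, target.config)
```

Because the profile is defined once and applied many times, adding capacity becomes a matter of pointing the same definition at a new resource.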

Mobile Data Center Visibility. It’s not like your data center will just get up and move around – at least not in most cases. However, mobile visibility into your data center is a real need. This means controlling some data center functions from mobile devices and delivering direct web-based controls as well. Furthermore, because the data center is becoming more interconnected, there will be more functions and roles to control. There will be various types of administrators and managers requiring specific controls within a single and clustered data center model. Role-based administration and management will evolve from the standard PC to true mobility. All of this will translate to a more efficient engineering, administration and management layer of your entire data center infrastructure.

Single-Pane of Glass – Complete Control. At the end of the day it all comes down to how well you can manage your data center. There’s no arguing that there will be more requirements around the future data center platform. As the number of services that the data center delivers continues to increase, complete visibility will become even more important. There will be more plug-ins, monitoring tools, and clustered nodes to look at all while trying to control resiliency. The next-generation monitoring UI and control platforms must be intuitive, easy to understand, simple to navigate and allow the administrator to truly optimize the entire data center infrastructure.

Tomorrow’s data centers must also have tomorrow’s visibility and control. The nature of the data center is changing. It’s now the hub that delivers core services, user applications, and entire platforms. IT consumerization and the increase in Internet utilization are partially the reason for the data center boom. However, the natural progression of technology has taken our entire infrastructure into a highly resilient and agile cloud platform. The ability to stay connected and have massive content delivered directly to your end-point is truly impressive.

As the business evolves, the data center infrastructure will have to follow suit. Already business initiatives are directly aligned with the capabilities of the respective IT department. This correlation will only continue to increase as the data center becomes the home of everything. And with the next-generation data center, there must be next-generation management and control.

Making Every (Inter)Connection Count

It’s been an interesting week or two of data center news! “London Internet Exchange takes space in EvoSwitch.” “Digital Realty announces Open Internet Exchange.” “Open-IX movement goes public.”

So what is happening here? What is the problem that is solved with “open” internet exchanges?

As a frequent North American Network Operators Group (NANOG) meeting participant, I’ve heard growing angst in the internet peering ranks about perceived points of failure presented by having single buildings in major internet hubs (e.g. New York, Ashburn, London, Amsterdam) house commercial internet exchanges. Remember Hurricane Sandy? Beyond geography, questions were raised over the treatment of telecommunications carriers and the manner in which interconnections are made as opposed to the European interconnection model (member-driven, multi-site, public).

The biggest problem Open-IX is trying to solve, however, has nothing to do with geographic diversity or carrier treatment. It’s simple economics. In the United States, the major Internet exchanges are concentrated in the hands of a few data center companies and those companies charge carriers a premium for the right to participate in the exchange. Open-IX lays this case out in their framework document as “The Interconnect Problem.”

RagingWire operates, from an interconnection point of view, in line with open internet exchange principles. All of the company’s data center facilities are carrier neutral.

RagingWire Data Centers - Carrier Meet Me Room

Carriers built into our data center aren't our customers, they're our partners in bringing highly available connectivity to our customers. Our network engineers are dedicated to building trusted, close relationships with all our carrier partners to make the ordering and provisioning process as easy and seamless as possible.

Open-IX is still in its infancy, but we look forward to continuing our long relationship with the participants. We share the desire to continually improve service and reduce costs for our customers. RagingWire is the nation’s leading data center colocation provider, focused on delivering 100% availability of power and cooling with easy access to internet connectivity and the industry’s best customer service. It’s all part of our commitment to making every connection count.

Would Your Employees Recommend You?

I have had the pleasure of leading customer-facing support organizations for the past 15 years, and during that time I have managed hundreds of employees. Over time I noticed a common thread: the happier the employees were, the happier the customers were. This is because the same set of values drives both employee experience and customer experience (job seekers: if a company has a reputation for bad service, run the other way!). 

RagingWire has the highest customer loyalty Net Promoter Score (NPS) in the data center industry. And in achieving this we have found that you simply cannot delight your customers without first delighting your employees. Here are ten ways to improve employee experience: 

  1. Focus – put your customer at the core of your business. Use customer experience to drive your culture. Use it as the driving force behind everything you do. Why? Because without customers, your business will fail.
  2. Purpose – you must have a strategic plan. Communicate what the big picture is and how each employee fits into that picture. Better yet, develop the strategic plan with employee input. We all want to be part of a company that is growing, and having a well-defined strategic plan is the foundation for that growth.
  3. Communication – communicate with employees, both the good and the bad. Share the big picture with them, tell them where the organization is doing well and where there are challenges. Be transparent, approachable, sincere and humble. And don’t forget to have fun!
  4. Recognition – lack of recognition ranks as the number one reason behind job dissatisfaction, even ahead of money. We spend a third of our lives at work, and we want to be appreciated when we do a good job. By our peers, by our manager, by other managers, by the mailman, everyone!
  5. Growth – just as we want our business to grow, our employees want to grow too. And development doesn’t have to happen in an expensive classroom. It can happen by reading books, attending webinars, writing blogs, being assigned a challenging project, facilitating a lunch and learn, mentoring, job shadowing, the list goes on.
  6. Opportunity – I have yet to meet an employee not interested in career advancement. Makes #5 on the list all the more important, doesn’t it? A key part of being an effective manager is developing your employees. This will allow more promoting from within, which makes for happier employees but also more expertise – those who move to different positions and departments gain greater company perspective. And they are more helpful too. It’s the “I’ve been in her shoes” effect, and it’s powerful.
  7. Flexibility – focus on results, not hours. People have lives outside of work – families, charities, spiritual needs, healthcare, etc. – and if you allow them flexibility in managing those things, you will earn their loyalty.
  8. Listen – trust me, at this exact moment your employees have ideas that could improve your business. Put measures in place to solicit those ideas! And don’t be defensive if you hear something you don’t like; focusing on continuous improvement isn’t being negative, it’s being strategic.
  9. Trust – hire a talented team, set the direction for that team, give the team tools to do their job, and then get out of their way. Valuable employees don’t want to be micromanaged. Focus on results, and don’t sweat the small stuff.
  10. Give Back – if you want employees to take ownership, give them ownership. Employees want the ability to share financially in the company’s success, so incentives like performance-based bonus plans can be great motivators.

There. Ten things you can start doing today to improve employee experience. And customer experience. And profitability!

(Attention employees: you’re not off the hook here. Have a service attitude at all times. Be open to change, and be positive. Share knowledge. Be a team player. Be patient. Show initiative. Execute.)

Apply at RagingWire Data Centers today at https://www.ragingwire.com/company-profile/career

Introducing RagingWire Backup Services

I used to be a tape-based data backup manager – a tape labeler – a tape rotator – a tape scheduler – a tape library/spreadsheet manager – a restore-from-tape nightmare wake-upper. My tape-based backup programs generally worked as they should, but they involved a lot of manual labor that could have been devoted to other tasks. Over two years as an IT manager, it became clear to me that there had to be a simpler, more reliable way to back up data.

That’s why I’m excited to introduce RagingWire’s new Backup Services, a disk-based, enterprise-grade data backup platform that automates onsite storage and offsite replication for customers inside RagingWire’s world-class data centers. Best of all – no more tapes!

RagingWire’s New Backup Services – Engineered for the Enterprise

We utilize the same best-in-breed enterprise data protection technology for our Backup Services that many large enterprises use for their own data backups: Symantec NetBackup™. The product supports multiple databases and applications as well as both physical and virtual servers in order to automate and standardize your backups onto one unified platform. Secure encryption at the customer device keeps your data safe and sophisticated deduplication technology shortens your server backup windows. RagingWire’s 24x7 Network Operations Center is just a phone call or email away when you need to restore.
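
For readers curious why deduplication shortens backup windows, the sketch below shows the general idea using fixed-size chunks and content hashes: a chunk that has already been stored is referenced rather than written again. This is a simplified illustration under our own assumptions (the function names, chunk size, and sample data are hypothetical) and is not a description of how Symantec NetBackup implements deduplication.

```python
import hashlib


def dedup_chunks(data: bytes, store: dict[str, bytes], chunk_size: int = 4096) -> list[str]:
    """Split data into fixed-size chunks, keep each unique chunk once in `store`,
    and return the ordered list of chunk hashes needed to rebuild the data."""
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only previously unseen chunks use new storage
        recipe.append(digest)
    return recipe


def restore(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild the original data from its chunk hashes."""
    return b"".join(store[digest] for digest in recipe)


# Hypothetical nightly backups in which only the last chunk changes.
store: dict[str, bytes] = {}
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096
monday_recipe = dedup_chunks(monday, store)
tuesday_recipe = dedup_chunks(tuesday, store)
print(len(monday_recipe) + len(tuesday_recipe), "chunks referenced,", len(store), "chunks stored")
```

The less new data a nightly backup has to write, the less time the backup takes, which is the effect described above.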

Onsite Storage. Offsite Replication.

As a company, RagingWire focuses on providing the nation’s best data center colocation services. We back up our 100% Service Level Agreement (SLA) with the most advanced data center infrastructure and we have the happiest customers in the industry. So why is RagingWire introducing Backup Services? Simply put, our customers asked for disk-based backup services located inside both our data center campuses in Ashburn and Sacramento. Options include onsite storage in a customer’s current RagingWire data center and offsite replication to a customer’s remote RagingWire data center. We want to give our customers the assurance that, unlike with a cloud backup, they’ll always know where their data is. It’s located inside a RagingWire data center, and our customers have access to restore 24x7 via our expert Network Operations Center staff.

Are you a current RagingWire customer? Take a minute to check out the Backup Services data sheet or contact your account manager to see how we can help you get rid of your tapes and simplify your data backup strategy.

Curious as to how you can further optimize your data backup strategy? Head over to Jerry Gilreath’s excellent blog post: 5 Ways to Fool-Proof Your Data Backup Strategy.

Where the cloud lives – A look at the evolving distributed computing environment

The idea behind cloud computing has been around for some time. In fact, one of the very first scholarly uses of the term “cloud computing” was in 1997, at the annual INFORMS meeting in Dallas. However, the way the cloud has evolved into what we are capable of using today shows the amount of creativity and technological innovation that is possible. Distributed computing platforms are helping IT professionals conquer distance and deliver massive amounts of information to numerous points all over the world.

This is all made possible by a number of technologies all working together to bring cloud computing to life. Oftentimes, however, there is still some confusion around cloud computing – not so much about how it works, but about where it lives. Many will argue that virtualization gave birth to the cloud. Although server, application, network and other types of virtualization platforms certainly helped shape and mold cloud computing, there are other, very important pieces to the cloud puzzle.

One IT landscape with many clouds in the sky

The very first concept that needs to be understood is that there isn’t one massive cloud out there controlling or helping to facilitate the delivery of your data. Rather, there are numerous interconnected cloud networks out there which may share infrastructure as they pass each other in cyberspace. Still, at the core of any cloud, there are key components which make the technology function well.

  • The data center. If the cloud has a home, it would have to be a data center. Or more specifically a neighborhood of data centers. Data centers house the integral parts that make the cloud work. Without servers, storage, networking, and a solid underlying infrastructure – the cloud would not exist today. Furthermore, new advancements in high-density computing are only helping further progress the power of the cloud. For example, Tilera recently released their 72-core, GX-72 processor. The GX-72 is a 64-bit system-on-chip (SoC) equipped with 72 processing cores, 4 DDR memory controllers and a big-time emphasis on I/O. Now, cloud architects are able to design and build a truly “hyper-connected” environment with an underlying focus on performance.

    Beyond the computing power, the data center itself acts as a beacon for the cloud. It facilitates the resources for the massive amounts of concurrent connections that the cloud requires and it will do so more efficiently over the years. Even now cloud data centers are trying to be more and more efficient. Power Usage Effectiveness (PUE) has been a great metric for many cloud-ready data centers to manage the energy overhead associated with running a data center. More and more data centers are trying to approach the ideal 1.0 rating as they continue to deliver more data and do it more efficiently. With the increase in data utilization and cloud services, there is no doubt that the data center environment will continue to play an absolutely integral part in the evolution of cloud computing.

  • Globalization. The cloud is spreading – and it’s spreading very fast. Even now, data centers all over the world are creating services and options for cloud hosting and development. No longer an isolated front – cloud computing is truly being leveraged on a global scale. Of course, technologies like file-sharing were already a global solution. However, more organizations are becoming capable of deploying their own cloud environment. So, when we say that the cloud lives in a location – we mean exactly that.

    Historically, some parts of the world simply could not host or create a robust cloud environment. Why? Their geographic region could not support the amount of resources that a given cloud deployment may require. Fortunately, this is all changing. At the 2012 Uptime Symposium in Santa Clara, CA we saw an influx of international data center providers all in one room competing for large amounts of new business. The best part was that all of these new (or newly renovated) data centers now all had truly advanced technologies capable of allowing massive cloud instances to traverse the World Wide Web. This is a clear indication that, geographically, the cloud is expanding and that there is a business need for it to continue to do so.

  • Consumerization. One of the absolutely key reasons the cloud is where it is today is the cloud consumer. IT consumerization, BYOD, and the absolute influx of user-based connected devices have generated an enormous amount of data. All of this information now needs to span the Internet and utilize various cloud services. Whether it’s a file-sharing application or a refrigerator that is capable of alerting the owner that it’s low on milk – all of these solutions require the use of the cloud. Every day, we see more devices connect to the cloud. To clarify, when we say devices, we’re not just referring to phones, laptops, or tablets. Now, we have cars, appliances, and even entire houses connecting to cloud services. The evolution of the cloud revolves around the demands created by the end-user. This, in turn, forces the technology community to become even more innovative and progressive when it comes to cloud computing.

    As more global users connect to the Internet with their devices – the drive to grow the cloud will continue. This is why, in a sense, the cloud will eventually live with the end-user. Even now new technologies are being created to allow the end-user to utilize their own "personal cloud." This means every user will have their own cloud personality with files, data, and resources all completely unique to them.

  • Cloud Connectors. As mentioned earlier, there isn’t really just one large cloud out there for all users to access. The many private, public, hybrid, and community clouds out there comprise one massive interconnected cloud grid. In that sense, the evolution of the cloud created an interesting, and very familiar, challenge: a language barrier. For example, as one part of an enterprise grows its cloud presence, another department might also begin a parallel cloud project on a different platform. Well, now there is a need to connect the two clouds together. But what if these two environments are on completely different cloud frameworks? It’s in this sense that we deploy the cloud “Babel Fish” in the form of APIs. These APIs literally act as cloud connectors to help organizations extend, merge, or diversify their cloud solutions.

    It’s not a perfect technology and even now not all cloud platforms can fully integrate with others. However, the APIs are getting better and more capable of supporting large cloud platforms. New technologies like CloudStack or OpenStack help pave the way for the future of cloud connectivity and APIs. Platforms like these work as open-source cloud software solutions which help organizations create, manage, and deploy infrastructure cloud services.
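
The PUE metric mentioned in the data center item above is simple to compute: it is total facility energy divided by the energy delivered to IT equipment, so a value of 1.0 would mean every kilowatt-hour reaches the servers. The short sketch below works through the arithmetic; the monthly readings are hypothetical and used only for illustration.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


def overhead_fraction(pue_value: float) -> float:
    """Share of total energy spent on non-IT overhead such as cooling and power losses."""
    return 1.0 - 1.0 / pue_value


# Hypothetical monthly readings, for illustration only.
reading = pue(total_facility_kwh=1_500_000, it_equipment_kwh=1_000_000)
print(f"PUE = {reading:.2f}, overhead = {overhead_fraction(reading):.0%}")
```

With these sample numbers the facility has a PUE of 1.5, meaning about a third of its energy goes to overhead rather than computing.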

In a cloudy world – bring an umbrella!

Let’s face facts – it’s not always sunny in the cloud. As the technology continues to emerge, IT professionals are still learning some best practices and optimal ways to keep the cloud operational. This isn’t proving altogether easy. There are more users, more infrastructure, and more bad guys taking their aim at various cloud infrastructures.  Just as important as it is to understand how the cloud functions and where it resides – it’s vital to know the "weather forecast" within the cloud computing environment.

  • Attacks. Although this is the darker side of the cloud, it still needs to be analyzed. As more organizations move towards a cloud platform, it’s only logical to assume that these cloud environments will become targets. Even now, attacks against cloud providers are growing. This can be a direct intrusion or a general infrastructure attack. Regardless of the type, all malicious intrusions can have devastating results on a cloud environment. Over the past few months, one of the biggest threats against cloud providers has been the influx of DDoS attacks. A recent annual Arbor Networks survey showed that 77% of the data center administrator respondents experienced more advanced, application-layer attacks. Furthermore, such attacks represented 27% of all attack vectors. The unnerving part here is that the ferocity of these attacks continues to grow, reaching a 100 Gbps spike in 2010.

    Cloud services aren’t always safe either. On February 28, 2013, Evernote saw its first signs of a hack. Passwords, emails and usernames were all accessed. Now, the provider is requiring its nearly 50 million users to reset their passwords. The damage from these types of attacks isn’t always limited to the data. Evernote also had to release a public response, and these are always difficult to do. In the responding blog post, Phil Libin, Evernote’s CEO and founder, said the following: "Individual(s) responsible were able to gain access to Evernote user information, which includes usernames, email addresses associated with Evernote accounts and encrypted passwords." These types of intrusions can serve as learning points to create better and more secure cloud environments.

  • Outages. If you place your cloud infrastructure into a single data center – you should know that your cloud environment can and will go down. No one major cloud provider is safe from some type of disaster or outage. Furthermore, cloud computing services are still an emerging field and many data center and cloud providers are still trying to figure out a way to create a truly resilient environment.

    The most important point to take away here is that a cloud outage can literally happen for almost any reason. For example, a few administrators for a major cloud provider forgot to renew a simple SSL certificate. What happened next is going to be built into future cloud case studies. Not only did this cause an initial failure of the cloud platform, it created a global – cascading – event taking down numerous other cloud-reliant systems. Who was the provider? Microsoft Azure. The very same Azure platform which had $15 billion pumped into its design and build out. Full availability wasn’t restored for 12 hours – and up to 24 hours for many others. About 52 other Microsoft services relying on the Azure platform experienced issues – this includes the Xbox Live network.  This type of outage will create (and hopefully answer) many new types of questions as far as cloud continuity and infrastructure resiliency.
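
An expired certificate is one of the easier outage causes to guard against, because the expiry date is known well in advance. The sketch below shows one possible check using Python's standard `ssl` module. It works on the `notAfter` string format that `ssl.SSLSocket.getpeercert()` returns; the dates and the 30-day warning window are hypothetical choices for this example, not details of the Azure incident.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format found in the dict returned by
    ssl.SSLSocket.getpeercert(), e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def needs_renewal(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """True when the certificate expires within `warn_days` or has already expired."""
    return days_until_expiry(not_after, now) < warn_days


# Hypothetical certificate date and "today", for illustration only.
today = datetime(2013, 3, 1, tzinfo=timezone.utc)
print(days_until_expiry("Mar 15 12:00:00 2013 GMT", today))
print(needs_renewal("Mar 15 12:00:00 2013 GMT", today))
```

Running a check like this on a schedule, and alerting well before the warning window closes, turns a potential cascading outage into a routine maintenance ticket.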

As with any technological platform, careful planning and due diligence have to go into all designs and deployments. It’s evident with the speed of today’s technology usage that the world of cloud computing is only going to continue to expand.

Where the cloud lives - blog article by Bill Kleyman

New conversations around big data further demonstrate a new need for the cloud. The ability to transmit, analyze and quantify massive amounts of data is going to fall onto the cloud’s shoulders. Cloud services will be created to churn massive amounts of information on a distributed plane. Even now platforms are being used around open-source technologies to help control and distribute the data. Projects like Hadoop and the Hadoop Distributed File System (HDFS) are already being deployed by large companies to control their data and make it agile.

With more users, more connection points and much more data – cloud computing lives in a growing and global collection of distributed data centers.  It is critical that cloud developers and participants select their data center platforms carefully with an emphasis on 100% reliability, high-density power, energy-efficient cooling, high-performance networking, and continual scalability. Moving forward, the data center will truly be the main component as to where the cloud resides.

Why You Should Know About NANOG

I had the pleasure to attend last week’s 57th meeting of the North American Network Operators’ Group (NANOG) in Orlando, FL as part of the RagingWire team, which included folks from our data center and network engineering groups. Not only was the weather a balmy 70 degrees, but it was one of the most informative events I have been to in my career thus far. The NANOG meeting covered some of the most important and controversial topics on the cutting edge of internet security, connectivity, and governance going into 2013. Some highlights:

World Conference on International Telecommunications (WCIT) Update.

Put simply, NANOG is a community of network operators who exchange technical and operational information in support of a single goal: to make the internet as connected and resilient as possible, ensuring free flow of information around the world. With this end-state in mind, internet governance is a topic of much conversation and consternation among the NANOG members who attend this session of the meeting.

The main thrust of the December 2012 WCIT was to update and ratify a new, 21st century iteration of the International Telecommunication Regulations (ITRs), which were last ratified in 1988. Sally Wentworth, a public policy manager at the Internet Society, presented a "postmortem" on the effects and way forward from the ITR treaty that was voted on at the conference. Much of the presentation focused on dangers posed by those nations that wish to regulate and/or have the capability to censor, on a nationwide scale, the availability or usefulness of the internet as a weapon to quell popular sentiment or anti-government organization. The presentation was a timely reminder that unfettered, low-cost access to the internet is an ideal that must be protected. Ms. Wentworth also called on the NANOG membership body, as an expert knowledge base, to be a contributor and party to making that ideal a reality both now and in the future.

The Infrastructure and Internet Impacts of Hurricane Sandy.

Two sessions during the NANOG meeting were dedicated to the effects of Hurricane Sandy on the internet and the infrastructure that supports it. On the first day of the meeting, several data center providers discussed their responses and lessons learned from their facilities located in the NJ and NY areas. This presentation really highlighted the importance of data center location from a risk management point of view, but it isn’t just about location. Protecting data center infrastructure is also about the pre-planning that must be in place before a natural disaster occurs: diesel fuel refueling contracts, reliable hotel arrangements (with reliable backup power), work-from-home arrangements, food storage at the facility, and staffing arrangements, to name a few.

The second day featured a session on the impacts of Hurricane Sandy on the internet and posed the question: What happens if we turn off power to one of the key traffic exchange cities? One of the most interesting presentations ensued, demonstrating the interconnectedness and flexibility at the core of the internet as trace routes changed in real time to pass through Ashburn, VA instead of NYC as Sandy made landfall.

Arbor Networks Infrastructure Security 2012 Report

This meeting session focused on a survey by Arbor Networks that explored the landscape of network threats and attacks (multi-vector, DDoS) over the past year. Top issues for network operators included DDoS attacks (trending towards multi-vector sustained attacks), the vulnerability of enterprise data centers even with firewalls, the increased concern over “shared risk” in migrating applications to the cloud, and the inability of mobile service providers to have visibility into their networks in order to detect or combat any kind of attack. Most attack incidents appear to be motivated by ideology, be it politics or revenge. The presentation took a deep dive into the recent, persistent, multi-vector DDoS attacks, nicknamed “Operation Ababil,” that are ongoing against many top Wall Street financial institutions. Overall, it was an eye-opening account of the ever-growing threat to network security, not just on a governmental level, but at the industrial/enterprise level as well.

If you work or play on the internet, you should know about NANOG, the Internet Society, and other groups who discuss topics like the ones presented last week – and who desire to keep the internet accessible and reliable for all of us. Congrats to NANOG members for the community they have built and the collective expertise they have to influence the international internet community.

Data Center Power Availability is More Than a Number

It can be difficult to appreciate the differences between data center power delivery architectures. Every data center provider you talk with has a power story to tell and most of them sound pretty good. The challenge is selecting the power architecture that is right for you.

One way to compare power delivery systems is to look at overall availability. Availability is usually expressed as a percentage of the total time the system is expected to be running. This is where the number of 9’s comes in. You might find availability percentages of three 9’s (99.9%), four 9’s (99.99%), etc.
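To make the 9’s concrete, here is a minimal sketch that converts an availability percentage into the downtime it permits per year. The function name and the use of an average 365.25-day year are assumptions made for illustration; they are not taken from any provider’s SLA.

```python
# Average year length in hours, including leap days (an assumption for this sketch).
HOURS_PER_YEAR = 365.25 * 24


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the hours per year a system may be down at the given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


# Compare a few common availability levels.
for label, pct in [("three 9's", 99.9), ("four 9's", 99.99), ("five 9's", 99.999)]:
    hours = allowed_downtime_hours(pct)
    print(f"{label} ({pct}%): {hours:.2f} hours/year, or {hours * 60:.1f} minutes")
```

Under these assumptions, three 9’s still allows nearly nine hours of downtime a year, while four 9’s allows a little under an hour.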

While a number might make you feel better at night, it won’t necessarily keep the phone from ringing. I suggest you look for three words when evaluating data center power delivery systems: Redundant, Distributed, and Scalable.

Redundant – multiple independent systems

This is the "N" you hear so much about in data centers. Basically, N is the amount of a component that you need to deliver a service. For example, you need one power path from the utility to your server rack. This path could include multiple pieces of equipment including a main switchboard, backup generator, UPS (uninterruptible power supply), and a PDU (power distribution unit). A second independent power path with all of those elements to the same server rack would be 2N. If there is a break in the first path, then the second path takes over. Ask your data center provider how they design for redundancy in all critical systems. The most fault tolerant way to keep your system running is to have multiple N’s. The challenge is that too many N’s can be expensive to acquire and complicated to manage.

Distributed – a resource pool + backup(s)

This is the "+1" in an N+1 design or the "+2" in an N+2 design. When the costs of the redundant architecture become prohibitive, a distributed approach for critical elements of the infrastructure is a great way to improve overall system reliability. You take a device and set it up as a spare for the required pool of devices. Say you need five UPSs to run the pod and you have one additional UPS that can backup any of the five in the pool – that’s N+1. Two spares for the pool means N+2. The critical element of this configuration is the monitoring and management system that must recognize a device failure and automatically switch to the back-up device.
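The N+1 and N+2 arithmetic can be sketched in a few lines. This is a simplified model that assumes every device in the pool is interchangeable and that switchover to a spare always succeeds; the function name is hypothetical.

```python
def pool_survives(required: int, spares: int, failed: int) -> bool:
    """Return True if a pool of N required devices plus spares still has
    at least N working devices after the given number of failures."""
    if required < 0 or spares < 0 or failed < 0:
        raise ValueError("counts must be non-negative")
    installed = required + spares
    working = max(installed - failed, 0)
    return working >= required


# Five UPSs are needed to run the pod, as in the example above.
print(pool_survives(required=5, spares=1, failed=1))  # N+1, one failure
print(pool_survives(required=5, spares=1, failed=2))  # N+1, two failures
print(pool_survives(required=5, spares=2, failed=2))  # N+2, two failures
```

In this model an N+1 pool rides through one failure but not two, and an N+2 pool rides through two. The sketch ignores the monitoring and switching system, which in practice decides whether a spare actually takes over.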

Scalable – engineered for growth and change

Data centers are continually growing and changing all at the same time. To deliver superior service at an affordable price, the data center should be built out based on usage. The shell may be in place day 1, but the power and cooling should be purchased and installed as customers move in. Otherwise you are paying too much, too soon. Also, within the IT cages, servers, storage devices, and network gear are being added, removed, upgraded, and relocated. Your data center needs to be engineered for both growth and change. Power and cooling systems should accept additional devices as capacity requirements grow. Live IT power load should be dynamically shared or moved across the entire facility. All of this must occur without an outage.

At RagingWire, we’ve coined a name for our redundant, distributed, and scalable power delivery architecture: we call it 2N+2. We have two patents on the technology and offer a 100% Availability SLA (service level agreement) with these configurations.

How can we be so confident that with our 2N+2 architecture your power will not go down? We have 2N redundancy on the power paths to your cage or rack and an N+2 distributed design for the critical elements in the power delivery system. Lastly, one of our patented inventions is a unique cross-facility power switching fabric and a massively scalable topology that allows us to move, share, and scale live IT power load throughout the data center without requiring a maintenance outage.

For data centers, the old adage definitely still applies: "You can’t manage what you don’t measure." Availability numbers are a great metric to manage your data center power delivery system. However, when choosing the right data center colocation solution, be sure to look for power delivery systems that are redundant, distributed, and scalable.

Data Center Emergency Preparedness: When too much is not enough.

One of the most important aspects of data center operations is risk management or mitigation. Data center operators typically operate with a proactive mentality in order to properly react to any given situation, which ultimately reduces the risk exposure of the facility. Training, preventive maintenance, and regular system or equipment testing become second nature, as these facilities are expected to (and for the most part do) operate seamlessly 24 hours a day, 365 days a year. However, it’s the once-in-a-while event that tests the true resiliency of the facility and pushes the operations staff to their limits.

An acute level of attention to detail and complete ownership of the facility are common characteristics demonstrated by our operations staff. The team works tirelessly to ensure that they are ready for any given scenario. Emergency preparedness checklists are created, inventories are taken, and procedures are created for most common scenarios for events which carry the greatest potential to take place within the facility; however, we often find that with all of our efforts dedicated to ensuring our preparedness… it’s not enough.

Recent events within the Northeast are bringing to light scenarios which the operations team may not be prepared to handle in order to ensure the continuous uptime of the critical mission. Real world examples are as follows: generator loss of fuel delivery requiring re-priming of the fuel system, emergency redistribution of proprietary electrical feeds at the rack level, unusual roof leaks, flooding, staffing relief plans, and communications challenges. When creating the emergency operating procedures or casualty control procedures, emphasis must be placed on scenarios in which the staff must be able to react on their own because external help is not available.

One very real scenario that we should proactively run drills on is generator loss of prime and how to re-prime engines due to fuel flow issues or based on the need to change the fuel pre-filters. Many of us do not have paralleled pre-filter assemblies, which means that once fuel pressure starts dropping due to extended run time, the fuel pre-filters will have to be changed which increases the chances for a loss of fuel prime. Onsite staff must have the ability to change the fuel filters and must be trained on re-priming the engines particularly when help is not on the way. The only way to do this is to proactively train each member of the operations team through a hands-on approach. Along those same lines, true drills need to be run within the facility and critiques of each drill should be held in order to analyze how the staff performed and continually improve upon existing processes and procedures.

Remember, data center operation and uptime are just as important as those of a continually operating nuclear power plant. Here at RagingWire, we employ many former nuclear power operators from the Navy and the civilian sector, who couldn’t be better examples of critical facilities operators. We each bring to the table a dedicated critical mentality which is seen not only in our data center infrastructure design or operation but also in the way that we work together within the data center community. As we all continue to recover from the devastating conditions experienced over the last several days, RagingWire is providing unwavering support to our colleagues who are continuing to operate in these adverse conditions.

Impact of Hurricane Sandy on the US East Coast

RagingWire’s Northern Virginia data center campus, located in Ashburn, sustained no damage from the hurricane and remained on utility power for the duration of Hurricane Sandy’s assault on the East coast. Our thoughts are with those who continue to recover from the storm and subsequent damage.

When N+1 just isn’t good enough

2006 was a pivotal year for RagingWire. 2006 was the year RagingWire learned that for data centers, N+1 just isn't good enough. 2006 is the year RagingWire went dark. It started normally enough: a beautiful spring day in April. During normal operations, a 4,000-amp breaker failed. Material failures happen, even with the best maintenance programs in place. Our UPSs took the load while the generators started, and then the generators overloaded. The data center went dark.

After bringing the data center back online, we performed a detailed post-mortem review and identified the root causes of the outage to be design flaws and human error. Our management team declared that this could never, ever happen again. We knew that we needed to invest heavily in our people, and that we needed to rethink how data centers operate. We started with investing in our people because human error can overwhelm even the best of infrastructure designs. We focused our recruitment efforts in the nuclear energy industry and the navy nuclear engineering program – both working environments where downtime is not an option and process control, including operations and maintenance, is second nature. We hired a talented team and asked them to design and operate our data center to run like a nuclear sub.

Our revamped team of engineers determined that the then-current N+1 design did not meet their requirements, so they changed it and implemented the concept of a 2N+2 design. Their work was recognized last week as RagingWire announced the issuance of Patent #8,212,401 for “redundant isolation and bypass of critical power equipment.” This is one of two patents that resulted from RagingWire’s outage in 2006 and our efforts to design a system that would never go down again.

RagingWire’s systems are built to a 2N+2 standard. RagingWire exceeds the Uptime Tier IV standard by providing fault tolerance during maintenance. We call this “fix one, break one” or FOBO. This means that any active component (UPS, generator, chiller, pump, fan, switchboard, etc.) can be removed from service for maintenance, any other active component can fail, AND we can experience a utility outage, all without loss of power or cooling to the server rack. Having this extra level of redundancy allows RagingWire to perform more maintenance, and to do so without worrying about a loss in availability. This enables us to provide a 100% uptime SLA, even during maintenance windows.
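The “fix one, break one” idea can be illustrated with a small brute-force check. This is only a sketch under stated assumptions, not RagingWire’s patented design: each component type is modeled as a pool of interchangeable units, the utility outage is modeled by requiring the generator pool to carry the full load at all times, and all names and unit counts below are hypothetical.

```python
from itertools import permutations


def tolerates_fix_one_break_one(pools: dict[str, tuple[int, int]]) -> bool:
    """Check every combination of one unit in maintenance plus one failed unit.

    pools maps a component type to (required, installed) unit counts.
    Returns True only if every pool keeps at least `required` units in
    service in every combination.
    """
    # Give every installed unit an identity, e.g. ("ups", 0), ("ups", 1), ...
    units = [(name, i) for name, (_, installed) in pools.items() for i in range(installed)]
    for in_maintenance, failed in permutations(units, 2):
        # Count how many units each pool has lost in this combination.
        out_of_service: dict[str, int] = {}
        for name, _ in (in_maintenance, failed):
            out_of_service[name] = out_of_service.get(name, 0) + 1
        for name, (required, installed) in pools.items():
            if installed - out_of_service.get(name, 0) < required:
                return False
    return True


# Hypothetical unit counts: the same load served by N+2 pools and by N+1 pools.
n_plus_2 = {"ups": (5, 7), "generator": (4, 6), "chiller": (3, 5)}
n_plus_1 = {"ups": (5, 6), "generator": (4, 5), "chiller": (3, 4)}

print(tolerates_fix_one_break_one(n_plus_2))
print(tolerates_fix_one_break_one(n_plus_1))
```

In this model the N+2 pools pass and the N+1 pools do not, because taking one unit out for maintenance uses up the only spare and leaves nothing to cover a failure in the same pool.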

Looking at the last year and a half, it’s clear that many data centers are still providing their customers an inferior N+1 design. How do you know? Simply look at the number of providers below who have suffered data center outages over the past 18 months. Since 2006, RagingWire has had 100% availability of its power and cooling infrastructure due to its superior 2N+2 design. If your current provider is still offering N+1, maybe it’s time to ask yourself if N+1 is still good enough for you.

October 22, 2012 – Amazon Web Services suffered an outage in one of its data centers that took down multiple customers in its US-East-1 region. The problem was attributed to a “small number” of storage volumes that were degraded or failed.

October 8, 2012 – A cable cut took down Alaska Airlines’ ticketing and reservation system, causing delays across the airlines’ operations and preventing customers from checking in for flights.

August 7, 2012 – A fiber cut takes nonprofit Wikipedia offline for an hour.

July 28, 2012 – Hosting.com powered off 1,100 customers due to human error during preventative maintenance on a UPS in their Newark, DE data center.

July 10, 2012 – Level3 East London data center offline for 5 hours after a UPS bus-bar failed.

July 10, 2012 – Salesforce.com suffers worldwide outage after a power failure in one of Equinix’ Silicon Valley data centers.

June 29, 2012 – Amazon Web Services suffers a power outage in its Northern Virginia data center. Multiple generators failed to start automatically due to synchronization issues, and had to be started manually.

June 14, 2012 – Amazon Web Services suffers a power outage in its Northern Virginia data center. The problem was blamed on a defective generator cooling fan and a mis-configured power breaker.

June 13, 2012 – US Airways had a nationwide disruption of their computer system, affecting reservations, check-in and flight status due to a power outage at their AT&T data center in Phoenix.

January 20, 2012 – A power failure in Equinix SV4 data center took several customers including Zoho offline.

October 10, 2011 – Research in Motion cut Blackberry service to most of Europe for 6 hours due to a power failure in their Slough, UK data center. The outage caused service disruptions for 3 days worldwide.

August 11, 2011 – An automatic transfer switch failed at Colo4 in Dallas, TX, resulting in a 6-hour power outage.

August 7, 2011 – Amazon Web Services Dublin, Ireland data center lost power due to a generator phase-synchronization error, disrupting service to the EU West region.
