Five resilience, availability, and data protection principles for Kubernetes

Published: Thursday, 14 January 2021 09:38

Kubernetes use is growing rapidly, a trend which is expected to accelerate. In this article, Gijsbert Janssen van Doorn looks at some of the related resilience, availability, and data protection issues and provides some helpful tips.

In recent years, the technologies used for running workloads have developed rapidly, and the popularity of containers is of particular note, given their lightweight, modular approach. Indeed, research suggests that over the next two years the adoption of containers will continue to increase, and that they are “expected to become the top choice for production deployment”. In fact, more than two-thirds of respondents are already in production with container environments, according to the same survey.

The success of the container market comes down to a range of factors, not least that containers offer a robust and complete ecosystem of platform software, from monitoring, storage and networking to configuration management and data protection.

But looking more closely, when it comes to security and data protection in particular, organisations often find Kubernetes difficult to control because legacy tools and processes simply don’t meet its requirements as a cloud-native platform. This requires IT teams to consider a number of issues in order to maximise data protection around Kubernetes. Five of the most important are:

Balancing availability and resilience with development speed

Being able to protect and recover containers without adding more steps, tools, and policies to the DevOps process is a vital part of a holistic approach to successful Kubernetes deployment. But balancing availability and resilience against the need to ensure effective development speed across enterprise applications and services is difficult to achieve without the flexibility of a native platform designed to work with container platforms such as Kubernetes. 

For instance, any application downtime or data loss can present major issues, especially for containerised applications. But employing native data protection technologies that promote ‘data protection as code’ means that data protection and disaster recovery operations are integrated into the application development lifecycle from the beginning. In doing so, application resilience can be assured without any negative impact on the speed, scale and agility of containerised applications.
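
One way to picture ‘data protection as code’ is a pipeline check that refuses to ship a stateful workload unless its manifest declares a protection policy. The sketch below is illustrative only: the annotation key `example.com/protection-policy`, the policy name and the workload names are hypothetical and do not belong to any particular product.

```python
# Hypothetical annotation key, used here only for illustration.
POLICY_ANNOTATION = "example.com/protection-policy"

# Workload kinds that this sketch treats as needing a protection policy.
STATEFUL_KINDS = {"StatefulSet"}


def unprotected_workloads(manifests):
    """Return names of stateful workloads that declare no protection policy."""
    missing = []
    for manifest in manifests:
        if manifest.get("kind") not in STATEFUL_KINDS:
            continue
        metadata = manifest.get("metadata", {})
        annotations = metadata.get("annotations") or {}
        if not annotations.get(POLICY_ANNOTATION):
            missing.append(metadata.get("name", "<unnamed>"))
    return missing


# Example manifests, reduced to the fields the check reads.
manifests = [
    {"kind": "StatefulSet",
     "metadata": {"name": "orders-db",
                  "annotations": {POLICY_ANNOTATION: "gold"}}},
    {"kind": "StatefulSet", "metadata": {"name": "sessions-cache"}},
    {"kind": "Deployment", "metadata": {"name": "web"}},
]

print(unprotected_workloads(manifests))  # ['sessions-cache']
```

Run as a build step, a check of this kind keeps the protection policy in the same repository and review process as the application itself, rather than adding a separate tool or manual step.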

Continuous Data Protection (CDP) technology adds to the strategy by offering users the ability to easily ‘rewind’ to a previous checkpoint, ensuring a low recovery point objective (RPO). This is not only minimally disruptive, but also delivers greater flexibility and availability than traditional backup, where snapshots can be many hours behind production systems, leaving gaps in data protection.
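
The difference in RPO comes down to simple arithmetic: the data at risk is whatever was written between the latest recovery point and the failure. The sketch below compares the two approaches under assumed intervals (12-hourly snapshots versus checkpoints every five seconds); these figures and the timestamps are illustrative, not measurements of any product.

```python
from datetime import datetime, timedelta


def data_loss_window(recovery_points, failure_time):
    """Return the time between the failure and the latest usable recovery point."""
    usable = [p for p in recovery_points if p <= failure_time]
    if not usable:
        raise ValueError("no recovery point exists before the failure")
    return failure_time - max(usable)


def periodic_points(start, interval, count):
    """Generate evenly spaced recovery points (snapshots or checkpoints)."""
    return [start + i * interval for i in range(count)]


start = datetime(2021, 1, 14, 0, 0, 0)
failure = datetime(2021, 1, 14, 11, 59, 57)

# Illustrative intervals only: 12-hourly snapshots versus 5-second checkpoints.
snapshots = periodic_points(start, timedelta(hours=12), 2)
checkpoints = periodic_points(start, timedelta(seconds=5), 10_000)

print(data_loss_window(snapshots, failure))    # 11:59:57
print(data_loss_window(checkpoints, failure))  # 0:00:02
```

With these assumed intervals, a failure just before the next snapshot loses almost twelve hours of changes, while the checkpoint-based approach loses two seconds.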

Pipeline protection

Even though container images act as permanent layers of the installation and configuration process, it’s also a good idea to protect the technology producing the images, or the ‘pipeline’, as it’s known. This includes components such as configuration scripts (for example, Dockerfiles and Kubernetes YAML files) and documentation.

The problem that often arises, however, is that the data protection requirements for the systems that create the containers, such as build servers and the code and artifact repositories that store containers and application releases, are overlooked. But these are important stages in the Continuous Integration and Continuous Delivery (CI/CD) pipeline, and protecting them ensures that most of the pipeline that produces container images is protected too.

Protecting persistent application data

As container development has accelerated, technologies such as Kubernetes now support a diverse variety of workloads, including stateful applications. For instance, even though containers are transitory, and any file system changes are lost after the running container is deleted, users now have various options for adding stateful, persistent storage to a container.
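
In Kubernetes, one common option is a PersistentVolumeClaim, which requests storage that outlives any individual container. The sketch below builds such a manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The claim name, size and the helper function are assumptions made for illustration.

```python
import json


def persistent_volume_claim(name, size, storage_class=None):
    """Build a PersistentVolumeClaim manifest as a plain dictionary."""
    spec = {
        # ReadWriteOnce: the volume can be mounted read-write by a single node.
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Omitting this field lets the cluster use its default storage class.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


# Hypothetical claim name and size, chosen only for the example.
claim = persistent_volume_claim("orders-db-data", "10Gi")
print(json.dumps(claim, indent=2))
```

A pod then refers to the claim by name in its volume definition, so the data on the volume survives when the container is deleted and recreated. It is this data that the protection strategy has to cover.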

Data protection strategy - and the choice of platform - must operate with these capabilities front of mind. This remains true even for existing on-premises enterprise storage arrays that can often provide stateful storage to Kubernetes clusters. 

Accessing and managing cloud services

Using cloud services for object or file storage offers a solution that’s both easy and quick to implement and manage. The problem is that these services exist outside the control of the people in charge of data protection, so this ‘invisible’ persistent storage can leave data unprotected and insecure, with no backup, disaster recovery or application mobility.

The answer lies in delivering a consistent approach to accessing and managing cloud storage so developers can use the services they need while their colleagues can maintain oversight, security, and overall responsibility for data protection.

Avoiding vendor lock-in

Another essential underlying component in building a data protection strategy around Kubernetes is the need to avoid vendor lock-in. Data protection should support each enterprise Kubernetes platform and move data where the application needs it without any lock-in to a specific storage platform or cloud vendor.

By prioritising data protection without compromising the freedom Kubernetes gives developers, teams can more easily protect, recover, and move applications to optimise intelligent data management and accelerated software development. In an environment where software development and data protection should always work hand-in-hand, this offers a win-win for technology specialists and the wider business alike.

The author

Gijsbert Janssen van Doorn is Director Technical Marketing for Zerto.