Learn how to configure Kubernetes storage classes for provisioning and managing storage resources in Kubernetes clusters.

Kubernetes Storage Class

Kubernetes is the de-facto standard for container orchestration, and Kubernetes storage is a key component of modern application infrastructure.

Kubernetes storage classes enable the provisioning and management of storage resources in Kubernetes clusters. A storage class defines the type of storage, the storage provisioner responsible for managing it, and any additional parameters the provisioner needs, such as settings that control performance, latency, or durability. The StorageClass resource is the entry point for creating storage classes in a Kubernetes cluster.

This article will explore the key concepts of Kubernetes storage classes, including their types, how they relate to persistent volume claims and persistent volumes, and how to provide storage resources to applications in a Kubernetes cluster. We will also discuss Kubernetes storage class best practices that can help administrators more effectively manage storage resources.

Summary of key Kubernetes storage class concepts

The table below summarizes the key Kubernetes storage class concepts this article will explore in more detail.

Concept | Description
Storage provisioners | A plugin that provisions storage resources for a Kubernetes cluster
Persistent volume | A piece of durable storage in a Kubernetes cluster
Persistent volume claim | A request for and claim to a PersistentVolume resource
Reclaim policies | Policies that define what should happen to the persistent volume when it is no longer needed
Default storage class | The storage class used when no other storage class is specified

Kubernetes storage class overview

Understanding how the individual concepts connect is key to understanding Kubernetes storage class fundamentals.

A storage class names the storage provisioner that creates volumes, while a persistent volume claim (PVC) must be bound to a persistent volume (PV) before it can be used. A PVC that does not name a storage class uses the cluster default storage class; if multiple storage classes are defined, the PVC can specify which one to use. To better understand how they all work together, the illustration below explains the lifecycle and dependencies between each of the above from the perspective of both a developer and an administrator of a Kubernetes cluster.

An overview of the different Kubernetes storage concepts. (Source)

As we can see, a cluster administrator can take action and define the available storage classes for a Kubernetes cluster. These classes will be consumed by application developers such that actual storage can be provisioned and mounted to the filesystems of individual applications. Below, we go through the key concepts for Kubernetes storage classes.

Storage Provisioners

Storage provisioners are tools that provision storage resources for containers and applications running in a Kubernetes cluster. For example, suppose an application needs a storage volume to store its data. In that case, the provisioner can automatically allocate storage space from a cloud storage provider, network-attached storage (NAS) system, or a local disk on the host machine. AWS EBS, Azure Files, and GCE PD are all among the most popular Kubernetes storage provisioners.

The configuration below is an example configuration of a Kubernetes storage class manifest, which creates a StorageClass Kubernetes object and makes use of the AWS EBS storage provisioner:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4 


Note the value of the provisioner key. It identifies the specific provisioner you would like to use. Some options include:

Volume Plugin | Provisioner
AWS Elastic Block Store | kubernetes.io/aws-ebs
GCE Persistent Disk | kubernetes.io/gce-pd
Azure Files | kubernetes.io/azure-file
External NFS Server | example.com/external-nfs

The manifest above defines a StorageClass named gp2 that uses the kubernetes.io/aws-ebs provisioner. The configuration in the parameters block depends on the specific provisioner and the options it offers. Different options translate to different characteristics of the underlying storage, such as durability, input/output operations per second (IOPS), and throughput. The right choice depends on the requirements of a specific use case.
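To illustrate how the parameters block changes storage characteristics, here is a minimal sketch of a second class for the same provisioner. The class name fast-io1 and the iopsPerGB value are assumptions chosen for illustration; the parameter names follow the options documented for the in-tree kubernetes.io/aws-ebs provisioner.

```yaml
# Hypothetical higher-performance class; name and values are illustrative only.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast-io1            # assumed name, not from the article
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1                 # provisioned-IOPS volume type
  iopsPerGB: "10"           # only meaningful for io1; must be a quoted string
  fsType: ext4
```

Compared with the gp2 class, only the parameters differ, so applications can switch between the two simply by naming a different class.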

Note: When storage is created in an external cloud environment, such as with AWS EBS, the nodes hosting the Kubernetes cluster must have the necessary IAM permissions to attach and mount the EBS volumes. Other cloud providers have equivalent requirements.

Persistent Volumes

Persistent volumes (PVs) are the actual storage volumes that can be accessed by applications running in a Kubernetes cluster. A storage administrator can provision them and then bind them to the applications that need them.

For example, an application running in a Kubernetes cluster might need a 10 GB volume to store its data. A storage administrator can create a 10 GB PV and bind it to the application using a PVC.
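As a rough sketch of the 10 GB example, a statically provisioned PV could look like the manifest below. The names app-data-pv and manual, and the hostPath location, are assumptions for illustration; hostPath is only suitable for single-node testing, and a real cluster would use a network or cloud volume source instead.

```yaml
# Hypothetical statically provisioned volume; all names are illustrative.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-data-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual   # assumed class name used to match a claim
  hostPath:
    path: /mnt/data          # assumed path on the node; testing only
```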

Persistent Volume Claims

While a persistent volume represents the actual volume, a persistent volume claim (PVC) is a request for that volume. PVCs are referenced by pods that require storage for their applications. In other words, a PVC is an additional abstraction layer over storage: a pod cannot mount a PV directly and must go through a PVC.

When a PVC is requested, two of the most common options to include in the request are access mode and the requested capacity for the PVC. Access modes in a PVC specify how the claim can be accessed:

  • ReadWriteOnce (RWO): The volume can be mounted read-write by a single node at a time (multiple pods on that node can share it)
  • ReadOnlyMany (ROX): The volume can be mounted read-only by many nodes simultaneously
  • ReadWriteMany (RWX): The volume can be mounted read-write by many nodes simultaneously
  • ReadWriteOncePod (RWOP): The volume can be mounted read-write by one specific pod only

A full example of the available options for PVC configuration can be seen in the official Kubernetes documentation.
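A minimal sketch of a claim and a pod that consumes it is shown below. The names app-data-claim and app, the nginx image, and the mount path are assumptions for illustration; the claim requests 10Gi with the ReadWriteOnce access mode from the gp2 storage class defined earlier.

```yaml
# Hypothetical claim; name and size are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  resources:
    requests:
      storage: 10Gi
---
# Hypothetical pod that mounts the claim; the pod references the PVC, not the PV.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data-claim
```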

Reclaim Policies

Reclaim policies determine what happens to a persistent volume when the claim bound to it is deleted, for example because the application has been removed or no longer requires the storage. The available options are Retain, Delete, and Recycle. When the claim is deleted, the following happens to the underlying PV storage:

  • Retain: This policy keeps the Persistent Volume and its data so that an administrator can reclaim it manually.
  • Delete: This policy deletes the Persistent Volume along with the associated storage asset in the external infrastructure.
  • Recycle: A deprecated option that is no longer suggested for production usage. It runs a basic scrub (rm -rf) on the contents of the Persistent Volume, making it available to be used for another PVC.
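For dynamically provisioned volumes, the policy is set on the storage class through its reclaimPolicy field, and volumes created from the class inherit it (Delete is used when the field is omitted). The sketch below assumes a hypothetical class name, gp2-retain.

```yaml
# Hypothetical class whose volumes are kept after their claim is deleted.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-retain          # assumed name, not from the article
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain       # Delete is the default when this field is omitted
```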

Default Storage Class

The default storage class is the storage class used for persistent volume claims (PVCs) that do not explicitly request a specific storage class in their manifest.


Depending on how the Kubernetes cluster was installed, a default storage class may already be defined. Otherwise, it is up to the cluster administrator to define one. To find out if there is a default storage class in a Kubernetes cluster, run kubectl get storageclass. If the output includes an entry with (default) after the name, a default storage class already exists, like in the example below.

NAME                 PROVISIONER               AGE
standard (default)   kubernetes.io/gce-pd      1d
gold                 kubernetes.io/gce-pd      1d

Generally, defining a default storage class simplifies the process of creating PVCs by eliminating the need to specify a storage class for each PVC. It also provides a consistent way to provision storage across the cluster, ensuring that the same type of storage is used by default for all PVCs. Even if a default storage class is defined in the cluster, users can still explicitly specify a different storage class in their PVC manifest.
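A storage class is marked as the default with the storageclass.kubernetes.io/is-default-class annotation. The sketch below assumes the standard class from the output above and a pd-standard disk type; the claim name is hypothetical. The claim omits storageClassName, so it would be served by the default class.

```yaml
# Marks the "standard" class as the cluster default.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard         # assumed disk type for illustration
---
# Hypothetical claim with no storageClassName; it falls back to the default class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: uses-default-class
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```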

Four essential Kubernetes storage class best practices

The sections below cover four Kubernetes storage class best practices that can help administrators better manage their Kubernetes infrastructure.

Kubernetes storage class best practice #1: Monitor storage usage

By monitoring storage usage, you can determine your applications' required storage and plan for future growth or expansion. Monitoring storage usage can also help you optimize resource allocation by identifying underutilized storage and reclaiming it for other workloads. It can also help you avoid application outages caused by running out of storage capacity. By tracking usage trends, you can proactively prevent capacity-related issues before they occur.

Implementing a proper monitoring solution for Kubernetes can be time-consuming and complex. A custom solution essentially means running another application with a frontend, metrics storage, retention, and high availability, for which computing and storage resources must be allocated. Additionally, administrators must define and configure access policies.
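As one possible shape of a custom approach, the sketch below is a Prometheus alerting rule that fires when a volume is more than 80% full. It assumes a Prometheus installation that already scrapes the kubelet volume metrics; the article does not prescribe this tool, and the group name, threshold, and duration are illustrative choices.

```yaml
# Hypothetical Prometheus rule file; assumes kubelet metrics are being scraped.
groups:
  - name: storage-usage
    rules:
      - alert: PersistentVolumeFillingUp
        # Ratio of used bytes to capacity, reported per PVC by the kubelet.
        expr: kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "PVC {{ $labels.persistentvolumeclaim }} in namespace {{ $labels.namespace }} is over 80% full"
```

Even with a rule like this in place, the frontend, retention, and alert routing still have to be built and maintained separately.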

Custom reporting and alerting via channels such as Slack or email are some of the most common requirements for a Kubernetes storage monitoring solution. Kubecost Cloud packages all of the aforementioned features and many more in an integrated SaaS application accessible from the web. The Kubecost sandbox monitoring environment is a good entry point for getting a feel for the tool. It allows you to explore all the available features as if you had a real Kubernetes cluster in place, giving a clear breakdown of costs and multiple configuration options, including the ability to create custom reports and implement alerts on predefined thresholds for different objects and usage in the Kubernetes cluster.

A Kubecost cost dashboard. (Source)

Kubernetes storage class best practice #2: Use multiple storage classes to optimize costs

Kubernetes allows multiple storage classes to be defined and used to provide different storage options for different applications and workloads. This introduces excellent flexibility because with multiple storage classes, you can use faster storage with low latency for your database, but slower storage with higher capacity for your backups.
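The sketch below shows how two workloads might select different classes. It assumes that storage classes named fast-ssd and slow-hdd have already been created by an administrator; those names, the claim names, and the sizes are hypothetical.

```yaml
# Hypothetical claim for a database that needs low-latency storage.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd   # assumed class backed by faster storage
  resources:
    requests:
      storage: 50Gi
---
# Hypothetical claim for backups that favors capacity over speed.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: slow-hdd   # assumed class backed by cheaper storage
  resources:
    requests:
      storage: 500Gi
```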


Having multiple storage classes can additionally help with cost optimizations. For example, you may use lower-cost, slower storage for development or testing environments but higher-cost, faster storage for production environments. This aligns well with monitoring storage usage across the Kubernetes cluster. As mentioned in the previous section, Kubecost Cloud makes it relatively easy to start monitoring usage for storage. For example, administrators can use the assets dashboard to start building reports with dynamic date ranges, aggregations, or custom Kubernetes labels you can assign to your resources.

A Kubecost assets dashboard. (Source)

Finally, high availability is another potential benefit of having multiple storage classes. They allow you to store data across different storage backends, availability zones, or even multiple cloud providers, which makes it easier to build redundancy into your storage layer.

However, having multiple storage classes adds complexity and can make clusters more challenging to manage. With strategic implementation, this risk can be mitigated: proper change management, technical documentation, and integration testing are all good steps to ensure system robustness.

Kubernetes storage class best practice #3: Implement secure access control policies

Generally, access control policies in Kubernetes help enforce security by controlling access to resources and ensuring that only authorized users or services can perform specific actions.

Administrators can use Kubernetes RBAC to implement granular access control. For example, a Kubernetes administrator might define a "storage-admin" role that has full access to all storage resources, a "developer" role that has read and write access to certain storage resources, and a "viewer" role that has read-only access to some resources. This can be extended in the following hypothetical security model:

  • Allow "storage-admin" users or services to create or modify a "high-performance" storage class.
  • Allow "developer" users or services to create or modify the "low-performance" storage class, and limit "viewer" users or services to reading it.
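A minimal RBAC sketch of the admin and viewer ends of this model is shown below. StorageClass objects are cluster-scoped, so ClusterRole and ClusterRoleBinding are used. The role names, the binding name, and the storage-admins group are assumptions for illustration.

```yaml
# Hypothetical role with full control over storage classes.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: storage-admin
rules:
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Hypothetical read-only role for storage classes.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: storage-viewer
rules:
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
---
# Grants the admin role to an assumed group named storage-admins.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: storage-admin-binding
subjects:
  - kind: Group
    name: storage-admins
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: storage-admin
  apiGroup: rbac.authorization.k8s.io
```

Note that RBAC resourceNames cannot restrict create requests, so limiting who may create a class with a particular name would need an additional mechanism such as an admission policy.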

These policies build a strong safety net that helps prevent security breaches and data leaks. Access control policies can also help organizations meet regulatory compliance requirements by enforcing security controls, restricting access to sensitive data or resources, and enforcing governance policies by limiting access to certain resources or enforcing specific workflows.

It is also worth adding that access control policies can require additional administrative overhead, such as managing user and service accounts, defining roles and permissions, and monitoring policy compliance. In summary, access control policies in Kubernetes are generally a good practice for ensuring security, compliance, and governance. It is important to carefully design and implement access control policies that balance security requirements with user needs and organizational goals.

Kubernetes storage class best practice #4: Use reclaim policies to increase efficiency

Reclaim policies in Kubernetes determine what should happen to the storage resources associated with a PV when the associated PVC is deleted.

Reclaim policies can help optimize storage resources by freeing up unused resources when a PVC is deleted. This can reduce costs and prevent resource waste. They can also support data security by removing the underlying volume when a PVC is deleted, so that sensitive data is not left behind on the storage device. Finally, reclaim policies improve operational efficiency because the cleanup of unused storage resources is automated, nearly eliminating manual intervention.

Reclaim policies must be carefully designed and tested, because a wrongly chosen policy can cause data loss, which can be catastrophic for an organization. In summary, reclaim policies are generally a good practice for optimizing resource utilization, improving data security, and automating operations.

Conclusion

Overall, understanding Kubernetes storage classes is essential for any Kubernetes user or administrator looking to deploy and manage applications on a Kubernetes cluster. By leveraging storage classes and different storage provisioners, users can simplify managing storage resources in Kubernetes, allowing them to focus on building and scaling their applications.

Administrators can optimize for storage cost across applications deployed in a Kubernetes cluster. Monitoring storage usage with managed monitoring software like Kubecost Cloud can help visualize costs and resource allocations in a cluster and alert if predefined thresholds are breached. With the proper configuration and management, Kubernetes storage classes can provide a scalable, efficient, and cost-effective way to manage storage resources for applications in Kubernetes clusters.
