Rook v1.6: Storage Enhancements

Sébastien Han · Published in Rook Blog · Apr 16, 2021 · 5 min read

Rook v1.6 is another feature-filled release that improves storage for Kubernetes. The community has been a great support on this journey, helping validate that Rook meets and exceeds the requirements for running in production. The statistics continue to show Rook community growth since the v1.5 release in November:

  • 8K to 8.5K GitHub stars
  • 183M to 216M Container downloads
  • 5K to 5.7K Twitter followers
  • 3.8K to 4.2K Slack members

With the v1.6 release, we have plenty of new features, primarily for the Ceph storage provider, that we hope you’ll be excited about!

Virtual KubeCon EU

We are looking forward to Virtual KubeCon EU, May 4–7th. Tune in to the Rook sessions!

For questions or discussions during the conference, we will plan on being active in the Rook Slack. We hope to “see” you there!

Ceph

New Ceph Pacific support

After a year, the wait for the next major stable Ceph release is over, and Rook is ready to consume it. Some of the highlights from this release include:

  • QoS between client I/O and background operations
  • Dual stack networking support
  • Filesystem mirroring
  • Stable support for multiple file systems in a single Ceph cluster

For more details, refer to the Pacific release announcement.
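
For example, pointing an existing cluster at a Pacific image is a matter of updating the cephVersion field of the CephCluster CR. The image tag below is only illustrative; pick the current v16.2.x release for your environment:

    apiVersion: ceph.rook.io/v1
    kind: CephCluster
    metadata:
      name: rook-ceph
      namespace: rook-ceph
    spec:
      cephVersion:
        # Illustrative Pacific tag; use the latest v16.2.x available
        image: ceph/ceph:v16.2.0
        allowUnsupported: false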

Networking

A few releases back we announced initial support for Multus CNI (Container Network Interface), which allows attaching multiple network interfaces to pods in Kubernetes. After keeping it experimental for some time, we are proud to announce that Multus support is now considered stable. The effort spans multiple components (Rook, Ceph, and Ceph-CSI).

On a related networking note, dual-stack is now supported. This means that Ceph daemons will listen and respond on both the IPv4 and IPv6 stacks. Single-stack configurations are also now properly enforced.
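
As a rough sketch, Multus is selected through the network section of the CephCluster CR, with selectors pointing at NetworkAttachmentDefinitions created beforehand. The selector names below are placeholders, and the dual-stack fields are shown as we read the v1.6 spec, so double-check them against the documentation for your version:

    spec:
      network:
        provider: multus
        selectors:
          # NetworkAttachmentDefinition names are examples, not defaults
          public: public-net
          cluster: cluster-net
        # Alternatively, for a dual-stack cluster (assumed v1.6 field names):
        # ipFamily: IPv6
        # dualStack: true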

Encryption

Rook v1.5 brought the initial integration of a Key Management System with HashiCorp Vault, encrypting Ceph OSDs (storage daemons) with support for the KV version 2 secret engine. Now the Ceph Object Gateway (RGW) has received the same level of support: it can store encrypted objects while keeping its encryption keys in Vault. The Object Gateway encryption supports both the KV and Transit Vault secret engines.
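
As a minimal sketch, the Vault connection is described under security.kms in the CephCluster CR and pairs with encrypted OSDs (encrypted: true on a storageClassDeviceSet). The address, backend path, and secret name below are placeholders:

    spec:
      security:
        kms:
          connectionDetails:
            KMS_PROVIDER: vault
            # Placeholder Vault address and KV backend path
            VAULT_ADDR: https://vault.default.svc:8200
            VAULT_BACKEND_PATH: rook
            VAULT_SECRET_ENGINE: kv
          # Kubernetes Secret holding the Vault token (name is an example)
          tokenSecretName: rook-vault-token

The Ceph Object Gateway side is configured through a similar security section on the object store CR, as we read the v1.6 API.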

OSD Improvements

First, we are excited that LVM is no longer required when creating new OSDs. This change has been a long time coming: removing the LVM layer simplifies OSD maintenance. Rook now consumes the raw block device directly when deploying new OSDs. There is no more LVM. Hooray!
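
For illustration, a storage section that hands raw devices directly to new OSDs might look like the following; the device names are examples, and existing LVM-based OSDs continue to run as before:

    spec:
      storage:
        useAllNodes: true
        useAllDevices: false
        devices:
          # Raw block devices consumed directly, with no LVM layer in between
          - name: "sdb"
          - name: "nvme0n1"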

Second, OSDs can now be updated much more efficiently by failure domain in large clusters (Pacific only). In clusters with large numbers of OSDs, it previously could take a very long time to update all of the OSDs, whether for Rook or Ceph major or minor updates. To better support large clusters, Rook will now update multiple OSDs in parallel.

Finally, Drive Groups were added in Rook v1.4 as another method for specifying how to configure OSDs. We had hoped this would make integration with the Ceph orchestrator framework easier for Rook. In practice, we found that it merely added complexity that stretched our development and maintenance efforts thin. Thus we have decided to remove Drive Groups in Rook v1.6.

Multiple Mgr Daemons

Until now, only a single mgr daemon could be deployed by Rook. For scenarios where mgr downtime is more critical, a second mgr can now be started with a simple setting in the cluster CR. This is particularly recommended in stretched deployments where the cluster spans two data centers. If a ceph-mgr goes down, Ceph and Rook will automatically update the active mgr and the corresponding services for improved availability during partial cluster downtime.
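
The setting is a simple count in the mgr section of the cluster CR, shown here as a minimal sketch:

    spec:
      mgr:
        # Run two mgr daemons; Ceph keeps one active and one on standby
        count: 2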

Pod Disruption Budgets (PDBs)

Ensuring cluster availability while updating underlying infrastructure is an important property of a storage system. In v1.5 and v1.6, we have made significant improvements to the handling of OSD and Mon PDBs. By default, the PDBs will now be enabled as another way to ensure your data availability remains high.
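
A minimal sketch of the relevant cluster CR settings, shown with values we understand to be the defaults:

    spec:
      disruptionManagement:
        # Let Rook create and manage PodDisruptionBudgets for mons and OSDs
        managePodBudgets: true
        # Minutes to wait before draining the next failure domain (illustrative value)
        osdMaintenanceTimeout: 30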

Ceph Filesystem Mirroring (experimental)

CephFS supports asynchronous replication of snapshots to a remote CephFS file system via the cephfs-mirror tool. Snapshots are synchronized by mirroring the snapshot data and then creating a snapshot with the same name (for a given directory on the remote file system) as the snapshot being synchronized.

A “CephFilesystemMirror” CRD is introduced to bootstrap the new cephfs-mirror daemon. Similar to rbd-mirror, peers can be added and configured for mirroring targets. Also, filesystems have a new property that enables mirroring.
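
A rough sketch of bootstrapping the mirror daemon and enabling mirroring on a file system follows; the names are placeholders and, since the feature is experimental, the fields may still change:

    apiVersion: ceph.rook.io/v1
    kind: CephFilesystemMirror
    metadata:
      # Placeholder name for the cephfs-mirror daemon deployment
      name: my-fs-mirror
      namespace: rook-ceph
    spec: {}
    ---
    apiVersion: ceph.rook.io/v1
    kind: CephFilesystem
    metadata:
      name: myfs
      namespace: rook-ceph
    spec:
      metadataServer:
        activeCount: 1
      metadataPool:
        replicated:
          size: 3
      dataPools:
        - replicated:
            size: 3
      # New property that turns on mirroring for this file system
      mirroring:
        enabled: true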

Ceph-CSI

Rook v1.6 ships with the latest Ceph-CSI v3.3.0 driver configured. Some of the highlights include:

  • Asynchronous Disaster Recovery: a new volume replication controller to manage replication of volumes
  • Encryption: AWS Key Management Service support added for Ceph-CSI volume encryption
  • Multus support

More CI (continuous integration) goodness

We continue to invest in our test infrastructure to ensure each release is of high quality. The transition to fully adopting GitHub Actions continues to increase our test coverage for Rook and Ceph across supported K8s versions. With GitHub Actions, we have reduced our CI running time and made the tests more stable and responsive. The team believes that contributions, reviews, and merges have improved significantly. We hope to complete our transition to GitHub Actions and shut down our Jenkins instance during the v1.7 timeframe.

Rook now builds on Golang 1.16 and supports both 1.15 and 1.16. Golang 1.16 comes with some nifty features, such as the ability to embed resources, which we look forward to using.

Storage Providers Removed

Rook storage providers are driven by community involvement. Each one requires specialized knowledge to develop and maintain. For storage providers that do not have sufficient community support, we have come to the difficult decision to remove them from the project. Three storage providers are now removed from Rook in v1.6:

  • EdgeFS
  • CockroachDB
  • YugabyteDB

We continue to welcome the community to grow and contribute! Community support is critical to each of the active storage providers.

What’s Next?

As we continue the journey to develop reliable storage operators for Kubernetes, we look forward to your ongoing feedback. Only with the community is it possible to continue this fantastic momentum.

There are many different ways to get involved in the Rook project, whether as a user or developer. Please join us in helping the project continue to grow on its way beyond the v1.6 milestone!

Co-author: Travis Nielsen
