Change Summary v1.1
- Support for upgrading OpenEBS storage pools and volumes through a Kubernetes Job. As a user, you no longer have to download scripts to upgrade from 1.0 to 1.1, as in earlier releases. The details of how to trigger upgrades via a Kubernetes Job are provided here. Job-based upgrades are a step towards completely automating upgrades in upcoming releases. We would love to hear your feedback on the proposed design.
Note: The upgrade Job makes use of a new container image, `quay.io/openebs/m-upgrade:1.1.0`. Related PRs: https://github.com/openebs/openebs/pull/2668, https://github.com/openebs/maya/pull/1378, https://github.com/openebs/maya/pull/1374, https://github.com/openebs/maya/pull/1367, https://github.com/openebs/maya/pull/1354
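A minimal sketch of such an upgrade Job, assuming a cStor pool (SPC) upgrade. Only the image name comes from this release; the argument names and the `openebs-maya-operator` service account are illustrative and should be checked against the linked upgrade instructions:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: upgrade-cstor-spc-1.1.0
  namespace: openebs
spec:
  backoffLimit: 4
  template:
    spec:
      # service account with permissions on OpenEBS resources
      # (name assumed from the standard operator YAML)
      serviceAccountName: openebs-maya-operator
      restartPolicy: OnFailure
      containers:
      - name: upgrade
        image: quay.io/openebs/m-upgrade:1.1.0
        # argument names are illustrative; use the exact flags from the
        # linked upgrade instructions
        args:
        - "cstor-spc"
        - "--from-version=1.0.0"
        - "--to-version=1.1.0"
        - "--spc-name=<spc-name>"
```
Apply it with `kubectl apply -f upgrade-job.yaml` and follow progress with `kubectl logs -f job/upgrade-cstor-spc-1.1.0 -n openebs`.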
- Support for an alpha version of a CSI driver with limited functionality for provisioning and de-provisioning cStor volumes. Once you have OpenEBS 1.1 installed, you can take it for a spin on your development clusters using the instructions provided here. The CSI driver also requires a shift in how storage class configuration is passed to the driver: CSI drivers read the StorageClass `parameters` rather than OpenEBS cas annotations. We want to keep this seamless, so please let us know what you see as nice-to-haves as we shift towards the CSI driver.
Note: This feature is under active development and considered Alpha. To test it you need Kubernetes version 1.14 or higher. More details can be found here. Related PRs: https://github.com/openebs/csi/pull/18, https://github.com/openebs/csi/pull/10, https://github.com/openebs/maya/pull/1316/
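To illustrate the shift, a hedged StorageClass sketch for the alpha driver; the provisioner name and parameter keys are assumptions drawn from the driver's examples and should be verified against the linked instructions:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-csi-cstor
# driver name is an assumption; confirm with `kubectl get csidrivers`
provisioner: openebs-csi.openebs.io
parameters:
  # with CSI, configuration moves from cas annotations into parameters
  cas-type: cstor
  replicaCount: "1"
```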
- Enhanced Prometheus metrics exported by Jiva for identifying whether an iSCSI initiator is connected to the Jiva target. https://github.com/openebs/maya/pull/1340, https://github.com/openebs/jiva/pull/233, https://github.com/openebs/gotgt/pull/22
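A quick way to eyeball the new metrics; port 9500 is the conventional OpenEBS exporter port and the grep pattern is a guess, so inspect the full output for the exact metric name:
```sh
# forward the metrics port of a Jiva target deployment and look for the
# initiator-connection metric
kubectl port-forward -n openebs deploy/<pv-name>-ctrl 9500:9500 &
curl -s http://localhost:9500/metrics | grep -iE 'iscsi|connect'
```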
- Enhanced NDM operator capabilities for handling NDM CRD installation and upgrade. Earlier, this process was handled through maya-apiserver. https://github.com/openebs/node-disk-manager/pull/281, https://github.com/openebs/node-disk-manager/pull/277, https://github.com/openebs/maya/pull/1346
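After installing or upgrading, you can confirm that the NDM-managed CRDs are in place (CRD names as listed in the uninstall section below):
```sh
kubectl get crd disks.openebs.io blockdevices.openebs.io blockdeviceclaims.openebs.io
```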
- Enhanced Disk and BD CR specifications by changing the value of the `kubernetes.io/hostname` label from `nodeName` to `hostname`. In some cloud providers like AWS, `nodeName` and `hostname` differ, which resulted in NDM cleanup jobs not getting scheduled on the node. https://github.com/openebs/node-disk-manager/pull/290
- Enhanced velero-plugin to take backups based on the `openebs.io/cas-type: cstor` label and skip backups for unsupported volumes (or storage providers). https://github.com/openebs/velero-plugin/pull/28
- Enhanced velero-plugin to allow users to specify a backupPathPrefix for storing volume snapshots in a custom location. This allows configuration and volume snapshot data for the same backup operation to be saved under the same location, instead of in different locations (see the sketch below). https://github.com/openebs/velero-plugin/pull/25
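A hedged sketch of where `backupPathPrefix` fits, assuming a Velero `VolumeSnapshotLocation` using the `openebs.io/cstor-blockstore` provider; config keys other than `backupPathPrefix` follow the plugin's examples and should be verified:
```yaml
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: openebs-cstor
  namespace: velero
spec:
  provider: openebs.io/cstor-blockstore
  config:
    bucket: my-backup-bucket
    prefix: cstor
    # store volume snapshots under the same location as the configuration
    # backup for the same backup operation
    backupPathPrefix: cluster1
    namespace: openebs
```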
- Added an ENV flag that can be used to disable default storage config creation, since customizations made to the default storage classes are overwritten by the OpenEBS API Server. The recommended approach is for users to create their own storage configuration, using the defaults as examples/guidance. Note that for creating the default cStor sparse pool, both the default storage config and sparse pool creation flags must be enabled (a sketch follows). https://github.com/openebs/maya/pull/1352
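A sketch of the relevant maya-apiserver env entries, assuming the flag names used in the standard operator manifest:
```yaml
# in the maya-apiserver Deployment (openebs-operator.yaml), env section;
# flag names assumed from the standard manifest
- name: OPENEBS_IO_CREATE_DEFAULT_STORAGE_CONFIG
  value: "false"
# both flags must be "true" if you want the default cStor sparse pool
- name: OPENEBS_IO_INSTALL_DEFAULT_CSTOR_SPARSE_POOL
  value: "false"
```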
- Enhanced cStor custom resources like CV, CSP and CVR by adding fields such as `LastUpdatedTime` and `LastTransitionTime`. This will help in troubleshooting cases where the cStor pool pods go offline but the status on the CV, CSP and CVR still shows a stale Online state. https://github.com/openebs/maya/pull/1329
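A quick way to check these timestamps on CVRs; the status field paths are assumptions based on the PR description:
```sh
kubectl get cvr -n openebs -o custom-columns=\
NAME:.metadata.name,PHASE:.status.phase,\
UPDATED:.status.lastUpdateTime,TRANSITION:.status.lastTransitionTime
```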
- Fixes an issue where rebuilding a cStor volume replica failed if the cStor volume capacity was changed after the initial provisioning of the cStor volume. https://github.com/openebs/libcstor/pull/16, https://github.com/openebs/cstor/pull/261, https://github.com/openebs/istgt/pull/264
- Fixes an issue with cStor snapshots taken while a replica's rebuild status was in transition. https://github.com/openebs/libcstor/pull/20, https://github.com/openebs/cstor/pull/250, https://github.com/openebs/istgt/pull/265
- Fixes an issue in Jiva where the application file system was breaking due to the deletion of auto-generated Jiva snapshots. https://github.com/openebs/jiva/pull/231, https://github.com/openebs/jiva/pull/232, https://github.com/openebs/jiva/pull/234
- Fixes an issue where the NDM pod was getting restarted while probing for details from devices that support write cache. https://github.com/openebs/node-disk-manager/pull/276
- Fixes an issue in NDM where the Seachest probe was holding open file descriptors to LVM devices, preventing the LVM devices from being detached from the node. https://github.com/openebs/node-disk-manager/pull/275
- Fixes a bug where a BD could be claimed even if its BDC was scheduled for deletion. https://github.com/openebs/node-disk-manager/pull/289
- Fixes a bug where backups were failing if `openebs` was installed through helm. velero-plugin was checking the `maya-apiserver` name, which is different when installed via the helm-based method. Updated velero-plugin to check the label of the `maya-apiserver` service instead (see the check below). https://github.com/openebs/velero-plugin/pull/21, https://github.com/helm/charts/pull/16011
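To verify the service the plugin now discovers by label; the label key/value shown is an assumption based on the standard OpenEBS manifests:
```sh
kubectl get svc -n openebs -l openebs.io/component-name=maya-apiserver-svc
```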
Additional upgrade steps from 1.0.0: none. The upgrade scripts take care of migrating the existing custom resources to the new format. For previous releases, please refer to the respective release notes and upgrade steps.
Note: As part of OpenEBS upgrade or installation, the `maya-apiserver` pod will restart if the NDM blockdevice CRDs are not created before `maya-apiserver` itself is created. https://github.com/openebs/maya/pull/1381
The recommended steps to uninstall are:
- delete all the OpenEBS PVCs that were created
- delete all the SPCs (in case of cStor)
- ensure that no volume or pool pods are pending in Terminating state:
kubectl get pods -n <openebs namespace>
- ensure that no OpenEBS cStor volume custom resources are present:
kubectl get cvr -n <openebs namespace>
- delete all OpenEBS-related StorageClasses
- delete openebs itself, either via `helm purge` or `kubectl delete`
Uninstalling OpenEBS doesn't automatically delete the CRDs that were created. If you would like to remove CRDs and the associated objects completely, run the following commands:
kubectl delete crd castemplates.openebs.io
kubectl delete crd cstorpools.openebs.io
kubectl delete crd cstorvolumereplicas.openebs.io
kubectl delete crd cstorvolumes.openebs.io
kubectl delete crd runtasks.openebs.io
kubectl delete crd storagepoolclaims.openebs.io
kubectl delete crd storagepools.openebs.io
kubectl delete crd volumesnapshotdatas.volumesnapshot.external-storage.k8s.io
kubectl delete crd volumesnapshots.volumesnapshot.external-storage.k8s.io
kubectl delete crd disks.openebs.io
kubectl delete crd blockdevices.openebs.io
kubectl delete crd blockdeviceclaims.openebs.io
kubectl delete crd cstorbackups.openebs.io
kubectl delete crd cstorrestores.openebs.io
kubectl delete crd cstorcompletedbackups.openebs.io
kubectl delete crd cstorpoolclusters.openebs.io
Note: As part of deleting Jiva volumes, OpenEBS launches scrub jobs to clear data from the nodes. The completed jobs need to be cleared using the following command:
kubectl delete jobs -l openebs.io/cas-type=jiva -n <namespace>
Limitations and known issues:
- The current version of OpenEBS volumes is not optimized for performance-sensitive applications.
- To take a backup of a cStor volume with the OpenEBS velero-plugin, openebs must be installed in the `openebs` namespace.
- If a pending PVC related to the `openebs-device` StorageClass is deleted, there are chances of stale BDCs being left behind, which end up consuming BDs. You have to manually delete the BDC to reclaim the BD (see the cleanup sketch after the next item).
- In OpenShift 3.10 or above, NDM daemon set pods and NDM operators will not be upgraded if the NDM daemon set's DESIRED count is not equal to its CURRENT count. This may happen if nodeSelectors have been used to deploy OpenEBS-related pods, or if master/other nodes have been tainted in the k8s cluster.
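For the stale-BDC situation above, a minimal cleanup sketch (assuming OpenEBS is installed in the `openebs` namespace):
```sh
# list the claims and delete the stale one so its blockdevice is released
kubectl get blockdeviceclaims -n openebs
kubectl delete blockdeviceclaim <stale-bdc-name> -n openebs
```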
- Jiva controller and replica pods get stuck in `Terminating` state when any instability with the node or network happens, and the only way to remove those containers is by using `docker rm -f` on the node. https://github.com/openebs/openebs/issues/2675
- cStor target or pool pods can at times be stuck in a `Terminating` state. They will need to be manually cleaned up using kubectl delete with a 0-second grace period. Example: `kubectl delete deploy <deployment-name> -n openebs --force --grace-period=0`
- cStor pool pods can consume more memory under continuous load. This can cross the memory limit and cause pod evictions. It is recommended that you create cStor pools with memory limits and requests set (a sketch follows).
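A hedged SPC sketch for setting pool pod resources; the `PoolResourceRequests`/`PoolResourceLimits` cas-config keys are assumptions based on the OpenEBS cas-config documentation:
```yaml
apiVersion: openebs.io/v1alpha1
kind: StoragePoolClaim
metadata:
  name: cstor-disk-pool
  annotations:
    # config keys assumed; verify against the cStor pool documentation
    cas.openebs.io/config: |
      - name: PoolResourceRequests
        value: |-
          memory: 2Gi
      - name: PoolResourceLimits
        value: |-
          memory: 4Gi
spec:
  name: cstor-disk-pool
  type: disk
  poolSpec:
    poolType: striped
```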
- Jiva Volumes are not recommended if your use case requires snapshots and clone capabilities.
- Jiva replicas use a sparse file to store the data. When the application causes too many fragments (extents) to be created on the sparse file, a replica restart can cause the replica to take a long time to get attached to the target. This issue was seen when 31K fragments had been created.
- Volume Snapshots are dependent on the functionality provided by Kubernetes. The support is currently alpha. The only operations supported are:
- Create Snapshot, Delete Snapshot and Clone from a Snapshot. Snapshot creation uses a reconciliation loop, which means a Create Snapshot operation will be retried on failure until the snapshot has been successfully created. This may not be a desirable option in cases where point-in-time snapshots are expected.
- If you are using a K8s version earlier than 1.12, in certain cases it will be observed that when the node hosting the target pod is offline, the target pod can take more than 120 seconds to get rescheduled. This is because target pods are configured with tolerations based on the node condition, and TaintNodesByCondition is available only from K8s 1.12. If running an earlier version, you may have to enable the alpha feature gate for TaintNodesByCondition. If there is an active load on the volume when the target pod goes offline, the volume will be marked as read-only.
- If you are using K8s version 1.13 or later, which includes checks on ephemeral storage limits on pods, there is a chance that OpenEBS cStor and Jiva pods get evicted because no ephemeral storage requests are specified. To avoid this issue, you can specify ephemeral storage requests in the storage class or storage pool claim, as sketched below. (https://github.com/openebs/openebs/issues/2294)
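A hedged StorageClass sketch for the cStor target and its side-cars; the `TargetResourceRequests`/`AuxResourceRequests` cas-config keys are assumptions based on the cas-config documentation and the linked issue:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-cstor-ephemeral
  annotations:
    openebs.io/cas-type: cstor
    # config keys assumed; verify against the cas-config documentation
    cas.openebs.io/config: |
      - name: TargetResourceRequests
        value: |-
          ephemeral-storage: 50Mi
      - name: AuxResourceRequests
        value: |-
          ephemeral-storage: 50Mi
provisioner: openebs.io/provisioner-iscsi
```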
- When the disks used by a cStor Pool are detached and reattached, the cStor Pool may miss detecting this event in certain scenarios. Manual intervention may be required to bring the cStor Pool online. (https://github.com/openebs/openebs/issues/2363)
- When the underlying disks used by cStor or Jiva volumes are under disk pressure due to heavy IO load, and if the Replicas take longer than 60 seconds to process the IO, the Volumes will get into Read-Only state. In 0.8.1, logs have been added to the cStor and Jiva replicas to indicate if IO has longer latency. (https://github.com/openebs/openebs/issues/2337)