
Change Summary v1.0


Status: Completed (22-June-2019)

New Capabilities and Enhancements

  • A new cluster-level component called NDM Operator (built using Operator-SDK) has been added to manage the life cycle of the block devices attached to the nodes. The Disk CR has been renamed to Block Device, to make it generic enough to represent all types of storage devices attached to a node. The NDM Operator manages access to the Block Devices: consumers of Block Devices, such as the cstor-operator and the Local PV provisioner, can request a Block Device using a Block Device Claim (BDC). The NDM Operator manages the life cycle of a Block Device from binding a BD to a BDC through cleaning up the data from a released BD. The contents of a Block Device are similar to what the Disk CR used to contain, with a few additional details. A sample Block Device CR looks like:

    ---
    apiVersion: openebs.io/v1alpha1
    kind: BlockDevice
    metadata:
      creationTimestamp: 2019-06-18T14:13:17Z
      generation: 1
      labels:
        kubernetes.io/hostname: gke-kmova-helm-default-pool-76aced15-hs76
        ndm.io/blockdevice-type: blockdevice
        ndm.io/managed: "true"
      name: blockdevice-34c0d4b9914fae53486dfced120dde57
      namespace: openebs
      resourceVersion: "2519"
      selfLink: /apis/openebs.io/v1alpha1/namespaces/openebs/blockdevices/blockdevice-34c0d4b9914fae53486dfced120dde57
      uid: 380598c4-91d3-11e9-9b40-42010a80011b
    spec:
      capacity:
        logicalSectorSize: 4096
        physicalSectorSize: 4096
        storage: 402653184000
      details:
        compliance: SPC-4
        deviceType: SSD
        firmwareRevision: ""
        model: EphemeralDisk
        serial: local-ssd-0
        vendor: Google
      devlinks:
      - kind: by-id
        links:
        - /dev/disk/by-id/scsi-0Google_EphemeralDisk_local-ssd-0
        - /dev/disk/by-id/google-local-ssd-0
      - kind: by-path
        links:
        - /dev/disk/by-path/pci-0000:00:04.0-scsi-0:0:1:0
      filesystem:
        fsType: ext4
        mountPoint: /mnt/disks/ssd0
      partitioned: "No"
      path: /dev/sdb
    status:
      claimState: Unclaimed
      state: Active
    ---

    This feature involved the following enhancements:

    • Support for NDM Operator.
    • Support for New CRs - Block Device (BD) and Block Device Claim (BDC).
    • Create and update Block Devices on startup or when there is a device attach / detach udev event.
    • Include details like filesystem and mountpoint associated with a given Block Device.
    • NDM Operator will support selecting the Block Devices for a given BDC with the following options (a sample claim is sketched after this list):
      • a specific BD can be claimed by passing the BD name, or
      • a BD that matches the capacity and hostname constraints can be claimed.
      • optionally, a BDC can also specify whether the BD should already have a filesystem / mountpoint created.
    • Update the cstor-operator to use Block Devices. The cStor Operator parses through the available Block Devices and claims the specific BDs it needs to use.
    • Changed the SPC (cStor Storage Pool Claim) schema to use block devices instead of disks (a sample SPC is shown after the note below).
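
    To claim a device, a consumer creates a Block Device Claim. The sketch below shows a minimal BDC that requests a BD by capacity and hostname; the field names and claim name (openebs-device-claim) are illustrative and should be verified against the v1alpha1 BlockDeviceClaim schema shipped with this release:

    ---
    apiVersion: openebs.io/v1alpha1
    kind: BlockDeviceClaim
    metadata:
      name: openebs-device-claim
      namespace: openebs
    spec:
      # Field names below are illustrative; verify against the v1alpha1
      # BlockDeviceClaim schema shipped with this release.
      resources:
        requests:
          storage: 10Gi
      hostName: gke-kmova-helm-default-pool-76aced15-hs76
      # Alternatively, claim a specific BD by name:
      # blockDeviceName: blockdevice-34c0d4b9914fae53486dfced120dde57
    ---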

    Note: Breaking Change. The custom resource (Disk) used in earlier releases has been changed to Block Device. If you are using the Disk CR in your automation tools or playbooks, please replace it with Block Device. In upcoming releases, the Disk CR will be restricted to representing only a physical disk.
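
    For example, an SPC that previously listed disks now references block devices. The snippet below is a sketch of the updated schema; the SPC name (cstor-disk-pool) is hypothetical and the exact fields should be verified against the 1.0 samples:

    ---
    apiVersion: openebs.io/v1alpha1
    kind: StoragePoolClaim
    metadata:
      name: cstor-disk-pool
    spec:
      name: cstor-disk-pool
      type: disk
      poolSpec:
        poolType: striped
      # Disks were previously listed under the disks section of the SPC;
      # block devices are now listed here instead.
      blockDevices:
        blockDeviceList:
        - blockdevice-34c0d4b9914fae53486dfced120dde57
    ---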

  • Support for using Block Devices for OpenEBS Local PV. (https://github.com/openebs/maya/pull/1266, https://github.com/openebs/maya/pull/1292). The OpenEBS Local PV provisioner uses NDM to request a Block Device and creates a Local PV using the details provided in the Block Device. The following types of Block Devices can be used for creating Local PVs:

    • Storage devices that are attached to the nodes, formatted, and mounted. For example: GKE with Local SSD.
    • Storage devices that are only attached to the nodes, but not formatted. For example: GKE with GPD.
    • Storage devices that are virtual devices attached to the nodes. For example: a VM with VMDK disks.

    A default storage class for using the local devices is included with the OpenEBS installation:

    ---
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: openebs-device
      annotations:
        #Define a new CAS Type called `local`
        #which indicates that Data is stored 
        #directly onto hostpath. The hostpath can be:
        #- device (as block or mounted path)
        #- hostpath (sub directory on OS or mounted path)
        openebs.io/cas-type: local
        cas.openebs.io/config: |
          - name: StorageType
            value: "device"
    provisioner: openebs.io/local
    volumeBindingMode: WaitForFirstConsumer
    reclaimPolicy: Delete
    ---

    When a user requests a Local PV with StorageType set to device, the Local PV provisioner issues a BDC request. Once the NDM Operator binds a matching BD to the BDC, the Local PV provisioner fetches the path details from the bound BD and creates a Local PV.
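
    For example, a PVC that uses the default openebs-device storage class (the claim name local-device-claim is illustrative) would look like:

    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: local-device-claim
    spec:
      # Use the default storage class shown above.
      storageClassName: openebs-device
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
    ---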

    Note: The Local PV provisioner will only look for a device attached to the node to which the Pod has been assigned by the Kubernetes scheduler. If local devices are available only on certain nodes, modify the application spec to schedule the Pods onto those specific nodes.
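
    For instance, a Pod can be pinned to a node that has the local device by using a standard Kubernetes nodeSelector. The names below (Pod name, image, claim name) are illustrative; the claim matches the sample PVC above:

    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: app-using-local-device
    spec:
      # Schedule the Pod onto the node that has the local device attached.
      nodeSelector:
        kubernetes.io/hostname: gke-kmova-helm-default-pool-76aced15-hs76
      containers:
      - name: app
        image: busybox
        command: ["sleep", "3600"]
        volumeMounts:
        - mountPath: /data
          name: local-storage
      volumes:
      - name: local-storage
        persistentVolumeClaim:
          claimName: local-device-claim
    ---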

    Note: Local Volumes don't enforce a capacity limit on the application pods. The pods can use up all the capacity available on the underlying device.

  • Enhanced cStor Data Engine to allow interoperability of cStor Replicas across different versions. (https://github.com/openebs/libcstor/pull/12).

  • Enhanced cStor Data Engine containers to contain troubleshooting utilities. (https://github.com/openebs/istgt/pull/252, https://github.com/openebs/cstor/pull/247, https://github.com/openebs/cstor/pull/245)

  • Enhanced the Prometheus metrics exported by cStor pools to include details of provisioning errors. (https://github.com/openebs/maya/pull/1252, https://github.com/openebs/maya/pull/1261)

Major Bugs Fixed

Specification Changes

The following changes have been made to the OpenEBS custom resources (CRs) in v1.0:

  • The Disk CR has been deprecated and replaced by the Block Device CR. All the details available in the Disk CR are now available in the Block Device CR.
  • The cStor pool definition, the StoragePoolClaim (SPC) CR, has been modified to use Block Device CRs instead of Disk CRs.
  • cStor pools now use the cStor Storage Pool CR to represent cStor pool details, as opposed to the earlier StoragePool CR. The StoragePool CR is now used only by Jiva volumes.

Backward Incompatibilities

From 0.9: None. The upgrade scripts take care of migrating the existing custom resources to the new format.

For previous releases, please refer to the respective release notes and upgrade steps.

Upgrade

Upgrade to 1.0 is supported only from 0.9 and follows an approach similar to earlier releases.

  • Upgrade OpenEBS Control Plane components. This involves a pre-upgrade step.
  • Upgrade Jiva PVs to 1.0, one at a time.
  • Upgrade cStor pools to 1.0 and their associated volumes, one at a time.

The detailed steps are provided here.

For upgrading from releases prior to 0.9, please refer to the respective release's upgrade steps here.

Uninstall

The recommended steps to uninstall are:

  • delete all the OpenEBS PVCs that were created
  • delete all the SPCs (in case of cStor)
  • ensure that no volume or pool pods are stuck in a terminating state: kubectl get pods -n <openebs namespace>
  • ensure that no OpenEBS cStor volume custom resources are present: kubectl get cvr -n <openebs namespace>
  • delete OpenEBS either via helm purge or kubectl delete
  • Uninstalling OpenEBS doesn't automatically delete the CRDs that were created. If you would like to completely remove the CRDs and the associated objects, run the following commands:
    kubectl delete crd castemplates.openebs.io
    kubectl delete crd cstorpools.openebs.io
    kubectl delete crd cstorvolumereplicas.openebs.io
    kubectl delete crd cstorvolumes.openebs.io
    kubectl delete crd runtasks.openebs.io
    kubectl delete crd storagepoolclaims.openebs.io
    kubectl delete crd storagepools.openebs.io
    kubectl delete crd volumesnapshotdatas.volumesnapshot.external-storage.k8s.io
    kubectl delete crd volumesnapshots.volumesnapshot.external-storage.k8s.io
    kubectl delete crd disks.openebs.io
    kubectl delete crd blockdevices.openebs.io
    kubectl delete crd blockdeviceclaims.openebs.io
    kubectl delete crd cstorbackups.openebs.io
    kubectl delete crd cstorrestores.openebs.io
    kubectl delete crd cstorcompletedbackups.openebs.io
    kubectl delete crd cstorpoolclusters.openebs.io

Note: As part of deleting Jiva volumes, OpenEBS launches scrub jobs to clear the data from the nodes. The completed jobs need to be cleaned up using the following command: kubectl delete jobs -l openebs.io/cas-type=jiva -n <namespace>

Limitations / Known Issues

  • The current version of OpenEBS volumes is not optimized for performance-sensitive applications.
  • cStor Target or Pool pods can at times be stuck in a Terminating state. They will need to be manually cleaned up using kubectl delete with 0 sec grace period. Example: kubectl delete deploy -n openebs --force --grace-period=0
  • cStor pool pods can consume excessive memory when there is continuous load. This can exceed the memory limit and cause pod evictions. It is recommended that you create cStor pools with memory limits and requests set (a sample configuration is sketched after this list).
  • Jiva Volumes are not recommended if your use case requires snapshots and clone capabilities.
  • Jiva replicas use a sparse file to store the data. When the application causes too many fragments (extents) to be created on the sparse file, a replica restart can take a long time before the replica gets reattached to the target. This issue was seen when 31K fragments were created.
  • Volume Snapshots are dependent on the functionality provided by Kubernetes. The support is currently alpha. The only operations supported are:
    • Create Snapshot, Delete Snapshot, and Clone from a Snapshot. Snapshot creation uses a reconciliation loop, which means that a Create Snapshot operation will be retried on failure until the snapshot has been successfully created. This may not be desirable in cases where point-in-time snapshots are expected.
  • If you are using a K8s version earlier than 1.12, it can be observed in certain cases that when the node hosting the target pod is offline, the target pod takes more than 120 seconds to get rescheduled. This is because the target pods are configured with tolerations based on the node condition, and TaintNodesByCondition is available only from K8s 1.12. If you are running an earlier version, you may have to enable the alpha feature gate for TaintNodesByCondition. If there is an active load on the volume when the target pod goes offline, the volume will be marked as read-only.
  • If you are using K8s version 1.13 or later, which includes checks on ephemeral storage limits for pods, there is a chance that OpenEBS cStor and Jiva pods get evicted because no ephemeral storage requests are specified. To avoid this issue, you can specify ephemeral storage requests in the storage class or storage pool claim. (https://github.com/openebs/openebs/issues/2294)
  • When disks used by a cStor pool are detached and reattached, the cStor pool may fail to detect this event in certain scenarios. Manual intervention may be required to bring the cStor pool back online. (https://github.com/openebs/openebs/issues/2363)
  • When the underlying disks used by cStor or Jiva volumes are under disk pressure due to heavy IO load, and the replicas take longer than 60 seconds to process the IO, the volumes will get into a read-only state. In 0.8.1, logs were added to the cStor and Jiva replicas to indicate when IO latency is high. (https://github.com/openebs/openebs/issues/2337)
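
As referenced in the cStor pool memory note above, memory requests and limits can be set while creating the pool. The sketch below shows one way to express this on the SPC; the policy names PoolResourceRequests and PoolResourceLimits, the SPC name, and the values are assumptions to be verified against the 1.0 cStor pool configuration documentation.

    ---
    apiVersion: openebs.io/v1alpha1
    kind: StoragePoolClaim
    metadata:
      name: cstor-disk-pool
      annotations:
        # The policy names below are assumed from the cStor pool
        # configuration options; verify against the 1.0 documentation.
        cas.openebs.io/config: |
          - name: PoolResourceRequests
            value: |-
              memory: 2Gi
          - name: PoolResourceLimits
            value: |-
              memory: 4Gi
    spec:
      name: cstor-disk-pool
      type: disk
      poolSpec:
        poolType: striped
      blockDevices:
        blockDeviceList:
        - blockdevice-34c0d4b9914fae53486dfced120dde57
    ---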