deleteOrphanPvc deleted PVC in use #32

Closed
chn217 opened this issue Mar 26, 2023 · 3 comments · Fixed by #67

Comments


chn217 commented Mar 26, 2023

I'm trying to create a Druid cluster using druid-operator on AWS EKS. I'm using EBS GP2 for the persistent volume.

When I scale up the historical pods (e.g. from 4 to 8), the first pod gets stuck in Pending while the other 7 pods work fine. The first PVC was mistakenly deleted as an orphan PVC even though it was still in use.

druid-operator log:
1.6798315940261655e+09 INFO druid_operator_handler Deleted orphaned pvc [data-volume-druid-workload-historicals-4:default] successfully {"name": "workload", "namespace": "default"}
1.679831594026486e+09 DEBUG events Normal {"object": {"kind":"Druid","namespace":"default","name":"workload","uid":"2c6b92b9-73cb-408f-a670-a3ee7fc307ff","apiVersion":"druid.apache.org/v1alpha1","resourceVersion":"3088566"}, "reason": "DruidOperatorDeleteSuccess", "message": "Successfully deleted object [data-volume-druid-workload-historicals-4:PersistentVolumeClaim] in namespace [default]"}
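
For reference, a minimal sketch of how the logged PVC name relates to the scale-up. StatefulSets name their PVCs `<volumeClaimTemplate name>-<StatefulSet name>-<ordinal>`, so scaling from 4 to 8 replicas adds ordinals 4 through 7, and the deleted PVC appears to belong to the first newly added replica. The split of the logged name into `data-volume` and `druid-workload-historicals` is an assumption inferred from the log line, and the helper names are hypothetical.

```python
def statefulset_pvc_name(claim_template: str, statefulset: str, ordinal: int) -> str:
    """Build a PVC name the way StatefulSets do: <template>-<statefulset>-<ordinal>."""
    return f"{claim_template}-{statefulset}-{ordinal}"


def pvcs_added_by_scale_up(claim_template, statefulset, old_replicas, new_replicas):
    """Return the PVC names for the ordinals created by a scale-up."""
    return [
        statefulset_pvc_name(claim_template, statefulset, ordinal)
        for ordinal in range(old_replicas, new_replicas)
    ]


# Assumed split of the logged PVC name into template and StatefulSet names.
added = pvcs_added_by_scale_up("data-volume", "druid-workload-historicals", 4, 8)
print(added)
```

The first entry of `added` matches the PVC named in the operator log above.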

This issue is reproducible in the following environment:
druid-operator 0.0.9, Kubernetes 1.23.

Storage Class:
Name: gp2
IsDefaultClass: Yes
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"},"name":"gp2"},"parameters":{"fsType":"ext4","type":"gp2"},"provisioner":"kubernetes.io/aws-ebs","volumeBindingMode":"WaitForFirstConsumer"}
,storageclass.kubernetes.io/is-default-class=true
Provisioner: kubernetes.io/aws-ebs
Parameters: fsType=ext4,type=gp2
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: WaitForFirstConsumer
Events: <none>


chn217 commented Mar 26, 2023

I might be hitting druid-io/druid-operator#305, which is still unfixed.

@AdheipSingh
Contributor

@chn217 I'll take a look at this. Thanks for tracking this issue.

@gurjotkaur20
Contributor

Looking at the function, a race condition in which an in-use PVC gets deleted is possible when the pod that uses it is not listed in podList.
Could that be what happened here?
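
To make the suspected race concrete, here is a small model of it. This is only a sketch of the idea, not the operator's actual code and not the fix merged in #67. It assumes the orphan check compares existing PVCs against the claims referenced by a snapshot of pods, and that during a scale-up the new PVC can exist before its pod appears in that snapshot. `Pod`, `naive_orphans`, `guarded_orphans`, and the `expected_claims` guard are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    claim_names: tuple  # PVC names mounted by this pod


def naive_orphans(pvc_names, pod_list):
    """Report every PVC that no pod in the snapshot references."""
    in_use = {claim for pod in pod_list for claim in pod.claim_names}
    return sorted(name for name in pvc_names if name not in in_use)


def guarded_orphans(pvc_names, pod_list, expected_claims):
    """Hypothetical guard: also protect PVCs the desired replica count will need."""
    in_use = {claim for pod in pod_list for claim in pod.claim_names}
    protected = in_use | set(expected_claims)
    return sorted(name for name in pvc_names if name not in protected)


prefix = "data-volume-druid-workload-historicals"

# Snapshot taken mid scale-up: PVC 4 already exists, pod 4 is not listed yet.
pvcs = [f"{prefix}-{i}" for i in range(5)]
pods = [Pod(f"druid-workload-historicals-{i}", (f"{prefix}-{i}",)) for i in range(4)]

# Claims implied by the desired replica count of 8.
expected = [f"{prefix}-{i}" for i in range(8)]

stale_result = naive_orphans(pvcs, pods)
guarded_result = guarded_orphans(pvcs, pods, expected)
print(stale_result, guarded_result)
```

With this snapshot the naive check flags PVC 4 as an orphan, which would match the log in the report, while the guarded variant flags nothing. Whether the real function behaves this way depends on how podList is populated.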
