
Solving Stuck PVCs in EKS with EBS

Summary: Stuck PersistentVolumeClaims (PVCs) can stall Kubernetes workloads and leak AWS EBS volumes. This guide explains how to diagnose and fix PVCs in EKS clusters using AWS EBS, including symptoms, root causes, resolution workflows, and cleanup automation.

1. Introduction

If you’ve run Kubernetes workloads on Amazon EKS using EBS-backed volumes, you’ve likely encountered stuck PVCs: claims that never bind, hang during termination, or leave orphaned volumes behind. These issues increase infrastructure drift, block application rollouts, and inflate storage costs. This playbook shows how to eliminate stuck volumes and design safer PVC usage patterns in production clusters.

2. Purpose

This article provides a standardized troubleshooting guide for identifying and resolving stuck PVCs in EKS with AWS EBS. It helps DevOps and platform teams eliminate residual volumes, reduce downtime, and enforce correct storage configuration across namespaces.

3. Scope

This playbook covers AWS EBS-backed PVCs that enter stuck Pending or Terminating states in Kubernetes. It assumes volumes are provisioned through the EBS CSI driver or the legacy in-tree plugin, using the default gp2 or gp3 storage classes.

4. Common Symptoms of a Stuck PVC

  • PVC stuck in Pending (no bound PersistentVolume)
  • Pod stuck in ContainerCreating due to failed volume attach
  • PVC stuck in Terminating due to dangling finalizers
  • EBS volume still listed as in-use in AWS after the node is gone

5. Root Causes of PVC Failures

  • EBS volume still attached to a non-responsive or deleted node
  • Incorrect reclaim policy (set to Retain without follow-up cleanup)
  • Manual StorageClass overrides misaligned with cluster policy
  • Node failure leaving a stale VolumeAttachment object behind (see the check below)
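
A quick way to surface the last cause is to compare VolumeAttachment objects against the nodes that still exist; any attachment referencing a missing node is stale. A minimal check:

    # List each VolumeAttachment with the node and PV it references
    kubectl get volumeattachments -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,PV:.spec.source.persistentVolumeName,ATTACHED:.status.attached

    # Any NODE value missing from this list indicates a stale attachment
    kubectl get nodes -o name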

6. How to Diagnose a Stuck PVC

  1. Get PVC status and related PV:
    kubectl get pvc,pv -n <namespace>
  2. Describe the PVC and inspect events:
    kubectl describe pvc <pvc-name> -n <namespace>
  3. Check that the block device appears on the node (run on the instance over SSH or SSM) if the volume is attached:
    lsblk | grep nvme
  4. Use AWS CLI to check volume state:
    aws ec2 describe-volumes --volume-ids <vol-id>
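
To sweep the whole cluster rather than a single namespace, the same checks can be run in bulk. A minimal sketch, assuming kubectl and jq are installed; note that a PVC stuck in Terminating usually still reports phase Bound, so it needs its own query:

    # PVCs that never bound
    kubectl get pvc --all-namespaces -o json \
      | jq -r '.items[] | select(.status.phase != "Bound") | "\(.metadata.namespace)/\(.metadata.name)\t\(.status.phase)"'

    # PVCs stuck in Terminating (deletion requested but finalizers still present)
    kubectl get pvc --all-namespaces -o json \
      | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | "\(.metadata.namespace)/\(.metadata.name)\tTerminating"'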

7. Fixing Stuck PVCs in EKS

⚠️ Warning: Force-removing finalizers or manually detaching volumes can lead to data loss if the volume is still mounted or in use. Only proceed after verifying the Pod is terminated and the volume is no longer needed. Always coordinate with application owners before taking destructive action on stateful workloads.
  • Patch the PVC to remove finalizers:
    kubectl patch pvc <pvc-name> -n <namespace> -p '{"metadata":{"finalizers":null}}' --type=merge
  • Detach the volume manually from AWS:
    aws ec2 detach-volume --volume-id <vol-id>
  • Update reclaim policy to auto-delete:
    kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
  • Delete VolumeAttachment object if stuck:
    kubectl delete volumeattachment <vol-attach-id>
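
The verification demanded by the warning above can be scripted so the destructive step only runs once the claim is provably unused. A minimal sketch, assuming jq is installed; PVC and NS are placeholders to fill in, and the script should be reviewed before use on stateful workloads:

    # Placeholders: set to the stuck claim and its namespace
    PVC=<pvc-name>; NS=<namespace>

    # Refuse to continue if any Pod in the namespace still mounts the claim
    IN_USE=$(kubectl get pods -n "$NS" -o json \
      | jq -r --arg pvc "$PVC" '.items[] | select(.spec.volumes[]?.persistentVolumeClaim.claimName == $pvc) | .metadata.name')
    if [ -n "$IN_USE" ]; then
      echo "PVC $PVC is still mounted by: $IN_USE" >&2
      exit 1
    fi

    # Only then strip the finalizers so deletion can complete
    kubectl patch pvc "$PVC" -n "$NS" -p '{"metadata":{"finalizers":null}}' --type=merge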

8. Prevention and Best Practices

  • Use reclaim policy Delete for short-lived workloads
  • Automate cleanup of dangling volumes with Kubernetes CronJobs or AWS Lambda (see the sketch after this list)
  • Monitor PVC lifecycle via kube-state-metrics or Prometheus rules
  • Tag AWS volumes with Pod/namespace metadata for traceability
  • Enforce default StorageClass with allowVolumeExpansion: true
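
For the cleanup automation mentioned above, orphaned EBS volumes can be located by state and tag before deletion. A minimal sketch, assuming the AWS CLI is configured and that your provisioner tags volumes with kubernetes.io/cluster/<cluster-name>; adjust the tag filter to match whatever tags your provisioner actually applies:

    # List unattached volumes created for this cluster (replace <cluster-name>)
    aws ec2 describe-volumes \
      --filters "Name=status,Values=available" \
                "Name=tag-key,Values=kubernetes.io/cluster/<cluster-name>" \
      --query 'Volumes[].{ID:VolumeId,Created:CreateTime,SizeGiB:Size}' \
      --output table

    # After review, delete a confirmed orphan
    aws ec2 delete-volume --volume-id <vol-id>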

9. Implementation Checklist

  • ☑️ StorageClass defined with the proper provisioner and reclaimPolicy (example after this checklist)
  • ☑️ Finalizers removed automatically if PVC deletion fails
  • ☑️ Monitoring alerts for stuck PVCs and VolumeAttachments
  • ☑️ AWS IAM roles grant ec2:DetachVolume and ec2:DescribeVolumes permissions
  • ☑️ Volumes labeled with cluster name and namespace for auditability
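
The first checklist item can be codified once and applied cluster-wide. A minimal sketch of a default gp3 StorageClass, assuming the EBS CSI driver is installed under its standard provisioner name ebs.csi.aws.com; the name gp3-default is illustrative:

    # gp3-default.yaml (save, review, then run: kubectl apply -f gp3-default.yaml)
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: gp3-default
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: ebs.csi.aws.com
    parameters:
      type: gp3
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
    allowVolumeExpansion: true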
