Troubleshooting — Kubernetes
Nodes in NotReady state
Cause: one or more nodes are no longer responding to the control plane. This can be due to insufficient resources, a storage issue, or a kubelet failure.
Solution:
- Check node status and conditions:
    kubectl get nodes
    kubectl describe node <node-name>
- Review events to identify the cause (DiskPressure, MemoryPressure, PIDPressure):
    kubectl get events --sort-by='.lastTimestamp'
- Verify that the chosen instanceType provides enough resources for deployed workloads.
- If the issue persists, increase the nodeGroup maxReplicas to allow the cluster to provision new healthy nodes.
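The first check can be narrowed to the affected nodes with a small filter. This is a minimal sketch rather than an official tool: `not_ready_nodes` is a hypothetical helper name, and it assumes the default table output of `kubectl get nodes`, where the second column is STATUS.

```shell
# Hypothetical helper: print the names of nodes whose STATUS does not start
# with "Ready". Reads the default `kubectl get nodes` table on stdin.
not_ready_nodes() {
  # NR > 1 skips the header row; $2 is the STATUS column.
  awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Usage (requires cluster access):
#   kubectl get nodes | not_ready_nodes | while read -r node; do
#     kubectl describe node "$node"
#   done
```

Cordoned but healthy nodes are reported as `Ready,SchedulingDisabled` and are deliberately not listed by this filter.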
Pods in Pending state (insufficient resources)
Cause: no node has enough CPU or memory to schedule the pod. The Kubernetes scheduler cannot find a placement.
Solution:
- Identify the reason for Pending:
    kubectl describe pod <pod-name>
  Look for the FailedScheduling message in events.
- Check available resources on nodes:
    kubectl top nodes
- If nodes are saturated, increase the maxReplicas of your nodeGroup:
    # cluster.yaml
    spec:
      nodeGroups:
        workers:
          minReplicas: 2
          maxReplicas: 10
- If the pod is stuck on a PVC, verify that the PVC is properly provisioned:
    kubectl get pvc
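To decide whether nodes are saturated, the output of `kubectl top nodes` can be filtered against a usage threshold. The sketch below makes three assumptions: `saturated_nodes` is a hypothetical helper name, the 80% default is an arbitrary example rather than a Hikube recommendation, and the input is the default table, with CPU% in the third column and MEMORY% in the fifth.

```shell
# Hypothetical helper: print nodes whose CPU% or MEMORY% is at or above a
# threshold (default 80, an arbitrary example value).
# Reads the default `kubectl top nodes` table on stdin.
saturated_nodes() {
  awk -v limit="${1:-80}" '
    NR > 1 {
      cpu = $3; mem = $5
      sub(/%/, "", cpu); sub(/%/, "", mem)   # strip the percent signs
      # Non-numeric values such as <unknown> evaluate to 0 and are skipped.
      if (cpu + 0 >= limit + 0 || mem + 0 >= limit + 0) print $1
    }'
}

# Usage (requires cluster access and a working metrics API):
#   kubectl top nodes | saturated_nodes 80
```

If every node is listed, raising maxReplicas is likely the right fix. If only some are, the pod's resource requests or scheduling constraints may be worth reviewing first.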
Expired or invalid kubeconfig
Cause: the client certificate in the kubeconfig has expired (x509: certificate has expired error) or the credentials are invalid (Unauthorized error).
Solution:
- Regenerate the kubeconfig from the source Secret:
    kubectl get tenantsecret <cluster-name>-admin-kubeconfig -o jsonpath='{.data.super-admin\.conf}' | base64 -d > kubeconfig.yaml
- Use the new kubeconfig in place of the old one:
    export KUBECONFIG=kubeconfig.yaml
- Verify connectivity:
    kubectl cluster-info
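The two error messages named in the cause can be told apart with a small classifier. This is an illustrative sketch: `classify_auth_error` and its output labels are hypothetical, and it only matches the substrings quoted above.

```shell
# Hypothetical helper: map a kubectl error message to one of the two causes
# described in this section, based on substring matching only.
classify_auth_error() {
  case "$1" in
    *"x509: certificate has expired"*) echo "expired-certificate" ;;
    *"Unauthorized"*)                  echo "invalid-credentials" ;;
    *)                                 echo "other" ;;
  esac
}

# Usage (requires cluster access):
#   classify_auth_error "$(kubectl cluster-info 2>&1)"
```

Both recognized cases lead to the same remedy, regenerating the kubeconfig. A result of `other` suggests the problem lies elsewhere, for example in network connectivity.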
Ingress returns 404
Cause: the Ingress resource is misconfigured or the ingressNginx addon is not enabled on the cluster.
Solution:
- Verify that the ingressNginx addon is enabled in the cluster configuration:
    # cluster.yaml
    spec:
      addons:
        ingressNginx:
          enabled: true
- Verify that ingressClassName is specified in your Ingress:
    # ingress.yaml
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: my-app
    spec:
      ingressClassName: nginx
      rules:
        - host: app.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: my-app-svc
                    port:
                      number: 80
- Verify that the backend (Service + Pod) is running:
    kubectl get pods -l app=my-app
    kubectl get svc my-app-svc
- Check the host and path configuration in the Ingress rule.
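The ingressClassName check can also be run against the live object instead of the manifest file. The sketch below is a simple text filter, not a YAML parser: `ingress_class` is a hypothetical helper name, and it assumes the value is written unquoted on the same line as the key, as `kubectl get -o yaml` normally prints it.

```shell
# Hypothetical helper: print the ingressClassName found in an Ingress YAML
# document read from stdin. Exits with status 1 if the field is absent.
ingress_class() {
  awk '
    $1 == "ingressClassName:" && $2 != "" { print $2; found = 1; exit }
    END { exit (found ? 0 : 1) }'
}

# Usage (requires cluster access):
#   kubectl get ingress my-app -o yaml | ingress_class \
#     || echo "ingressClassName is not set"
```

For the example Ingress above, the expected value is `nginx`.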
PVC in Pending state
Cause: the requested storageClass does not exist or the storage capacity is insufficient.
Solution:
- The available storageClasses on Hikube are: local, replicated, and replicated-async.
- Make sure the name used in your PVC matches an existing storageClass:
    # pvc.yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-data
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: replicated
      resources:
        requests:
          storage: 10Gi
- Check events related to the PVC:
    kubectl describe pvc my-data
- If capacity is insufficient, reduce the requested size or contact Hikube support.
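The storageClass name check can be scripted against the three names listed above. This is a minimal sketch: `is_hikube_storage_class` is a hypothetical helper name, and the hard-coded list simply mirrors this page. `kubectl get storageclass` remains the authoritative source for what a given cluster offers.

```shell
# Hypothetical helper: succeed only if the given name is one of the
# storageClasses listed in this guide (local, replicated, replicated-async).
is_hikube_storage_class() {
  case "$1" in
    local|replicated|replicated-async) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires cluster access):
#   sc=$(kubectl get pvc my-data -o jsonpath='{.spec.storageClassName}')
#   is_hikube_storage_class "$sc" || echo "unknown storageClass: $sc"
```

An empty value means the PVC did not set storageClassName, so the helper rejects it as well.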