Domain 5 — Troubleshooting

CKA Preparation 2026 Domain 5 of 5
🔍 DOMAIN 5 · TROUBLESHOOTING ⚠️ 30% OF EXAM

Troubleshooting — The Exam’s Biggest Domain

20 labs covering the failure modes you are most likely to face in the exam and in production. This domain is worth 30% of the exam, more than any other. Start early and practice often.

⚠️ 30% Exam Weight — Highest
20 Labs · ~16 hours
GitHub Repository
opscart / production-cka
📁 troubleshooting/ — labs coming soon
51
Pod Troubleshooting
Pending, init containers, readiness probe failures
⏸ Pending · 40 min

Objectives

  • Diagnose a pod stuck in Pending — no nodes, taints, resource pressure
  • Debug a pod stuck in Init:0/1 — init container failing
  • Fix a pod failing readiness probe — wrong port or endpoint

Key Commands

kubectl describe pod <pod>                    # Events section solves 80% of issues
kubectl logs <pod> --previous
kubectl logs <pod> -c init-container
kubectl get events --sort-by=.lastTimestamp
⚡ Exam Tip

For every troubleshooting task, run describe first, then logs. The Events section of the describe output resolves about 80% of problems before you need to look elsewhere.
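
The describe-then-logs order can be wrapped in a small helper. This is a minimal sketch: the function name triage_pod and its arguments are illustrative and not part of kubectl, and it assumes kubectl is already pointed at the right cluster.

```shell
# triage_pod POD [NAMESPACE]
# Hypothetical helper: events first, then current logs, then previous-run logs.
triage_pod() {
  local pod="$1" ns="${2:-default}"

  # 1. Events: scheduling failures, failed mounts and probe failures show up here.
  kubectl describe pod "$pod" -n "$ns"

  # 2. Logs from every container of the current run.
  kubectl logs "$pod" -n "$ns" --all-containers=true

  # 3. Logs from the previous run; this errors harmlessly if nothing restarted.
  kubectl logs "$pod" -n "$ns" --previous || true
}
```

Usage: triage_pod <pod> <namespace>. For a pod stuck in Init:0/1, follow up with kubectl logs <pod> -c <init-container-name>.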

52
Deployment Failures
Rollout stuck, image pull, config errors
⏸ Pending · 45 min

Objectives

  • Debug a deployment rollout stuck at 0/3 ready
  • Identify wrong image tag, missing ConfigMap key, missing Secret as root causes
  • Use kubectl rollout status and check ReplicaSet events
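
One way to walk from a stuck rollout down to its root cause is sketched below. The helper name rollout_debug and the 30-second timeout are assumptions chosen for illustration; the kubectl subcommands themselves are standard.

```shell
# rollout_debug DEPLOYMENT [NAMESPACE]
# Hypothetical helper: inspects a rollout from the Deployment down to its pods.
rollout_debug() {
  local deploy="$1" ns="${2:-default}"

  # Give up after 30s instead of blocking on a rollout that never finishes.
  kubectl rollout status "deployment/$deploy" -n "$ns" --timeout=30s || true

  # Conditions and events recorded on the Deployment itself.
  kubectl describe deployment "$deploy" -n "$ns"

  # ReplicaSets in creation order; the last one belongs to the stuck revision.
  kubectl get rs -n "$ns" --sort-by=.metadata.creationTimestamp

  # Pod status usually names the cause: ErrImagePull, CreateContainerConfigError, ...
  kubectl get pods -n "$ns"
}
```

A pod in CreateContainerConfigError typically points at a missing ConfigMap key or Secret; kubectl describe pod on that pod names the missing object.
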
53
Node Troubleshooting
NotReady, disk pressure, memory pressure, kubelet failures
⏸ Pending · 50 min

Objectives

  • Diagnose a NotReady node — kubelet stopped, certificate expired, network plugin missing
  • Fix disk pressure by identifying and clearing large log files
  • Restart kubelet and verify node returns to Ready

Key Commands

kubectl describe node node01
systemctl status kubelet
journalctl -u kubelet -n 100 --no-pager
systemctl restart kubelet
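
For the disk pressure objective, finding the offending files is the first step. The sketch below assumes a find that accepts the k size suffix (GNU, BSD and BusyBox do); the helper name large_files and the 100 MiB default are illustrative.

```shell
# large_files DIR [MIN_KIB]
# Hypothetical helper: lists regular files under DIR larger than MIN_KIB
# kibibytes (default 102400, i.e. 100 MiB), staying on one filesystem.
large_files() {
  local dir="$1" min_kib="${2:-102400}"
  find "$dir" -xdev -type f -size +"${min_kib}"k 2>/dev/null
}
```

Usage: large_files /var/log. Emptying a log with truncate -s 0 <file> is usually safer than deleting it, because a process that still holds the file open keeps the space allocated after rm. For the systemd journal, journalctl --vacuum-size=200M trims stored logs to the given size.
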
54
Control Plane Failures
API server down, scheduler failures, crictl when kubectl is unavailable
⏸ Pending · 60 min

Objectives

  • Diagnose a broken API server static pod manifest
  • Fix a scheduler that stopped running
  • Use crictl when kubectl itself is unavailable

Key Commands

ls /etc/kubernetes/manifests/    # check here first
crictl ps -a                     # works even when kubectl is down
crictl logs <container-id>
⚡ Exam Tip

When kubectl does not respond, go straight to crictl: it talks to the container runtime directly, so it still works when the API server is down. This scenario comes up often on the CKA exam.
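
The crictl lookup can be reduced to one step per component. This is a sketch, assuming it runs on the control plane node with enough privileges to reach the runtime socket (often via sudo); the helper name cp_container_logs is made up for illustration.

```shell
# cp_container_logs NAME [LINES]
# Hypothetical helper: prints the last LINES log lines of the first container
# whose name matches NAME, e.g. kube-apiserver or kube-scheduler.
cp_container_logs() {
  local name="$1" lines="${2:-50}"
  local id

  # -a includes exited containers, --name filters by name, -q prints IDs only.
  id="$(crictl ps -a --name "$name" -q | head -n 1)"
  if [ -z "$id" ]; then
    echo "no container matching '$name'; check /etc/kubernetes/manifests/" >&2
    return 1
  fi
  crictl logs --tail="$lines" "$id"
}
```

Usage: cp_container_logs kube-apiserver 100. If no container exists at all, the static pod manifest is probably malformed, and journalctl -u kubelet is the next place to look.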

55
Networking Issues
Pod-to-pod failures, CNI plugin down, NetworkPolicy blocking
⏸ Pending · 50 min

Objectives

  • Debug pod-to-pod connectivity failure due to CNI plugin crash
  • Identify a NetworkPolicy blocking legitimate traffic
  • Use kubectl exec with nc/curl to isolate where traffic breaks
56
Service Connectivity
Wrong selector, missing endpoints, port mismatch
⏸ Pending · 45 min

Objectives

  • Fix a Service with no endpoints — wrong label selector
  • Debug a port mismatch between the Service targetPort and the port the container listens on

Key Commands

kubectl get endpoints rx-api     # none = selector mismatch
kubectl describe svc rx-api
kubectl get pods --show-labels
⚡ Exam Tip

Empty endpoints is the #1 service issue on the exam. If kubectl get endpoints shows <none>, the selector most likely does not match any pod labels, or no matching pod is Ready.
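
Comparing the selector against the pod labels side by side makes a mismatch easy to spot. A minimal sketch, with svc_selector_check as a hypothetical helper name:

```shell
# svc_selector_check SERVICE [NAMESPACE]
# Hypothetical helper: prints the Service selector, the pod labels and the
# resulting endpoints so they can be compared by eye.
svc_selector_check() {
  local svc="$1" ns="${2:-default}"

  echo "selector:"
  kubectl get svc "$svc" -n "$ns" -o jsonpath='{.spec.selector}'
  echo

  echo "pod labels:"
  kubectl get pods -n "$ns" --show-labels

  echo "endpoints:"
  kubectl get endpoints "$svc" -n "$ns"
}
```

Usage: svc_selector_check rx-api rx-prod. Every key/value pair in the selector must appear on a pod for that pod to become an endpoint.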

57
DNS Resolution Failures
CoreDNS down, broken Corefile, NXDOMAIN debugging
⏸ Pending · 40 min

Objectives

  • Identify CoreDNS crash as root cause of cluster-wide DNS failure
  • Fix a broken Corefile causing CoreDNS to refuse to start
  • Debug cross-namespace DNS using full FQDN
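
Cross-namespace lookups need the fully qualified name, which follows the pattern <service>.<namespace>.svc.<cluster-domain>. The sketch below assumes the default cluster domain cluster.local; the helper name svc_fqdn is illustrative.

```shell
# svc_fqdn SERVICE NAMESPACE [CLUSTER_DOMAIN]
# Hypothetical helper: builds the in-cluster DNS name of a Service.
# cluster.local is the usual default domain, but a cluster may override it.
svc_fqdn() {
  echo "$1.$2.svc.${3:-cluster.local}"
}

# Example lookup from a throwaway pod (requires a working cluster):
#   kubectl run dnstest --rm -it --restart=Never --image=busybox:1.28 \
#     -- nslookup "$(svc_fqdn rx-api rx-prod)"
```

If the FQDN resolves but the short name does not, the client pod is simply in a different namespace. If neither resolves, check the CoreDNS pods in kube-system.
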
58
Resource Exhaustion
OOMKilled, CPU throttling, node pressure evictions
⏸ Pending · 45 min

Objectives

  • Identify an OOMKilled container and increase its memory limit
  • Observe CPU throttling and fix with correct limits
  • Understand eviction thresholds and prevention
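
Confirming an OOM kill and raising the limit can be sketched in two steps. Both helper names are hypothetical, and the right memory value depends on the workload.

```shell
# last_termination POD [NAMESPACE]
# Hypothetical helper: prints name, last termination reason and exit code
# for each container; OOMKilled shows up in the reason column.
last_termination() {
  local pod="$1" ns="${2:-default}"
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
}

# raise_memory_limit DEPLOYMENT LIMIT [NAMESPACE]
# Hypothetical helper: sets the memory limit on all containers of a Deployment,
# which triggers a new rollout.
raise_memory_limit() {
  local deploy="$1" limit="$2" ns="${3:-default}"
  kubectl set resources "deployment/$deploy" -n "$ns" --limits="memory=$limit"
}
```

Usage: last_termination <pod> rx-prod, then raise_memory_limit <deployment> 512Mi rx-prod.
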
59
Application Logging
kubectl logs, multi-container pods, log streaming
⏸ Pending · 40 min

Objectives

  • Retrieve logs from a specific container in a multi-container pod
  • Stream logs and filter with grep
  • Access logs from a previously crashed container

Key Commands

kubectl logs <pod> -c <container> -f
kubectl logs <pod> --previous
kubectl logs <pod> --since=10m | grep ERROR
60
Monitoring Basics
kubectl top, metrics-server, resource visibility
⏸ Pending · 45 min

Objectives

  • Install metrics-server and verify kubectl top nodes and kubectl top pods
  • Identify highest CPU and memory consuming pods in the cluster
61
kubectl Debugging
exec, debug, port-forward, ephemeral containers
⏸ Pending · 35 min

Objectives

  • Add an ephemeral debug container with kubectl debug
  • Port-forward a service to local machine for testing
  • Copy files out of a container with kubectl cp

Key Commands

kubectl debug -it <pod> --image=busybox --target=<container>
kubectl port-forward svc/rx-api 8080:80
kubectl cp <pod>:/app/logs/error.log ./error.log
62
Events Inspection
Cluster events, warning patterns, filtering
⏸ Pending · 40 min

Objectives

  • List all Warning events cluster-wide sorted by time
  • Filter events by resource type and namespace
  • Interpret BackOff, FailedScheduling, FailedMount messages

Key Commands

kubectl get events -A --sort-by=.lastTimestamp | grep Warning
kubectl get events -n rx-prod --field-selector=reason=BackOff
63
Container Failures
Exit codes, OOMKilled, liveness probe failures
⏸ Pending · 45 min

Objectives

  • Interpret exit codes: 0 = success, 1 = application error, 137 = SIGKILL (128 + 9, typically OOMKilled), 143 = SIGTERM (128 + 15)
  • Fix a failing liveness probe — wrong path, wrong port, too aggressive timing
  • Understand restartPolicy: Always, OnFailure, Never
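
The exit codes above follow one rule: a code above 128 means the process was killed by signal (code - 128). A minimal sketch of that rule, with explain_exit_code as a hypothetical helper name:

```shell
# explain_exit_code CODE
# Hypothetical helper: decodes a container exit code using the
# "128 + signal number" convention.
explain_exit_code() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo "success"
  elif [ "$code" -gt 128 ]; then
    # 137 -> signal 9 (SIGKILL), 143 -> signal 15 (SIGTERM)
    echo "killed by signal $((code - 128))"
  else
    echo "application error"
  fi
}
```

Exit code 137 only says the process received SIGKILL. Whether the kernel OOM killer sent it is confirmed by the reason OOMKilled in kubectl describe pod.
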
64
CrashLoopBackOff
Root causes, systematic debugging, fix patterns
⏸ Pending · 40 min

Objectives

  • Systematically identify the 6 most common CrashLoopBackOff causes
  • Fix: wrong command, missing env var, missing Secret, wrong image, permission denied
  • Override entrypoint temporarily to keep container alive for debugging
⚡ Exam Tip

Always check kubectl logs <pod> --previous first. Crash logs from the last run are the fastest path to root cause.
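
When the logs are not enough, one way to override the entrypoint is to debug a copy of the pod with its command replaced. This sketch assumes the image ships a shell at sh; the helper name debug_copy and the -debug suffix are illustrative.

```shell
# debug_copy POD CONTAINER [NAMESPACE]
# Hypothetical helper: creates <pod>-debug as a copy of POD in which
# CONTAINER runs an interactive shell instead of its crashing command.
debug_copy() {
  local pod="$1" container="$2" ns="${3:-default}"
  kubectl debug "$pod" -n "$ns" -it \
    --copy-to="${pod}-debug" \
    --container="$container" \
    -- sh
}
```

Usage: debug_copy <pod> <container> rx-prod, inspect the filesystem and environment, then remove the copy with kubectl delete pod <pod>-debug.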

65
ImagePullBackOff
Wrong tag, private registry, missing imagePullSecret
⏸ Pending · 35 min

Objectives

  • Fix a pod failing with ImagePullBackOff due to wrong image tag
  • Create a docker-registry Secret and reference it under imagePullSecrets in the pod spec or ServiceAccount
  • Distinguish ErrImagePull from ImagePullBackOff
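
Creating the registry Secret and attaching it can be sketched as follows. The helper name add_pull_secret is hypothetical, and attaching the Secret to the default ServiceAccount is one option among several; listing it under imagePullSecrets in the pod spec works as well.

```shell
# add_pull_secret NAME SERVER USER PASSWORD [NAMESPACE]
# Hypothetical helper. Passing a password as an argument leaves it in shell
# history, so treat this as a lab convenience only.
add_pull_secret() {
  local name="$1" server="$2" user="$3" pass="$4" ns="${5:-default}"

  kubectl create secret docker-registry "$name" -n "$ns" \
    --docker-server="$server" \
    --docker-username="$user" \
    --docker-password="$pass"

  # Pods created afterwards under the default ServiceAccount inherit the Secret.
  kubectl patch serviceaccount default -n "$ns" \
    -p "{\"imagePullSecrets\":[{\"name\":\"$name\"}]}"
}
```

Existing pods do not pick up the change; delete them so their controller recreates them. ErrImagePull is the first failed attempt, and ImagePullBackOff is the kubelet waiting before it retries.
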
66
Certificate Issues
Expired certs, kubeadm cert renew, TLS debugging
⏸ Pending · 50 min

Objectives

  • Check certificate expiry dates for all cluster certificates
  • Renew all certificates with kubeadm certs renew all
  • Debug TLS handshake failures using openssl s_client

Key Commands

kubeadm certs check-expiration
kubeadm certs renew all
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -dates
67
kubelet Failures
Config errors, certificate issues, cgroup driver mismatch
⏸ Pending · 45 min

Objectives

  • Diagnose kubelet failure from journalctl output
  • Fix a wrong --config path in the kubelet systemd unit
  • Resolve cgroup driver mismatch between containerd and kubelet
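
A cgroup driver mismatch can be checked by comparing the two config files. The sketch below assumes the kubeadm default paths /etc/containerd/config.toml and /var/lib/kubelet/config.yaml, and assumes the kubelet falls back to cgroupfs when cgroupDriver is unset; newer kubelets can also pick the driver up from the runtime. The helper name cgroup_drivers is illustrative.

```shell
# cgroup_drivers [CONTAINERD_TOML] [KUBELET_YAML]
# Hypothetical helper: prints both drivers and returns non-zero on a mismatch.
cgroup_drivers() {
  local toml="${1:-/etc/containerd/config.toml}"
  local yaml="${2:-/var/lib/kubelet/config.yaml}"
  local runtime kubelet

  # containerd uses the systemd driver only when SystemdCgroup = true.
  if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$toml"; then
    runtime=systemd
  else
    runtime=cgroupfs
  fi

  # Assumption: an unset cgroupDriver means cgroupfs.
  kubelet="$(sed -n 's/^cgroupDriver:[[:space:]]*//p' "$yaml")"
  kubelet="${kubelet:-cgroupfs}"

  echo "containerd=$runtime kubelet=$kubelet"
  [ "$runtime" = "$kubelet" ]
}
```

After aligning the two settings, restart both services with systemctl restart containerd kubelet and confirm the node returns to Ready.
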
68
Control Plane Issues (Advanced)
etcd quorum loss, scheduler malfunction, API server flags
⏸ Pending · 55 min

Objectives

  • Simulate etcd quorum loss and restore from snapshot
  • Fix a scheduler that stopped assigning pods
  • Identify a broken API server flag preventing admission
69
Performance Tuning
Resource requests, limits tuning, node capacity planning
⏸ Pending · 50 min

Objectives

  • Identify over-provisioned and under-provisioned pods using metrics
  • Set correct requests based on actual observed usage
  • Use LimitRange to enforce sensible namespace defaults
70
🎯 Exam Scenarios — Full Simulation
Timed 2-hour practice exam across all 5 domains
⏸ Pending · 120 min

What this lab contains

  • 20 tasks covering all 5 domains at exam weight proportions
  • Strict 2-hour limit — no solutions mid-way
  • Cluster pre-broken in multiple ways — find and fix each issue
  • Scoring: 74%+ = exam ready
⚡ Final Advice

Complete this lab twice before booking the exam. The first attempt reveals gaps; the second confirms readiness. If you score 74% or higher under time pressure, you are ready.

🏭 Also use killer.sh

Your CKA registration includes 2 free killer.sh simulator sessions. Use them in Weeks 9 and 10. killer.sh is intentionally harder than the real exam.
