Troubleshooting quick tips
Something went wrong? Here are the most common issues and how to fix them fast. For a more detailed guide, see the full troubleshooting page.
Pods are not running
Check the status of all pods:
kubectl get po -A
Every pod should be in Running or Completed state. If a pod is stuck in Pending, CrashLoopBackOff, or Error, describe it to get details:
kubectl -n armonik describe po <PodName>
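On a busy cluster the full pod list can be long, so it may help to show only the pods that need attention. The following is a minimal sketch, not part of ArmoniK: filter_unhealthy_pods is a hypothetical helper name, and it assumes the default column layout of kubectl get po -A --no-headers (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: print "namespace/name status" for every pod whose
# STATUS column (4th field) is neither Running nor Completed.
# Expects the output of `kubectl get po -A --no-headers` on stdin.
filter_unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Usage (requires access to the cluster):
#   kubectl get po -A --no-headers | filter_unhealthy_pods
```

Each line of output names a pod that can then be passed to the describe command above.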
Tasks are submitted but never processed
Check the partition name: make sure the client is submitting to a partition that exists and has workers assigned to it.
Open logs: check the worker and control plane logs for errors. On a local deployment, use Seq. On cloud deployments, use the equivalent service (CloudWatch on AWS, Cloud Monitoring on GCP).
Check worker pods: run the following command and look for worker pods in an error state:
kubectl -n armonik get po
Local deployment stops working after a few days
MongoDB and Redis certificates generated during a local deployment are valid for only 7 days by default. Once they expire, services that connect to MongoDB or Redis start failing.
The quickest fix is to destroy and redeploy:
terraform destroy
terraform apply
To avoid this in the future, set validity_period_hours to a larger value in your parameters.tfvars before deploying (e.g. 8760, which is 365 days of 24 hours, for one year).
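Because validity_period_hours is expressed in hours, the value for a given number of days is simply days times 24. A small sketch of that arithmetic follows; days_to_hours is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: convert a validity period in days to hours.
days_to_hours() {
  echo $(( $1 * 24 ))
}

days_to_hours 7     # prints 168 (the 7-day default)
days_to_hours 365   # prints 8760 (one year)
```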
Cannot connect to the Admin GUI or Seq
Retrieve the URLs from the Terraform outputs:
# Seq
grep seq monitoring/generated/monitoring-output.json
# Admin GUI
grep admin_gui armonik/generated/armonik-output.json
Default ports are 8080 for Seq and 5000 for the Admin GUI.
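The grep commands above print whole matching lines. To keep only the URL itself, something like the sketch below may be enough. It assumes the URL appears as a quoted string on the same line as the search term, which is also what the grep commands rely on; urls_for is a hypothetical helper name, and the exact layout of the output files is not confirmed here.

```shell
# Hypothetical helper: print every http(s) URL found on lines of stdin
# that mention the given search term.
urls_for() {
  grep "$1" | grep -Eo 'https?://[^"]+'
}

# Usage:
#   urls_for seq < monitoring/generated/monitoring-output.json
#   urls_for admin_gui < armonik/generated/armonik-output.json
```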
A pod is stuck in Terminating
Force-delete it:
kubectl -n armonik delete po --force <PodName>
Verify it is gone:
kubectl -n armonik get po --field-selector metadata.name=<PodName>
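The verification step can also be scripted. The sketch below relies on kubectl writing nothing to standard output when no pod matches the field selector; pod_is_gone is a hypothetical helper name, and the armonik namespace is taken from the commands above.

```shell
# Hypothetical helper: succeed (0) when no pod with the given name exists
# in the armonik namespace, return 1 when it is still listed, and return 2
# when kubectl itself fails (for example, no cluster access).
pod_is_gone() {
  out=$(kubectl -n armonik get po --field-selector "metadata.name=$1" \
        --no-headers 2>/dev/null) || return 2
  [ -z "$out" ]
}

# Usage (requires access to the cluster):
#   pod_is_gone <PodName> && echo "pod deleted"
```

Returning a separate code when kubectl fails avoids reporting a pod as deleted merely because the cluster could not be reached.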