Troubleshooting quick tips

Something went wrong? Here are the most common issues and how to fix them fast. For a more detailed guide, see the full troubleshooting page.

Pods are not running

Check the status of all pods:

kubectl get po -A

Every pod should be in Running or Completed state. If a pod is stuck in Pending, CrashLoopBackOff, or Error, describe it to get details:

kubectl -n armonik describe po <PodName>
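
On a busy cluster the full pod list gets long. As a rough sketch, a small filter can hide everything that is already Running or Completed. The function name unhealthy_pods is made up for this example, and it assumes the default table output of kubectl get po, with a header row containing a STATUS column.

```shell
# Keep the header row plus any pod whose STATUS is neither Running nor Completed.
# The STATUS column is located from the header, so this works with and without -A.
unhealthy_pods() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "STATUS") col = i; print; next }
    col && $col != "Running" && $col != "Completed"
  '
}

# Usage (hypothetical helper, not part of kubectl):
#   kubectl get po -A | unhealthy_pods
#   kubectl -n armonik get po | unhealthy_pods
```

A pod can report Running while its containers are not ready, so an empty result does not prove that everything is healthy.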

Check the partition name: make sure the client is submitting to a partition that exists and has workers assigned to it.
Open logs: check the logs for errors from the worker or control plane. On a local deployment, use Seq. On cloud deployments, use the equivalent service (CloudWatch on AWS, Cloud Logging on GCP).
Check worker pods: run kubectl -n armonik get po and look for worker pods in an error state.

MongoDB and Redis certificates generated during a local deployment are valid for only 7 days by default. After that, services will fail.

The quickest fix is to destroy and redeploy:

terraform destroy
terraform apply

To avoid this in the future, set validity_period_hours in your parameters.tfvars before deploying (e.g. 8760 for one year).
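
The value is expressed in hours, so it is easy to get wrong by a factor of 24. A minimal sketch of the conversion follows. The helper name validity_hours is invented for this example and is not part of the deployment scripts.

```shell
# Convert a validity period in days to the hour count expected by
# validity_period_hours (hypothetical helper, plain shell arithmetic).
validity_hours() {
  echo $(( $1 * 24 ))
}

# validity_hours 7    prints 168   (the 7-day default)
# validity_hours 365  prints 8760  (one year)
```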

Retrieve the Seq and Admin GUI URLs from the Terraform outputs:

# Seq
grep seq monitoring/generated/monitoring-output.json

# Admin GUI
grep admin_gui armonik/generated/armonik-output.json

Default ports are 8080 for Seq and 5000 for the Admin GUI.
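
If the lookup prints nothing, the output file may simply not exist in the current directory. The sketch below wraps the same grep with an explicit check. The function name print_output_entry is hypothetical, and the relative paths are the ones shown above.

```shell
# Print the lines of a Terraform output file that mention a given key,
# and report clearly when the file itself is missing.
print_output_entry() {
  file="$1"
  key="$2"
  if [ ! -f "$file" ]; then
    echo "not found: $file (check your working directory)" >&2
    return 1
  fi
  grep -- "$key" "$file"
}

# Usage:
#   print_output_entry monitoring/generated/monitoring-output.json seq
#   print_output_entry armonik/generated/armonik-output.json admin_gui
```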

Force-delete the pod:

kubectl -n armonik delete po --force <PodName>

Verify it is gone:

kubectl -n armonik get po --field-selector metadata.name=<PodName>
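
Deletion can take a moment to show up, so a single check may still list the pod. As a sketch, the check can be repeated until the pod disappears. The function name wait_pod_gone and the default of 30 attempts are assumptions made for this example.

```shell
# Poll once per second until the named pod no longer appears.
# Returns 0 when the pod is gone, 1 if it is still listed after all attempts.
wait_pod_gone() {
  ns="$1"
  pod="$2"
  tries="${3:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -o name prints "pod/<name>" for a match and nothing on stdout otherwise.
    if [ -z "$(kubectl -n "$ns" get po --field-selector "metadata.name=$pod" -o name 2>/dev/null)" ]; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_pod_gone armonik <PodName> && echo "pod removed"
```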