Troubleshooting

Common failures and how to resolve them.

App container restarts on boot

Run docker logs ir-mlab-app-1 and look for the reason the container exited. The most common causes are:

  • DB not healthy yet — harmless; the app retries until MySQL is up.
  • Invalid SESSION_SECRET — must be at least 32 characters.
  • Missing LICENSE_KEY — the app refuses to boot without one.
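
The two configuration causes above can be checked before the container is restarted. The following is a minimal sketch, not part of the product's tooling: check_boot_env is a hypothetical helper name, and it assumes SESSION_SECRET and LICENSE_KEY are set in the current shell (for example, sourced from the same env file the container uses). The 32-character minimum and the license requirement are taken from the list above.

```shell
# Hypothetical pre-flight check; the function name is illustrative.
# Assumes SESSION_SECRET and LICENSE_KEY are set in the current shell.
check_boot_env() {
  _rc=0
  _secret=${SESSION_SECRET:-}
  # SESSION_SECRET must be at least 32 characters.
  if [ "${#_secret}" -lt 32 ]; then
    echo "SESSION_SECRET too short: ${#_secret} chars (need >= 32)"
    _rc=1
  fi
  # The app refuses to boot without a license key.
  if [ -z "${LICENSE_KEY:-}" ]; then
    echo "LICENSE_KEY is missing"
    _rc=1
  fi
  if [ "$_rc" -eq 0 ]; then
    echo "env looks OK"
  fi
  return "$_rc"
}
```

If the secret turns out to be too short, openssl rand -hex 32 prints 64 hexadecimal characters, which satisfies the minimum.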

Healthcheck returns license: locked

The instance has not been able to reach mlab.sh for 48 hours or more. Check that the executor container can make outbound HTTPS connections to mlab.sh:443. Once connectivity is restored, the lock clears automatically within one hour.
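
One way to test outbound HTTPS from inside the executor container is sketched below. It rests on two assumptions: that curl is installed in that container, and that you pass the executor container's actual name as shown by docker ps (this page does not state it). check_license_egress is a hypothetical helper name.

```shell
# Hypothetical helper; $1 is the executor container name from `docker ps`.
# Assumes curl is available inside that container.
check_license_egress() {
  # -sS: quiet, but still print errors.
  # --max-time 10: fail fast if egress is blocked.
  # -w '%{http_code}\n': print only the HTTP status code.
  docker exec "$1" curl -sS --max-time 10 -o /dev/null \
    -w '%{http_code}\n' https://mlab.sh/
}

# Usage:
#   check_license_egress <executor-container>
```

Any real HTTP status code in the output means the TCP connection and TLS handshake to mlab.sh:443 succeeded. A curl error, or a status of 000, points to DNS, firewall, or proxy problems on the host or network.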

Alerts not appearing in the queue

  • Confirm the API key has the alerts:write scope.
  • Check raw request logs under Settings > Audit log.
  • If the request returned 409 deduplicated, an existing alert absorbed it. Look it up by external_id.
  • If the result is suppressed, an active suppression rule matched. Review the rules under Settings > Suppression.

Triage queue is slow

Two likely causes:

  • ClickHouse OOM: analytics-heavy filters need RAM. If the container is running with the default 2 GB, raise its limit to 4 GB.
  • Retention not applied: check that RETENTION_ALERTS_DAYS is set to a finite value. Closed alerts older than that many days are purged nightly.
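
Both causes can be checked from the host with standard Docker commands. The helpers below are a sketch with hypothetical names. The ClickHouse container name is not given on this page, so take it from docker ps. The retention check assumes RETENTION_ALERTS_DAYS is set in the app container's environment.

```shell
# Hypothetical helpers; pass the ClickHouse container name from `docker ps`.

# Prints "true" if the container's last run was ended by the OOM killer.
clickhouse_oom_killed() {
  docker inspect --format '{{.State.OOMKilled}}' "$1"
}

# Raises the memory limit of a running container to 4 GB.
# --memory-swap is set to the same value so the new memory limit does not
# exceed a previously configured swap limit.
raise_clickhouse_memory() {
  docker update --memory 4g --memory-swap 4g "$1"
}

# Prints the retention setting as seen by the app container.
# Assumes the variable lives in that container's environment.
show_retention_days() {
  docker exec "${1:-ir-mlab-app-1}" printenv RETENTION_ALERTS_DAYS
}
```

A limit set with docker update is lost when the container is recreated, so if the stack is managed by Compose the lasting change belongs in the Compose file.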

Webhook delivery failing

Check Settings > Webhooks > Delivery log. Each attempt records the response code and body. Failed deliveries are retried with exponential backoff for up to 24 h, after which the webhook is auto-paused.
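
To get a feel for how an exponential backoff fills a 24 h window, the sketch below prints a retry schedule. Only the 24 h limit comes from this page. The 60-second first wait and the doubling factor are assumptions chosen for illustration, and the product's real intervals may differ. backoff_schedule is a hypothetical helper name.

```shell
# Illustrative only: the first wait and the doubling factor are assumed.
# Prints one line per retry that fits inside the retry window.
backoff_schedule() {
  _base=${1:-60}        # first wait in seconds (assumed value)
  _window=${2:-86400}   # retry window in seconds (24 h)
  _n=1
  _delay=$_base
  _elapsed=0
  while [ $((_elapsed + _delay)) -le "$_window" ]; do
    _elapsed=$((_elapsed + _delay))
    echo "retry $_n: wait ${_delay}s, elapsed ${_elapsed}s"
    _delay=$((_delay * 2))   # double the wait after every attempt
    _n=$((_n + 1))
  done
}

# Usage:
#   backoff_schedule 60 86400
```

With these assumed values, ten retries fit in the window, and the last one happens about 17 hours after the first failure. A receiver that stays down longer than that would see the webhook paused.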

Forgot admin password

Run the recovery command inside the app container:

terminal
docker exec -it ir-mlab-app-1 \
  /app_mlab_sh/bin/admin-reset --email admin@localhost

A one-time reset token is printed; use it at /auth/reset?token=....

Still stuck?

Email [email protected] with the output of /healthz, your license tier, and the relevant docker logs snippet. Paid plans get priority routing.
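
The /healthz output and a log snippet can be gathered into one file before writing the email. This is a sketch under stated assumptions: collect_support_bundle is a hypothetical helper, the base URL is whatever address your instance is served on (this page does not state it), and 200 log lines is an arbitrary choice. The license tier still has to be added by hand.

```shell
# Hypothetical helper.
# $1: base URL of your instance (an assumption; use your own address).
# $2: output file (defaults to support-bundle.txt).
collect_support_bundle() {
  _base_url=$1
  _out=${2:-support-bundle.txt}
  {
    echo "== /healthz =="
    curl -sS --max-time 10 "$_base_url/healthz"
    echo
    echo "== docker logs (last 200 lines) =="
    # Container logs can arrive on stderr, so merge both streams.
    docker logs --tail 200 ir-mlab-app-1 2>&1
  } > "$_out"
  echo "wrote $_out"
}

# Usage:
#   collect_support_bundle <base-url> bundle.txt
```

Read through the file before sending it, since application logs can contain values you may not want to share.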