Disaster Recovery

Define how to recover Get2Dial after a major failure such as host loss, data corruption, or a region outage.

DR builds on backups and reproducible deploys. The goal is to rebuild the control plane from infrastructure-as-code, restore state from backups, and then re-point or rebuild the edges. Define and track your RPO (recovery point objective: the maximum acceptable data loss, measured as time since the last usable backup) and RTO (recovery time objective: the maximum acceptable downtime).
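To make the two targets concrete, here is a minimal sketch of how achieved RPO and RTO could be computed from an incident or game-day timeline. The timestamps, the targets, and the function names are all illustrative assumptions, not values or tooling from Get2Dial.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_good_backup: datetime, failure_time: datetime) -> timedelta:
    """Data-loss window: everything written after the last usable backup is lost."""
    return failure_time - last_good_backup


def achieved_rto(failure_time: datetime, service_restored: datetime) -> timedelta:
    """Downtime: from the failure until calls flow end to end again."""
    return service_restored - failure_time


def meets_target(achieved: timedelta, target: timedelta) -> bool:
    """A target is met when the achieved value does not exceed it."""
    return achieved <= target


# Hypothetical timeline and targets, for illustration only.
failure = datetime(2024, 1, 10, 12, 0)
last_backup = datetime(2024, 1, 10, 11, 30)
restored = datetime(2024, 1, 10, 14, 15)

rpo = achieved_rpo(last_backup, failure)
rto = achieved_rto(failure, restored)

print("RPO met:", meets_target(rpo, timedelta(hours=1)))
print("RTO met:", meets_target(rto, timedelta(hours=2)))
```

In this made-up timeline the backup is 30 minutes old at the moment of failure, which fits a one-hour RPO, while the 2 hour 15 minute recovery misses a two-hour RTO.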

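Keep compose files in version control and secrets in a vault so that a new host can be stood up from scratch. Pin image tags rather than relying on a floating tag such as latest, so that a rebuild pulls the same images that were running before the failure.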
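As a rough illustration of what "pinned" means, the sketch below checks an image reference string. The helper `is_pinned` and the example image names are hypothetical, and the rule it applies (a digest, or any explicit tag other than latest) is a simplifying assumption. A tag can still be re-pushed upstream, so a digest is the stricter form of pinning.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference names a digest or an explicit non-latest tag."""
    # A digest reference (name@sha256:...) is immutable, so it always counts.
    if "@sha256:" in image_ref:
        return True
    # Only look at the last path segment so a registry port such as
    # "registry.example.com:5000" is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag given: Docker falls back to "latest"
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


# Hypothetical image references, for illustration only.
for ref in [
    "postgres:16.3",
    "postgres",
    "minio/minio:latest",
    "registry.example.com:5000/get2dial/api",
    "registry.example.com:5000/get2dial/api:1.4.2",
]:
    print(f"{ref}: {'pinned' if is_pinned(ref) else 'NOT pinned'}")
```

A check like this could run in CI against the image values in the compose files, after any variables such as IMAGE_TAG have been resolved.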

Recovery outline:

  1. Provision a new host; install Docker + Caddy.
  2. Restore PostgreSQL and MinIO from off-site backups (see Restore).
  3. Bring up the stack with a pinned IMAGE_TAG.
  4. Re-register edges and verify call flow end to end.

  • DR is only real if rehearsed, so schedule game-days.
  • Know your RPO/RTO targets and measure against them.
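For game-days, the recovery outline above could be driven by a small runner that executes the steps in order, stops at the first failure, and records how long each step took. This is only a sketch: `run_recovery` and `placeholder` are hypothetical names, and the placeholder stands in for whatever provisioning, restore, and verification commands your environment actually uses.

```python
import time
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_recovery(steps: List[Step]) -> List[Tuple[str, float]]:
    """Run steps in order; return (name, seconds) per step, halting on the first error."""
    timings: List[Tuple[str, float]] = []
    for name, action in steps:
        start = time.monotonic()
        try:
            action()
        except Exception as exc:
            # Later steps depend on earlier ones, so do not continue past a failure.
            raise RuntimeError(f"recovery halted at step '{name}'") from exc
        timings.append((name, time.monotonic() - start))
    return timings


def placeholder() -> None:
    """Stand-in for real work (assumption: replace with your own commands)."""


steps: List[Step] = [
    ("provision host, install Docker + Caddy", placeholder),
    ("restore PostgreSQL and MinIO from off-site backups", placeholder),
    ("bring up the stack with a pinned IMAGE_TAG", placeholder),
    ("re-register edges and verify call flow", placeholder),
]

for name, seconds in run_recovery(steps):
    print(f"{seconds:8.2f}s  {name}")
```

Summing the recorded durations gives a first estimate of the RTO actually achieved in the rehearsal, which can then be compared against the target.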