$ replaybook

Scenarios

Scenario packs

Scenarios live in separate repos and are cloned into ~/.local/share/replaybook/scenarios/ via replaybook add.

Official pack: ducks/replaybook-scenarios

ID                        Title                          Difficulty
001-nginx-502             502 Bad Gateway                1
002-postgres-wont-start   Postgres Won't Start           1
003-missing-env-var       App Crashing on Boot           1
004-disk-full             Health Checks Failing          2
005-oom-kill              Container Keeps Restarting     2
006-sidekiq-cant-connect  Jobs Not Processing            2
007-packet-loss           Intermittent Request Failures  3

Writing scenarios

Each scenario is a directory with:

my-scenario/
  meta.json            # id, title, page text, difficulty, hints, success condition
  docker-compose.yml   # the environment
  break.sh             # runs after compose up to inject the fault (or use break: [...] below)
  check.sh             # polled every 2s to detect resolution (or use http_200)

meta.json format:

{
  "id": "my-scenario",
  "title": "Something Is Broken",
  "page": "alert text shown to the player",
  "difficulty": 2,
  "hints": [
    "First hint revealed on first get-hint",
    "Second hint revealed on second get-hint"
  ],
  "success_condition": "http_200",
  "success_target": "http://localhost:8080/health",
  "shell_service": "app"
}

shell_service is the compose service the player is dropped into. If unset, it defaults to the first service defined in docker-compose.yml. See any scenario in ducks/replaybook-scenarios for a working example.
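
To make the http_200 success condition concrete, here is a minimal Python sketch of what such a poll could look like. It assumes http_200 means "a GET to success_target returns status 200" and that it is polled on the same 2-second interval mentioned for check.sh; neither detail is confirmed here. The names fetch_status and wait_for_http_200 and the max_polls bound are illustrative, not replaybook's actual implementation.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_status(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status code for a GET to url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status (e.g. 502)
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def wait_for_http_200(
    url: str,
    interval: float = 2.0,  # assumed to match the 2s polling interval
    max_polls: int = 300,   # hypothetical upper bound so the loop terminates
    fetch: Callable[[str], Optional[int]] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url until it returns 200; give up after max_polls attempts."""
    for attempt in range(max_polls):
        if fetch(url) == 200:
            return True
        if attempt < max_polls - 1:
            sleep(interval)
    return False
```

With the meta.json above, the call would be wait_for_http_200("http://localhost:8080/health"). The fetch and sleep parameters exist only so the loop can be exercised without a network or real delays.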

Fault injection: break.sh vs break steps

Most faults are just "copy a file in" and/or "run a command in a container." Instead of writing break.sh, add a break array to meta.json:

"break": [
  { "cp": { "service": "nginx", "src": "nginx-broken.conf", "dest": "/etc/nginx/nginx.conf" } },
  { "exec": { "service": "nginx", "cmd": ["nginx", "-s", "reload"] } }
]

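Steps run in order. There are three kinds: cp copies a file into a service's container, exec runs a command in a service's container, and restart restarts a service. For example: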

"break": [
  { "exec": { "service": "db", "cmd": ["chown", "-R", "root:root", "/var/lib/postgresql/data"] } },
  { "restart": { "service": "db" } }
]

If break is present, it's used instead of break.sh. If a fault needs real script logic (loops, conditionals, piping between commands), write break.sh instead; it still works exactly as before.
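
One way to read the break array is as an ordered list of Docker Compose invocations. The Python sketch below maps the three step kinds that appear in the examples above (cp, exec, restart) onto docker compose cp, docker compose exec -T, and docker compose restart. This mapping is an assumption made for illustration; replaybook may run the steps differently. step_to_command and plan are hypothetical names.

```python
from typing import Any, Dict, List


def step_to_command(step: Dict[str, Any]) -> List[str]:
    """Translate one break step into a docker compose argv (assumed mapping)."""
    if len(step) != 1:
        raise ValueError(f"a step needs exactly one kind, got {sorted(step)}")
    kind, args = next(iter(step.items()))
    if kind == "cp":
        # docker compose cp SRC SERVICE:DEST
        return ["docker", "compose", "cp", args["src"], f'{args["service"]}:{args["dest"]}']
    if kind == "exec":
        # -T disables TTY allocation, which suits non-interactive use
        return ["docker", "compose", "exec", "-T", args["service"], *args["cmd"]]
    if kind == "restart":
        return ["docker", "compose", "restart", args["service"]]
    raise ValueError(f"unknown break step kind: {kind!r}")


def plan(break_steps: List[Dict[str, Any]]) -> List[List[str]]:
    """Return the commands for all steps, preserving their order."""
    return [step_to_command(step) for step in break_steps]


# The Postgres example from above: break ownership, then restart the service.
steps = [
    {"exec": {"service": "db", "cmd": ["chown", "-R", "root:root", "/var/lib/postgresql/data"]}},
    {"restart": {"service": "db"}},
]
for argv in plan(steps):
    print(" ".join(argv))
```

Seen this way, each step is a single command with fixed arguments, which is why anything needing loops, conditionals, or pipes between commands belongs in break.sh.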

Validation

replaybook add and replaybook run both validate each scenario before anything runs.

replaybook add reports problems for every scenario in a pack without blocking the rest of it. replaybook run re-checks the single scenario it's about to launch and refuses to start if it fails.
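
The difference between the two commands can be sketched in Python as follows. The specific checks (a set of required meta.json keys, known break step kinds) are assumptions chosen for illustration, since the actual validation rules are not listed here. validate, add_pack, and run_scenario are hypothetical names.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = ("id", "title", "page", "difficulty")  # assumed, not confirmed
STEP_KINDS = {"cp", "exec", "restart"}


def validate(meta: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the scenario passed."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]
    for i, step in enumerate(meta.get("break", [])):
        if not isinstance(step, dict) or len(step) != 1 or not set(step) <= STEP_KINDS:
            problems.append(f"break[{i}]: expected exactly one of {sorted(STEP_KINDS)}")
    return problems


def add_pack(scenarios: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Like `replaybook add`: report every scenario's problems, block none."""
    return {meta.get("id", f"<scenario {i}>"): validate(meta) for i, meta in enumerate(scenarios)}


def run_scenario(meta: Dict[str, Any]) -> None:
    """Like `replaybook run`: refuse to start if validation fails."""
    problems = validate(meta)
    if problems:
        raise SystemExit("refusing to start: " + "; ".join(problems))
    # ... bring up the environment, inject the fault, start polling ...
```

The design point is that validation returns problems instead of raising, so the caller decides the policy: add_pack collects a report for the whole pack, while run_scenario turns any problem into a hard stop.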