Validator Incident Response Playbook

Validator downtime can quickly escalate into missed blocks, jail events, and potential slashing depending on chain rules. Fast response and disciplined operational procedures are essential for maintaining validator uptime and delegator trust.

This guide outlines the practical response workflow when validator alerts indicate a potential issue. All commands below are verified against a live Infinite Drive mainnet validator.

Operators commonly rely on monitoring systems to detect validator health issues quickly. FoxxOne Validator Alerts provides Telegram notifications for missed blocks, jailed status, governance proposals, and stake movement events.


Last updated: 2026-06-27

Validator Incident Response Guide

Structured response flow for live operational incidents. Read the workflow phases below first, then jump to the Terminal Runbook for verified commands.

Preventing Validator Incidents

Most validator incidents come from predictable operational faults. Baseline prevention should include:

Maintaining healthy peer connectivity.
Monitoring node sync status and block-height drift.
Tracking missed block counters continuously.
Ensuring signer process stability and key access health.
Watching disk space — full disks silently halt the node.
Keeping the system clock in tight NTP sync — Cosmos consensus rejects out-of-sync signers.
Maintaining validator infrastructure alerts as a first-line signal.

Runbook Workflow

Detection

Missed blocks or jailed validator alerts are usually the first indicator of an operational issue.

Immediate Response

Acknowledge the alert and record a timestamp.
Confirm if the issue is local (your stack) or chain-wide. Compare your local block height against a known-good RPC — if both are stuck, the chain itself has halted (very rare); if only yours is stuck, the issue is local.
Pause non-essential maintenance until validator health stabilizes.

Diagnosis

Check node logs for consensus/signing/runtime errors.
Check sync state and block-height movement (catching_up must be false before any recovery action).
Check disk space on the data partition — a full disk silently freezes the node.
Check peer connectivity and network latency.
Check signer process, key access, and service state.
Check system clock synchronisation (NTP drift greater than ~1s causes signing rejection).

Recovery

Restart services only after cause is identified — restarting without fixing the root cause leads to repeated jail cycles.
NEVER start the validator on a new machine without first stopping the old one and verifying the last signed block in priv_validator_state.json. Double-signing results in permanent tombstoning — no recovery possible.
If jailed: check tombstone status first (permanent if true), then confirm self-delegation is above the minimum, then execute the chain-specific unjail flow only when fully synced.
Monitor block signing recovery and missed-block trend after recovery.
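The ordering in the Recovery list can be sketched as a small decision helper. This is only an illustration: unjail_decision is a hypothetical function name, and it expects the literal strings "true" or "false" that the tombstone, sync, and jailed queries in the Terminal Runbook print.

```shell
# Hypothetical helper: map three observed values to the next action.
# Order follows the runbook: tombstone first, then sync state, then jailed state.
unjail_decision() {
  local tombstoned=$1 catching_up=$2 jailed=$3
  if [ "$tombstoned" = "true" ]; then
    echo "stop: tombstoned, no recovery path"
  elif [ "$catching_up" != "false" ]; then
    echo "wait: node still syncing"
  elif [ "$jailed" != "true" ]; then
    echo "skip: not jailed, monitor signing"
  else
    echo "proceed: check self-delegation, then unjail"
  fi
}

# Example with made-up inputs: synced, jailed, not tombstoned.
unjail_decision false false true
```

Anything other than an explicit "false" for catching_up is treated as "not synced", so an empty or failed query never falls through to the unjail path.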

Post-Incident Review

Document the incident and timeline in your ops log.
Update your runbook with the exact remediation sequence.
Adjust monitoring thresholds or alert routing where needed.

Terminal Runbook (Infinite Chain)

Commands verified against a live Infinite Drive mainnet validator. They assume the FoxxOne Drive deployment layout — if your infinited binary lives elsewhere, drop the ./drive.sh exec infinite wrapper and use infinited directly with the same flags.

Two chain IDs apply to Infinite Drive — don't confuse them:

Cosmos SDK chain ID (used by all infinited CLI commands): mainnet infinite_421018-1, testnet infinite_421018001-1.
EVM chain ID (used by MetaMask and EVM dApps): mainnet 421018 / 0x66c9a, testnet 421018001 / 0x19183991.

Every --chain-id flag in this guide uses the Cosmos SDK form. If you're on testnet, substitute infinite_421018001-1 everywhere it says infinite_421018-1, and use your testnet service directory.
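As a quick sanity check, the decimal and hexadecimal EVM chain IDs quoted above can be converted into each other with printf. The helper names below are illustrative only and are not part of the infinited tooling.

```shell
# Hypothetical helpers: convert between the hex and decimal EVM chain IDs.
evm_hex_to_dec() { printf '%d\n' "$1"; }
evm_dec_to_hex() { printf '0x%x\n' "$1"; }

evm_hex_to_dec 0x66c9a       # mainnet, prints 421018
evm_hex_to_dec 0x19183991    # testnet, prints 421018001
evm_dec_to_hex 421018        # prints 0x66c9a
```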

Service directory: ~/drive/services/node0-infinite
Run commands from this directory unless noted.
Keyring backend: file (prompts for passphrase on tx commands).
Ignore the "proto: duplicate proto type registered" noise. Those lines fire on every command and are harmless.
Always pass -o json to queries piped through jq (the binary defaults to YAML output otherwise).
Replace <your-key-name> with the name from infinited keys list (e.g. validator-key) and <your-valoper-address> / <your-wallet-address> with your validator's bech32 addresses.

Verify Node Sync Status

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited status -o json | jq '.sync_info'

Critical check: catching_up must be false. If true, do not attempt unjail.

Quick Sync Check

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited status -o json | jq -r '.sync_info.catching_up'

Expected output: false

Chain-vs-Local Halt Check

If your node is stuck, compare its block height to a public Infinite Drive endpoint. If both are stuck at the same height, the chain itself has halted (rare). If only yours is stuck, it's a local issue. Infinite Drive is a Cosmos-EVM chain — block heights are identical across both layers, so an EVM RPC peer works for this comparison.

cd ~/drive/services/node0-infinite

# Local Cosmos block height (decimal)
./drive.sh exec infinite infinited status -o json | jq -r '.sync_info.latest_block_height'

# Public peer block height via Infinite Drive EVM RPC (hex result, converted to decimal)
curl -sS -X POST https://evm-rpc.infinitedrive.xyz \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
  | jq -r '.result' | xargs printf "%d\n"

The two values should match within ~1-2 blocks (in-flight delay). If your local is stuck thousands of blocks behind, it's a local sync issue. If the peer is also stuck at the same height as yours, the chain itself has halted.
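The comparison above can be scripted once both heights are in hand. The sketch below assumes a hypothetical helper named height_check and a default tolerance of 2 blocks, taken from the "~1-2 blocks" guidance. It only measures lag; telling a chain-wide halt from a healthy chain still needs two samples taken some seconds apart.

```shell
# Hypothetical helper: compare local and peer heights (decimal integers).
# Third argument is the tolerated lag in blocks (default 2).
height_check() {
  local local_h=$1 peer_h=$2 tol=${3:-2}
  local lag=$(( peer_h - local_h ))
  if [ "$lag" -le "$tol" ]; then
    echo "ok lag=$lag"
  else
    echo "behind lag=$lag"
  fi
}

# Example with made-up heights:
height_check 1200000 1200001   # prints: ok lag=1
height_check 1195000 1200001   # prints: behind lag=5001
```

If two "ok" samples taken a minute apart show the same peer height, the peer itself is not advancing, which points at a chain-wide halt rather than a local fault.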

Disk Space (top cause of silent node halt)

df -h ~/drive/services/node0-infinite

Anything below ~5% free needs urgent action: prune state, restore from snapshot, or expand disk.
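For alerting scripts, the free-space percentage can be pulled out of df directly. This is a minimal sketch: disk_free_pct and disk_alert are hypothetical names, and the 5% default simply mirrors the threshold above.

```shell
# Hypothetical helper: print the free-space percentage of the filesystem holding a path.
# df -P forces the POSIX single-line layout; column 5 is the used percentage.
disk_free_pct() {
  df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print 100 - $5 }'
}

# Hypothetical helper: fail (non-zero exit) when free space is below the minimum.
disk_alert() {
  local path=$1 min_free=${2:-5} free
  free=$(disk_free_pct "$path")
  if [ "$free" -lt "$min_free" ]; then
    echo "URGENT: only ${free}% free on $path" >&2
    return 1
  fi
  echo "ok: ${free}% free on $path"
}

# Usage on the validator host:
#   disk_alert ~/drive/services/node0-infinite 5
```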

System Clock / NTP Sync

timedatectl status | grep -E "synchronized|NTP"

If "System clock synchronized: no", fix NTP before any other recovery action — Cosmos rejects signers more than ~1s out of sync.
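When the clock check needs to run inside a script, grepping human-readable output is fragile. The sketch below assumes a systemd version that provides timedatectl show with the NTPSynchronized property. ntp_gate is a hypothetical helper that takes that property's value as its argument.

```shell
# Hypothetical helper: succeed only when the reported sync state is "yes".
ntp_gate() {
  if [ "$1" = "yes" ]; then
    echo "clock synchronized"
  else
    echo "clock NOT synchronized: fix NTP before any other recovery action" >&2
    return 1
  fi
}

# Usage on the validator host (assumes systemd's timedatectl):
#   ntp_gate "$(timedatectl show -p NTPSynchronized --value)" || exit 1
```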

Get Your Validator's Consensus Address

Slashing queries take a consensus address (infinitevalcons1…), not your operator or wallet address. Get yours:

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited comet show-address

Save the output — it's used in the slashing queries below.

Alert-Specific Quick Checks

These checks map directly to common alert types and avoid duplicating the sync/unjail steps below.

Missed Blocks Signal — your validator specifically

The VALCONS= line below captures your validator's consensus address into $VALCONS so the rest of the slashing commands in this guide can reuse it. Run all three lines in one paste.

cd ~/drive/services/node0-infinite
VALCONS=$(./drive.sh exec infinite infinited comet show-address)
./drive.sh exec infinite infinited q slashing signing-info "$VALCONS" -o json | jq

Key fields: missed_blocks_counter (current count), jailed_until (timestamp when an active jail expires), and tombstoned (true = validator permanently removed — see Recovery section). When values are zero/false the SDK omits them from JSON — use // "0" or // false fallbacks in jq if you script around this.
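The fallbacks mentioned above can be combined into a one-line summary. In this sketch, signing_summary is a hypothetical helper that reads the signing-info JSON on stdin. It assumes the val_signing_info wrapper used elsewhere in this guide, and the "n/a" placeholder is an arbitrary choice.

```shell
# Hypothetical helper: summarise signing-info JSON, filling in fields the SDK omitted.
signing_summary() {
  jq -r '.val_signing_info
         | "missed=\(.missed_blocks_counter // "0") jailed_until=\(.jailed_until // "n/a") tombstoned=\(.tombstoned // false)"'
}

# Usage on a live node (after capturing $VALCONS as shown above):
#   ./drive.sh exec infinite infinited q slashing signing-info "$VALCONS" -o json | signing_summary

# Example with a made-up response in which zero/false fields are omitted:
echo '{"val_signing_info":{"address":"example"}}' | signing_summary
```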

Slashing Parameters (jail thresholds)

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited q slashing params -o json | jq

Infinite Drive mainnet values (as of 2026-06-27): signed_blocks_window 10,000, min_signed_per_window 0.05, downtime_jail_duration 10m, slash_fraction_double_sign 5%, slash_fraction_downtime 0.01%. In practice, a validator must sign at least 5% of each 10,000-block window (500 blocks), so jail triggers at about 9,500 missed blocks within a window. That is a generous downtime tolerance, but double-signing slashes 5% of stake and tombstones the validator. The 10-minute jail period is short, so quick recovery is possible once the root cause is fixed.
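The jail threshold follows directly from the two window parameters, so it can be recomputed if governance changes them. A minimal sketch, with jail_threshold and missed_budget as hypothetical helper names:

```shell
# Hypothetical helper: missed-block threshold = window - window * min_signed_per_window.
jail_threshold() {
  awk -v w="$1" -v m="$2" 'BEGIN { printf "%.0f\n", w - w * m }'
}

# Hypothetical helper: how many more blocks can be missed before reaching the threshold.
missed_budget() {
  local window=$1 min_signed=$2 missed=$3
  echo $(( $(jail_threshold "$window" "$min_signed") - missed ))
}

jail_threshold 10000 0.05       # prints 9500
missed_budget 10000 0.05 1200   # prints 8300 (1200 is a made-up counter value)
```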

Governance Voting Window

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited q gov proposals --status voting_period -o json | jq

Delegation / Stake Movement Baseline

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited q staking validator <your-valoper-address> -o json | jq '.tokens,.delegator_shares,.min_self_delegation'

Validator Jailed Recovery (Step-by-Step)

Follow this in order. Skipping a step is the #1 cause of "unjail tx fails with cryptic error" stories.

Step 0 — Tombstone Check (PERMANENT if true)

If the validator double-signed at any point, it's tombstoned and cannot ever be unjailed. Check before doing anything else. The VALCONS= line captures your valcons address, so paste the whole block in one go.

cd ~/drive/services/node0-infinite
VALCONS=$(./drive.sh exec infinite infinited comet show-address)
./drive.sh exec infinite infinited q slashing signing-info "$VALCONS" -o json \
  | jq -r '.val_signing_info.tombstoned // false'

If output is true, stop — there is no recovery path. If false, proceed.

Step 1 — Confirm Node Is Synced

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited status -o json | jq -r '.sync_info.catching_up'

Only continue if output is false.

Step 2 — Confirm Current Jailed State

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited q staking validator <your-valoper-address> -o json | jq '.jailed'

If false already, you're not jailed — skip to monitoring. If true, continue.

Step 3 — Check Self-Delegation vs Minimum

If your self-bonded stake has fallen below min_self_delegation, the unjail tx will fail with a "validator self-delegation below minimum" error. Check first.

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited q staking validator <your-valoper-address> -o json \
  | jq '.min_self_delegation,.tokens,.delegator_shares'
./drive.sh exec infinite infinited q staking delegation <your-wallet-address> <your-valoper-address> -o json \
  | jq '.balance'
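Amounts in drop carry 18 decimals, so they routinely exceed what 64-bit shell arithmetic can hold. The sketch below compares two such values as digit strings instead. big_ge is a hypothetical helper, and the example figures are made up.

```shell
# Hypothetical helper: succeed when integer string $1 >= integer string $2.
# Works on arbitrarily long non-negative integers by comparing length first,
# then byte order (valid because both strings then have the same length).
big_ge() {
  local a b smallest
  a=$(printf '%s' "$1" | sed 's/^0*//'); a=${a:-0}
  b=$(printf '%s' "$2" | sed 's/^0*//'); b=${b:-0}
  if [ "${#a}" -ne "${#b}" ]; then
    [ "${#a}" -gt "${#b}" ]
  else
    smallest=$(printf '%s\n%s\n' "$a" "$b" | LC_ALL=C sort | head -n 1)
    [ "$smallest" = "$b" ]
  fi
}

# Example: self-bond of 400 DROP against a made-up minimum of 1 DROP.
if big_ge 400000000000000000000 1000000000000000000; then
  echo "self-delegation is at or above the minimum"
else
  echo "self-delegation is BELOW the minimum: top up before unjailing"
fi
```

On a live node, the first argument would be the amount from the delegation query and the second the min_self_delegation value from the validator query.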

Step 4 — Top Up Self-Delegation (only if below minimum)

Skip if your self-delegation is already above the minimum. Otherwise delegate enough drop to push it back over.

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited tx staking delegate <your-valoper-address> 1000000000000000000drop \
  --from <your-key-name> \
  --chain-id infinite_421018-1 \
  --keyring-backend file \
  --home ~/.infinited \
  --gas auto --gas-adjustment 1.5 \
  --gas-prices 5000000000drop -y'

1000000000000000000drop = 1 × 10¹⁸ drop (one whole DROP at 18-decimal precision). Adjust the amount to whatever pushes your self-bond above min_self_delegation.
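Counting eighteen zeros by hand is error-prone. The sketch below builds the base-unit amount by string manipulation, so it is not limited by 64-bit arithmetic. to_drop is a hypothetical helper, not part of infinited.

```shell
# Hypothetical helper: convert a whole-DROP amount (up to 18 decimals) to base-unit drop.
to_drop() {
  local amount=$1 whole frac digits
  case $amount in
    ''|*[!0-9.]*|*.*.*|.*) echo "invalid amount: $amount" >&2; return 1 ;;
  esac
  whole=${amount%%.*}
  case $amount in
    *.*) frac=${amount#*.} ;;
    *)   frac= ;;
  esac
  [ "${#frac}" -le 18 ] || { echo "more than 18 decimals: $amount" >&2; return 1; }
  # Right-pad the fractional part with zeros to exactly 18 digits.
  frac=$(printf '%-18s' "$frac" | tr ' ' '0')
  # Join both parts, then strip leading zeros (keeping a single 0 for a zero amount).
  digits=$(printf '%s%s' "$whole" "$frac" | sed 's/^0*//')
  printf '%sdrop\n' "${digits:-0}"
}

to_drop 1      # prints 1000000000000000000drop
to_drop 400    # prints 400000000000000000000drop
to_drop 0.5    # prints 500000000000000000drop
```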

Step 5 — Check Wallet Balance for Tx Fees

The unjail tx fails with an "insufficient funds" error if your wallet cannot cover the fee. Confirm gas is available.

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited q bank balances <your-wallet-address> --home ~/.infinited -o json | jq'
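To judge whether a balance is enough, the upper bound on the fee can be estimated as gas limit times gas price. The sketch below uses the 5000000000drop gas price and the 1.5 gas adjustment from this guide. The 200000 gas estimate is a made-up figure, and max_fee_drop is a hypothetical helper.

```shell
# Hypothetical helper: upper-bound fee in drop for a simulated gas estimate.
# Second argument is the gas adjustment in tenths (15 = 1.5, 13 = 1.3).
max_fee_drop() {
  local gas_estimate=$1 adj_tenths=${2:-15} gas_price=5000000000
  echo $(( gas_estimate * adj_tenths / 10 * gas_price ))
}

max_fee_drop 200000      # prints 1500000000000000  (0.0015 DROP)
max_fee_drop 200000 13   # prints 1300000000000000  (0.0013 DROP)
```

The wallet needs at least this much drop on top of any amount being delegated.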

Step 6 — Execute Unjail Transaction

--from is the local key name stored in the node keyring, not the wallet address. To list keys:

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited keys list --keyring-backend file --home ~/.infinited

Use the name shown in output for --from below.

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited tx slashing unjail \
  --from <your-key-name> \
  --chain-id infinite_421018-1 \
  --keyring-backend file \
  --home ~/.infinited \
  --gas auto --gas-adjustment 1.5 \
  --gas-prices 5000000000drop -y'

Step 7 — Confirm Validator Is Active

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited q staking validator <your-valoper-address> -o json | jq '.jailed,.status'

Expect .jailed: false and .status: "BOND_STATUS_BONDED".

Step 8 — Monitor Signing Recovery

Watch missed_blocks_counter drop back toward zero over the next several block windows. Capture the valcons address once, then loop:

cd ~/drive/services/node0-infinite
VALCONS=$(./drive.sh exec infinite infinited comet show-address)
watch -n 6 "./drive.sh exec infinite infinited q slashing signing-info $VALCONS -o json | jq '.val_signing_info'"

Operational Utilities — rewards, commission, voting, balances, restart

Check Pending Rewards (delegator-side)

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited q distribution rewards <your-wallet-address> --home ~/.infinited -o json | jq'

Check Pending Commission (validator-side)

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited q distribution commission <your-valoper-address> --home ~/.infinited -o json | jq'

Withdraw All Delegation Rewards

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited tx distribution withdraw-all-rewards \
  --from <your-key-name> \
  --chain-id infinite_421018-1 \
  --keyring-backend file \
  --home ~/.infinited \
  --gas auto --gas-adjustment 1.3 --gas-prices 5000000000drop -y'

Withdraw Validator Commission

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited tx distribution withdraw-rewards <your-valoper-address> \
  --commission \
  --from <your-key-name> \
  --chain-id infinite_421018-1 \
  --keyring-backend file \
  --home ~/.infinited \
  --gas auto --gas-adjustment 1.3 --gas-prices 5000000000drop -y'

Delegate / Restake to Validator

After withdrawing rewards or commission, push those tokens back into your validator to compound your stake. Same command also works for an initial delegation or any later top-up.

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited tx staking delegate \
  <your-valoper-address> \
  400000000000000000000drop \
  --from <your-key-name> \
  --chain-id infinite_421018-1 \
  --keyring-backend file \
  --home ~/.infinited \
  --gas auto --gas-adjustment 1.3 --gas-prices 5000000000drop -y'

400000000000000000000drop = 400 × 10¹⁸ drop (400 whole DROP at 18-decimal precision). Adjust the amount to match what you actually want to delegate — check your wallet balance first (next step) and leave some headroom for future gas.

Wallet Balance Check

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited q bank balances <your-wallet-address> --home ~/.infinited -o json | jq'

Governance Vote

Replace 7 with the proposal ID you're voting on, and yes with no, abstain, or no_with_veto as needed.

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited tx gov vote 7 yes \
  --from <your-key-name> \
  --chain-id infinite_421018-1 \
  --keyring-backend file \
  --home ~/.infinited \
  --gas auto --gas-adjustment 1.3 --gas-prices 5000000000drop -y'

Emergency Node Restart

Last-resort restart when the node is unresponsive or wedged. Stops the process cleanly then starts it again.

./drive.sh exec infinite node-stop
./drive.sh exec infinite node-start

View Node Logs

cd ~/drive/services/node0-infinite
./drive.sh node-logs

Full Status & Identity Reference

Check Current Block Height

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited status -o json | jq -r '.sync_info.latest_block_height'

Check Latest Block Time

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited status -o json | jq -r '.sync_info.latest_block_time'

Confirm Chain ID

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited status -o json | jq -r '.node_info.network'

Expected output (mainnet): infinite_421018-1. Testnet: infinite_421018001-1.

View Validator Identity

cd ~/drive/services/node0-infinite
./drive.sh exec infinite infinited status -o json | jq '.validator_info'

Get Your Wallet Address from Key Name

cd ~/drive/services/node0-infinite
./drive.sh exec infinite bash -lc 'infinited keys show <your-key-name> -a --keyring-backend file --home ~/.infinited'

Operational Guidance

Unjailing without fixing root cause leads to repeated jail cycles and higher slashing risk. Common causes include sync lag, peer/network issues, process crashes, server resource exhaustion (especially disk), and NTP drift. After recovery, monitor signing behaviour and the missed-block trend for several block windows before considering the incident closed.

Double-signing prevention: if you ever move the validator to a new machine, ALWAYS stop the old one and verify the last signed height in ~/drive/services/node0-infinite/data/priv_validator_state.json before starting the new one. Two instances signing concurrently = tombstone = permanent removal.
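The height check can be scripted as a pre-start guard. This is a sketch under stated assumptions: it relies on the standard CometBFT state-file layout, where the last signed height is stored in a top-level "height" field, and both helper names are hypothetical. It does not replace confirming that the old instance is fully stopped.

```shell
# Hypothetical helper: print the last signed height from a priv_validator_state.json file.
last_signed_height() {
  jq -r '.height' "$1"
}

# Hypothetical helper: succeed only if the state on the new host is not behind
# the last height signed by the old host.
state_not_behind() {
  local old_h=$1 new_h=$2
  if [ "$new_h" -ge "$old_h" ]; then
    echo "state ok: new=$new_h old=$old_h"
  else
    echo "DANGER: new state ($new_h) is behind old ($old_h), do not start" >&2
    return 1
  fi
}

# Usage (record old_h on the old machine after stopping it, then on the new machine):
#   new_h=$(last_signed_height ~/drive/services/node0-infinite/data/priv_validator_state.json)
#   state_not_behind "$old_h" "$new_h" || exit 1
```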

Why Rapid Response Matters

Missing blocks for extended periods can lead to validator jail. Repeated incidents can result in delegator withdrawals and reduced validator reputation. Monitoring and fast operational response help minimise downtime and protect stake participation.

On Infinite Drive specifically: downtime tolerance is generous (~9,500 missed blocks within a 10,000-block window before jail, jail duration only 10 minutes), but double-signing is severe — 5% slash plus permanent tombstoning. The playbook prioritises preventing double-signs over rushing recovery.

Operational Monitoring

Operational monitoring lets validator operators detect issues early and respond before incidents escalate. FoxxOne Validator Alerts (@FoxxWatch_bot) provides Telegram notifications for missed blocks, jailed validator events, governance proposals, and stake movement monitoring.

This enables faster response when validator health changes.
