ark retry
Retry failed pipeline steps in a live cluster, preserving steps that already succeeded - full flags reference for ark retry.
Reset failed steps back to Pending so the operator re-submits them. Steps that already succeeded are left untouched.
ark retry <team> [flags]
Flags
| Flag | Default | Description |
|---|---|---|
| `-n`, `--namespace` | - | Namespace of the ArkTeam (required) |
| `--step` | - | Retry only this specific step. Defaults to all failed steps. |
| `--context` | current context | Kubernetes context to use |
Examples
```bash
# Retry all failed steps
ark retry research-pipeline -n my-org

# Retry a single step
ark retry research-pipeline -n my-org --step summarize

# Retry in a specific cluster
ark retry research-pipeline -n my-org --context prod-cluster
```
What retry does
ark retry is a surgical operation - it only touches failed steps:
- Resets `phase` of failed steps to `Pending`
- Clears `message`, `output`, and `completionTime` for those steps
- Sets `status.phase` to `Running`
Steps with phase: Succeeded are not modified. The operator picks up the change and re-submits only the reset steps on the next reconcile.
This is the right tool when a step failed due to a transient error (network timeout, rate limit, bad LLM response) and you want to resume without re-running the work that already completed.
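The reset described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the actual implementation: the `steps` list, the `name` key, and the `Failed` phase value are assumptions about the status layout, and the function name `reset_failed_steps` is hypothetical. Only the field names `phase`, `message`, `output`, `completionTime`, and the `Pending`, `Succeeded`, and `Running` values come from this page.

```python
from copy import deepcopy
from typing import Optional


def reset_failed_steps(status: dict, step: Optional[str] = None) -> dict:
    """Return a copy of `status` with failed steps reset for re-submission.

    Assumed layout (hypothetical): {"phase": ..., "steps": [{"name": ..., "phase": ...}]}
    """
    new_status = deepcopy(status)  # leave the caller's object untouched
    reset_any = False

    for s in new_status.get("steps", []):
        if s.get("phase") != "Failed":  # "Failed" is an assumed phase value
            continue  # succeeded (and other non-failed) steps are not modified
        if step is not None and s.get("name") != step:
            continue  # mirrors --step: retry only the named step

        s["phase"] = "Pending"
        for field in ("message", "output", "completionTime"):
            s.pop(field, None)  # clear results of the failed attempt
        reset_any = True

    # Assumption: the top-level phase flips to Running only if something was reset.
    if reset_any:
        new_status["phase"] = "Running"
    return new_status
```

Calling `reset_failed_steps(status)` corresponds to a plain `ark retry`, while `reset_failed_steps(status, step="summarize")` corresponds to passing `--step summarize`. How the real command treats a `--step` value that names a step which has not failed is not stated here; the sketch simply leaves such a step alone.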
Retry vs trigger
| | `ark retry` | `ark trigger` |
|---|---|---|
| Resets failed steps | ✅ | ✅ |
| Preserves succeeded steps | ✅ | ✗ |
| Accepts new `--input` | ✗ | ✅ |
| Use when | Resuming after failure | Starting fresh |
See also
- ark trigger - start a full re-run from scratch
- ark runs - view run history to identify what failed
- Step output validation - automatic retry on validation failure