Step Output Validation

Validate pipeline step outputs automatically using regex, JSON Schema, or a secondary LLM call before passing them to downstream steps.

Pipeline steps can fail silently - the agent produces an output that technically completes the task but contains hallucinated data, is in the wrong format, or is missing required content. Step output validation catches these issues before they propagate to downstream steps or surface as a corrupted final result.

Add a validate: block to any pipeline step in ArkTeam.spec.pipeline.


Validation modes

Three modes are available. They can be combined - all must pass for the step to succeed.

contains - regex match

Checks that the output contains a match for a RE2 regular expression. Useful for asserting structural requirements such as JSON keys, required phrases, or numeric patterns.

pipeline:
  - role: extractor
    inputs:
      prompt: "Extract the price from this text: {{ .input.text }}"
    validate:
      contains: '"price":\s*\d+'   # output must contain a price field
      onFailure: retry
      maxRetries: 2
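Because the pattern above is a simple one, Python's re module interprets it the same way RE2 does, so you can smoke-test it locally before wiring it into a step. This is just an illustrative check - the sample outputs below are made up, not real step outputs:

```python
import re

# Same pattern as the validate.contains example above.
# For simple patterns like this, Python's re and RE2 agree.
pattern = re.compile(r'"price":\s*\d+')

good = '{"item": "widget", "price": 19}'        # hypothetical valid output
bad = '{"item": "widget", "price": "unknown"}'  # hypothetical invalid output

print(bool(pattern.search(good)))  # True
print(bool(pattern.search(bad)))   # False
```

Note that contains performs a search, not a full match, so the pattern only needs to appear somewhere in the output.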

schema - JSON Schema

Validates that the output is valid JSON and conforms to a JSON Schema object. Use this when downstream steps consume structured data.

pipeline:
  - role: classifier
    inputs:
      prompt: |
        Classify this support ticket. Return JSON only:
        {"category": "billing|technical|other", "priority": "low|medium|high", "summary": "..."}
    validate:
      schema: |
        {
          "type": "object",
          "required": ["category", "priority", "summary"],
          "properties": {
            "category": {"type": "string", "enum": ["billing", "technical", "other"]},
            "priority": {"type": "string", "enum": ["low", "medium", "high"]},
            "summary": {"type": "string", "minLength": 1}
          }
        }
      onFailure: retry
      maxRetries: 3

When schema validation passes, the parsed JSON is also available to downstream steps as {{ .steps.<name>.data.<field> }}:

- role: router
  dependsOn: [classifier]
  inputs:
    priority: "{{ .steps.classifier.data.priority }}"
    category: "{{ .steps.classifier.data.category }}"

semantic - LLM call

Sends the step’s output to a secondary LLM call with a custom validator prompt. The validator returns pass or fail: <reason>. Use this for quality checks that rules cannot express - factual accuracy, tone, completeness.

pipeline:
  - role: writer
    inputs:
      prompt: "Write a product description for: {{ .input.product }}"
    validate:
      semantic:
        prompt: |
          Does the following product description mention the product name,
          at least one key benefit, and a call to action?
          Answer "pass" or "fail: <reason>".
        model: claude-haiku-4-5-20251001   # optional - defaults to the step's model
      onFailure: retry
      maxRetries: 2

onFailure options

retry - Reset the step to Pending and re-run it (up to maxRetries times). Useful when the agent is non-deterministic and a retry often produces a valid result.
fail - Immediately fail the step (and the run). Use this when a bad output must never reach downstream steps.

maxRetries defaults to 1. The total number of attempts is maxRetries + 1.
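The attempt accounting can be stated as a one-liner - this is illustrative arithmetic only, not the controller's code:

```python
def total_attempts(max_retries: int = 1) -> int:
    """Attempt budget for a step with onFailure: retry.

    The first run always happens; each retry adds one more,
    so the total is maxRetries + 1.
    """
    return max_retries + 1

print(total_attempts())   # 2 - the default (maxRetries: 1)
print(total_attempts(3))  # 4 - as in the schema example above
```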


Combining modes

All three modes can be set together. The step fails (or retries) if any check does not pass:

validate:
  contains: '"status":'
  schema: |
    {"type": "object", "required": ["status", "result"]}
  semantic:
    prompt: "Is the result field a meaningful non-empty response? Answer pass or fail."
  onFailure: retry
  maxRetries: 2

Observing validation results

Validation attempts and the most recent failure reason are recorded in ArkRun.status.steps[]:

kubectl get arkrun <run-name> -n my-org -o jsonpath='{.status.steps[?(@.name=="classifier")]}' | jq
{
  "name": "classifier",
  "phase": "Succeeded",
  "validationAttempts": 2,
  "validationMessage": "",
  "output": "{\"category\": \"billing\", \"priority\": \"high\", \"summary\": \"...\"}"
}

validationAttempts increments on each validation run. validationMessage holds the last failure reason and is cleared when validation passes.


Works with ark run locally

Validation runs the same way with ark run:

ark run pipeline.yaml --watch --input topic="KEDA autoscaling"

Semantic validation requires a live LLM provider. Use --provider mock to skip it during structural testing.


See also