ArkAgent

ArkAgent manages a pool of LLM agent instances with configurable models, prompts, MCP servers, token limits, and semantic health checks.

API: arkonis.dev/v1alpha1
Kind: ArkAgent
Short name: arkagent
Scope: Namespaced

You declare what an agent knows, what it can do, and what it is allowed to spend; the operator keeps that state running, healthy, and within budget.


What an agent is

An agent is a long-running process that:

  1. Reads configuration from environment variables injected by the operator
  2. Connects to configured MCP tool servers at startup
  3. Polls the task queue for work
  4. Calls the configured LLM provider with the task and available tools
  5. Runs the tool-use loop until the model stops invoking tools
  6. Returns the result to the queue

The agent binary (ark-runtime) has no Kubernetes dependencies. The same binary runs in-cluster and locally via ark run.
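The tool-use loop in step 5 can be sketched as follows. This is an illustrative sketch, not the actual ark-runtime code: `call_llm`, the message shapes, and the tool registry are hypothetical stand-ins for the runtime's internals.

```python
# Minimal sketch of the tool-use loop (step 5). call_llm(messages, tools)
# is assumed to return a dict with "content" and, when the model wants a
# tool, a "tool_calls" list of {"name", "arguments"} entries.

def run_tool_loop(call_llm, tools, task, max_rounds=10):
    """Call the model repeatedly, executing requested tools,
    until it returns a final answer with no tool calls."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_rounds):
        reply = call_llm(messages, tools)
        if not reply.get("tool_calls"):
            return reply["content"]          # model is done
        messages.append(reply)
        for call in reply["tool_calls"]:
            result = tools[call["name"]](**call["arguments"])
            messages.append({"role": "tool", "name": call["name"],
                             "content": result})
    raise RuntimeError("tool loop exceeded max_rounds")
```

The `max_rounds` guard matters in practice: without it, a model that keeps invoking tools would loop until the per-task timeout fires.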


Example

apiVersion: arkonis.dev/v1alpha1
kind: ArkAgent
metadata:
  name: research-agent
  namespace: default
spec:
  replicas: 2
  model: llama3.2
  systemPromptRef:
    configMapKeyRef:
      name: research-prompt
      key: system.txt
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse
      headers:
        Authorization:
          secretKeyRef:
            name: mcp-credentials
            key: token
  tools:
    - name: fetch_news
      description: "Fetch the latest news for a topic."
      url: http://news-api.internal/headlines
      method: POST
      inputSchema: '{"type":"object","properties":{"topic":{"type":"string"}},"required":["topic"]}'
  limits:
    maxTokensPerCall: 8000
    maxConcurrentTasks: 5
    timeoutSeconds: 120
    maxDailyTokens: 500000
  livenessProbe:
    type: semantic
    intervalSeconds: 30
    validatorPrompt: "Reply with exactly one word: HEALTHY"
  configRef:
    name: analyst-base
  memoryRef:
    name: research-memory
  notifyRef:
    name: on-degraded-slack

System prompt: inline vs. reference

Inline works for development and short prompts:

spec:
  systemPrompt: "You are a research assistant. Be thorough and cite sources."

Reference is required for production or prompts over 50 KB. The operator watches the ConfigMap or Secret and triggers a rolling restart when the content changes:

spec:
  systemPromptRef:
    configMapKeyRef:
      name: research-prompt
      key: system.txt

systemPromptRef takes precedence when both are set.
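The precedence rule amounts to a small resolver. This sketch is illustrative; `fetch_ref` stands in for whatever reads the referenced ConfigMap or Secret key:

```python
# Illustrative precedence logic: systemPromptRef wins when both are set.

def resolve_system_prompt(spec, fetch_ref):
    """Return the effective system prompt for an agent spec.

    fetch_ref(ref) is assumed to read the referenced ConfigMap/Secret key."""
    if spec.get("systemPromptRef"):
        return fetch_ref(spec["systemPromptRef"])
    if spec.get("systemPrompt"):
        return spec["systemPrompt"]
    raise ValueError("one of systemPrompt or systemPromptRef must be set")
```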


MCP servers

MCP servers extend the agent with tools at runtime. The agent connects via SSE at startup, discovers available tools, and exposes them to the LLM. Connection failures are non-fatal — the agent starts with a reduced toolset and logs the error.

Tool names are namespaced as {server-name}__{tool-name} to avoid collisions between servers.
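The namespacing scheme can be illustrated with a short helper (hypothetical; the real runtime applies the prefix during tool discovery):

```python
# Prefix discovered tool names with their server name so two servers
# can expose the same tool name without clashing.

def namespace_tools(server_name, tool_names):
    return {f"{server_name}__{t}" for t in tool_names}

# Both servers expose "search"; the merged set keeps them distinct.
merged = namespace_tools("web-search", ["search"]) | namespace_tools("docs", ["search"])
```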

Auth credentials stay in Secrets — the operator resolves them before injecting into pods:

mcpServers:
  - name: search
    url: https://search.mcp.internal/sse
    headers:
      Authorization:
        secretKeyRef:
          name: mcp-creds
          key: token

Inline webhook tools

For simple HTTP integrations, skip the MCP server entirely and define tools inline. The agent calls the URL when the LLM invokes the tool:

tools:
  - name: get_weather
    description: "Get current weather for a city."
    url: http://weather-api.internal/current
    method: GET
    inputSchema: '{"type":"object","properties":{"city":{"type":"string"}}}'

Built-in tools

Available in every agent pod regardless of MCP or webhook tool configuration:

| Tool | Description |
| --- | --- |
| submit_subtask | Enqueue a new agent task asynchronously. Returns the task ID. Enables supervisor/worker patterns without a full ArkTeam. |
| delegate | Injected in ArkTeam context only. Routes a task to a specific team role and blocks until the role returns a result. |


Daily token budget

limits.maxDailyTokens enforces a rolling 24-hour cap. Enforcement is two-layered:

  1. Agent-side — the runtime checks accumulated token usage before each LLM call and rejects tasks immediately when the budget is exhausted.
  2. Operator-side — the reconciler scales replicas to 0 as a backstop on the next reconcile cycle.

Replicas are automatically restored when the 24-hour window rotates.
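The agent-side layer can be sketched as a rolling-window counter. The bookkeeping below is illustrative, not the actual runtime data structure:

```python
import time
from collections import deque

# Illustrative rolling 24-hour token budget (agent-side layer).
class DailyTokenBudget:
    def __init__(self, max_daily_tokens, window_seconds=24 * 3600):
        self.max = max_daily_tokens      # 0 means no limit
        self.window = window_seconds
        self.events = deque()            # (timestamp, tokens) pairs

    def _prune(self, now):
        # Drop usage that has rotated out of the 24-hour window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()

    def try_reserve(self, tokens, now=None):
        """Return True if the call fits in the rolling budget; reject otherwise."""
        now = now if now is not None else time.time()
        self._prune(now)
        used = sum(t for _, t in self.events)
        if self.max and used + tokens > self.max:
            return False                 # budget exhausted: reject the task
        self.events.append((now, tokens))
        return True
```

Because pruning happens on each check, capacity returns automatically as old usage ages out, which mirrors the documented auto-resume behavior.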


Semantic health checks

Standard Kubernetes probes cannot tell whether the LLM is producing useful output. Setting livenessProbe.type: semantic enables a /readyz endpoint that calls the LLM with a validation prompt on each probe. When /readyz returns 503, ArkService stops routing tasks to that pod.
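A semantic probe handler might look like the sketch below. `call_llm` and the HEALTHY check are assumptions modeled on the example's validatorPrompt, not the actual endpoint implementation:

```python
# Illustrative semantic /readyz check: send the validator prompt and
# verify the model answers as instructed. Returns an HTTP status code.

def semantic_readyz(call_llm, validator_prompt="Reply with exactly one word: HEALTHY"):
    try:
        reply = call_llm(validator_prompt)
    except Exception:
        return 503                     # provider unreachable or timed out
    return 200 if reply.strip().upper() == "HEALTHY" else 503
```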


Spec reference

spec

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| replicas | int32 | no | 1 | Number of agent pod replicas. Range: 0–50. |
| model | string | yes | | LLM model ID. Drives provider auto-detection. |
| systemPrompt | string | one of | | Inline system prompt text. |
| systemPromptRef | SystemPromptSource | one of | | Reference to a ConfigMap or Secret key. Takes precedence over systemPrompt. |
| mcpServers | []MCPServerSpec | no | | MCP tool servers connected at pod startup. |
| tools | []WebhookToolSpec | no | | Inline HTTP webhook tools. |
| limits | ArkonisLimits | no | | Per-agent resource and token limits. |
| livenessProbe | ArkonisProbe | no | | Semantic health check configuration. |
| configRef | LocalObjectReference | no | | Name of an ArkSettings in the same namespace. |
| memoryRef | LocalObjectReference | no | | Name of an ArkMemory in the same namespace. |
| notifyRef | LocalObjectReference | no | | Name of an ArkNotify policy for AgentDegraded events. |

spec.systemPromptRef

Exactly one sub-field must be set.

| Field | Description |
| --- | --- |
| configMapKeyRef.name | ConfigMap name in the same namespace. |
| configMapKeyRef.key | Key in the ConfigMap data. |
| secretKeyRef.name | Secret name in the same namespace. |
| secretKeyRef.key | Key in the Secret data. |

spec.mcpServers[]

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Logical name. Tool names are prefixed: name__toolname. |
| url | string | yes | SSE endpoint URL. |
| headers | map[string]MCPHeaderValue | no | HTTP headers sent with every request. |

mcpServers[].headers values

| Field | Description |
| --- | --- |
| value | Literal header value. |
| secretKeyRef.name | Secret name. |
| secretKeyRef.key | Key in the Secret data. |

spec.tools[]

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| name | string | yes | | Tool identifier exposed to the LLM. |
| description | string | no | | Explains the tool's purpose to the LLM. |
| url | string | yes | | HTTP endpoint the agent calls. |
| method | string | no | POST | HTTP method: GET, POST, PUT, PATCH. |
| inputSchema | string | no | | JSON Schema (raw JSON string) for tool parameters. |

spec.limits

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| maxTokensPerCall | int | 8000 | Token budget (input + output) per LLM call. |
| maxConcurrentTasks | int | 5 | Max tasks a single pod processes simultaneously. |
| timeoutSeconds | int | 120 | Per-task deadline in seconds. |
| maxDailyTokens | int64 | 0 (no limit) | Rolling 24-hour token cap. Scales replicas to 0 when reached; auto-resumes when the window rotates. |

spec.livenessProbe

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| type | string | ping | ping: HTTP reachability only. semantic: enables LLM output validation via /readyz. |
| intervalSeconds | int | 30 | Probe interval in seconds. |
| validatorPrompt | string | (built-in) | Prompt sent during semantic validation. |

status

| Field | Type | Description |
| --- | --- | --- |
| replicas | int32 | Total pods managed by this agent. |
| readyReplicas | int32 | Pods passing liveness and readiness checks. |
| dailyTokenUsage | TokenUsage | Rolling 24-hour token usage (when limits.maxDailyTokens is set). |
| observedGeneration | int64 | The .metadata.generation this status reflects. |
| conditions | []Condition | Available, Progressing, Degraded, BudgetExceeded. |

Provider auto-detection

| Model prefix | Provider |
| --- | --- |
| claude-* | Anthropic (ANTHROPIC_API_KEY) |
| gpt-*, o1-*, o3-* | OpenAI (OPENAI_API_KEY) |
| anything else | OpenAI-compatible (OPENAI_API_KEY + OPENAI_BASE_URL) |

Override with AGENT_PROVIDER env var.
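The prefix rules above amount to a simple lookup. This sketch also honors the AGENT_PROVIDER override; the returned provider labels are illustrative, not values the runtime necessarily uses:

```python
import os

# Illustrative provider auto-detection from the model ID prefix.
def detect_provider(model, env=os.environ):
    if env.get("AGENT_PROVIDER"):
        return env["AGENT_PROVIDER"]             # explicit override wins
    if model.startswith("claude-"):
        return "anthropic"
    if model.startswith(("gpt-", "o1-", "o3-")):
        return "openai"
    return "openai-compatible"                   # also needs OPENAI_BASE_URL
```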


See also