ArkAgent
- API: arkonis.dev/v1alpha1
- Kind: ArkAgent
- Short name: arkagent
- Scope: Namespaced
ArkAgent manages a pool of LLM agent instances. You declare what an agent knows, what it can do, and what it’s allowed to spend — the operator keeps that state running, healthy, and within budget.
What an agent is
An agent is a long-running process that:
- Reads configuration from environment variables injected by the operator
- Connects to configured MCP tool servers at startup
- Polls the task queue for work
- Calls the configured LLM provider with the task and available tools
- Runs the tool-use loop until the model stops invoking tools
- Returns the result to the queue
The agent binary (ark-runtime) has no Kubernetes dependencies. The same binary runs in-cluster and locally via ark run.
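The per-task loop above can be sketched in a few lines of Python. This is an illustrative sketch, not ark-runtime's actual code: llm.chat, reply.tool_calls, and the message shapes are assumed interfaces standing in for the runtime's provider client.

```python
# Sketch of the tool-use loop described above, under assumed interfaces:
# llm.chat(messages, tools) returns a reply with .content and .tool_calls,
# and each tool call carries .name and .arguments. None of these names
# come from ark-runtime itself.
def run_task(llm, task: str, tools: dict) -> str:
    messages = [{"role": "user", "content": task}]
    while True:
        reply = llm.chat(messages, tools=tools)
        if not reply.tool_calls:          # model stopped invoking tools
            return reply.content          # final answer goes back to the queue
        for call in reply.tool_calls:     # run each requested tool locally
            result = tools[call.name](**call.arguments)
            messages.append({"role": "tool", "name": call.name,
                             "content": str(result)})
```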
Example
```yaml
apiVersion: arkonis.dev/v1alpha1
kind: ArkAgent
metadata:
  name: research-agent
  namespace: default
spec:
  replicas: 2
  model: llama3.2
  systemPromptRef:
    configMapKeyRef:
      name: research-prompt
      key: system.txt
  mcpServers:
    - name: web-search
      url: https://search.mcp.internal/sse
      headers:
        Authorization:
          secretKeyRef:
            name: mcp-credentials
            key: token
  tools:
    - name: fetch_news
      description: "Fetch the latest news for a topic."
      url: http://news-api.internal/headlines
      method: POST
      inputSchema: '{"type":"object","properties":{"topic":{"type":"string"}},"required":["topic"]}'
  limits:
    maxTokensPerCall: 8000
    maxConcurrentTasks: 5
    timeoutSeconds: 120
    maxDailyTokens: 500000
  livenessProbe:
    type: semantic
    intervalSeconds: 30
    validatorPrompt: "Reply with exactly one word: HEALTHY"
  configRef:
    name: analyst-base
  memoryRef:
    name: research-memory
  notifyRef:
    name: on-degraded-slack
```
System prompt: inline vs. reference
Inline works for development and short prompts:
```yaml
spec:
  systemPrompt: "You are a research assistant. Be thorough and cite sources."
```
A reference is required for production use or for prompts over 50 KB. The operator watches the referenced ConfigMap or Secret and triggers a rolling restart when the content changes:
```yaml
spec:
  systemPromptRef:
    configMapKeyRef:
      name: research-prompt
      key: system.txt
```
systemPromptRef takes precedence when both are set.
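A minimal ConfigMap matching the reference above looks like this (the prompt text is illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: research-prompt
  namespace: default
data:
  system.txt: |
    You are a research assistant. Be thorough and cite sources.
```

Editing this ConfigMap is enough to roll the agent pods; no change to the ArkAgent object is needed.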
MCP servers
MCP servers extend the agent with tools at runtime. The agent connects via SSE at startup, discovers available tools, and exposes them to the LLM. Connection failures are non-fatal — the agent starts with a reduced toolset and logs the error.
Tool names are namespaced as {server-name}__{tool-name} to avoid collisions between servers.
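The prefixing scheme can be illustrated with two small helpers. This is a sketch, not ark-runtime code; splitting on the first double underscore assumes server names themselves avoid __.

```python
def namespace_tool(server: str, tool: str) -> str:
    """Qualify a tool name with its server,
    e.g. web-search + fetch_page -> "web-search__fetch_page"."""
    return f"{server}__{tool}"

def split_tool(qualified: str) -> tuple:
    """Invert the prefixing by splitting on the first "__"."""
    server, _, tool = qualified.partition("__")
    return server, tool
```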
Auth credentials stay in Secrets — the operator resolves them before injecting into pods:
```yaml
mcpServers:
  - name: search
    url: https://search.mcp.internal/sse
    headers:
      Authorization:
        secretKeyRef:
          name: mcp-creds
          key: token
```
Inline webhook tools
For simple HTTP integrations, skip the MCP server entirely and define tools inline. The agent calls the URL when the LLM invokes the tool:
```yaml
tools:
  - name: get_weather
    description: "Get current weather for a city."
    url: http://weather-api.internal/current
    method: GET
    inputSchema: '{"type":"object","properties":{"city":{"type":"string"}}}'
```
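Exactly how the runtime maps a tool invocation onto an HTTP request is not specified here. One plausible sketch, with GET arguments encoded as query parameters and other methods sending a JSON body, looks like this (build_webhook_request is hypothetical, not part of ark-runtime):

```python
import json
from urllib.parse import urlencode

def build_webhook_request(tool: dict, args: dict):
    """Hypothetical mapping from an inline tool spec plus LLM-supplied
    arguments to (method, url, body). GET encodes args as query
    parameters; other methods send them as a JSON body."""
    method = tool.get("method", "POST").upper()   # POST is the spec default
    if method == "GET":
        return method, f"{tool['url']}?{urlencode(args)}", None
    return method, tool["url"], json.dumps(args).encode()
```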
Built-in tools
Built into the agent runtime, with no MCP or webhook tool configuration required (availability is noted per tool):
| Tool | Description |
|---|---|
| submit_subtask | Enqueue a new agent task asynchronously. Returns the task ID. Enables supervisor/worker patterns without a full ArkTeam. |
| delegate | Injected in ArkTeam context only. Routes a task to a specific team role and blocks until the role returns a result. |
Daily token budget
limits.maxDailyTokens enforces a rolling 24-hour cap. Enforcement is two-layered:
- Agent-side — the runtime checks accumulated token usage before each LLM call and rejects tasks immediately when the budget is exhausted.
- Operator-side — the reconciler scales replicas to 0 as a backstop on the next reconcile cycle.
Replicas are automatically restored when the 24-hour window rotates.
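The agent-side check can be sketched as a rolling window of usage records. This is an illustration of the behavior described above, not ark-runtime's actual implementation, and the class name is hypothetical:

```python
import time
from collections import deque

class DailyTokenBudget:
    """Sketch of the agent-side budget check: usage records inside a
    rolling 24-hour window are summed and compared against
    maxDailyTokens before each LLM call."""

    WINDOW_SECONDS = 24 * 3600

    def __init__(self, max_daily_tokens):
        self.max_daily_tokens = max_daily_tokens
        self.records = deque()  # (timestamp, tokens) pairs, oldest first

    def record(self, tokens, now=None):
        self.records.append((now if now is not None else time.time(), tokens))

    def allows_call(self, now=None):
        if self.max_daily_tokens <= 0:  # 0 means no limit, per the spec table
            return True
        now = now if now is not None else time.time()
        while self.records and self.records[0][0] <= now - self.WINDOW_SECONDS:
            self.records.popleft()  # usage outside the window rotates out
        return sum(t for _, t in self.records) < self.max_daily_tokens
```

The window "rotating" is simply old records aging out: once enough usage falls outside the trailing 24 hours, calls are allowed again and the operator restores replicas.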
Semantic health checks
Standard Kubernetes probes cannot tell whether the LLM is producing useful output. Setting livenessProbe.type: semantic enables a /readyz endpoint that calls the LLM with a validation prompt on each probe. When /readyz returns 503, ArkService stops routing tasks to that pod.
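The pass/fail decision can be sketched as follows. llm_call is a hypothetical helper standing in for the runtime's provider client, and the exact pass criterion used by ark-runtime is not documented here; this sketch assumes a strict match on the expected word:

```python
def semantic_readyz(llm_call,
                    validator_prompt="Reply with exactly one word: HEALTHY"):
    """Return the HTTP status for /readyz: 200 when the model follows
    the validator prompt, 503 otherwise. A provider error also counts
    as 503, so ArkService stops routing tasks to the pod."""
    try:
        reply = llm_call(validator_prompt)
    except Exception:
        return 503  # unreachable or erroring provider means not ready
    return 200 if reply.strip().upper() == "HEALTHY" else 503
```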
Spec reference
spec
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| replicas | int32 | no | 1 | Number of agent pod replicas. Range: 0–50. |
| model | string | yes | — | LLM model ID. Drives provider auto-detection. |
| systemPrompt | string | one of | — | Inline system prompt text. |
| systemPromptRef | SystemPromptSource | one of | — | Reference to a ConfigMap or Secret key. Takes precedence over systemPrompt. |
| mcpServers | []MCPServerSpec | no | — | MCP tool servers connected at pod startup. |
| tools | []WebhookToolSpec | no | — | Inline HTTP webhook tools. |
| limits | ArkonisLimits | no | — | Per-agent resource and token limits. |
| livenessProbe | ArkonisProbe | no | — | Semantic health check configuration. |
| configRef | LocalObjectReference | no | — | Name of an ArkSettings in the same namespace. |
| memoryRef | LocalObjectReference | no | — | Name of an ArkMemory in the same namespace. |
| notifyRef | LocalObjectReference | no | — | Name of an ArkNotify policy for AgentDegraded events. |
spec.systemPromptRef
Exactly one sub-field must be set.
| Field | Description |
|---|---|
| configMapKeyRef.name | ConfigMap name in the same namespace. |
| configMapKeyRef.key | Key in the ConfigMap data. |
| secretKeyRef.name | Secret name in the same namespace. |
| secretKeyRef.key | Key in the Secret data. |
spec.mcpServers[]
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Logical name. Tool names are prefixed: name__toolname. |
| url | string | yes | SSE endpoint URL. |
| headers | map[string]MCPHeaderValue | no | HTTP headers sent with every request. |
mcpServers[].headers values
| Field | Description |
|---|---|
| value | Literal header value. |
| secretKeyRef.name | Secret name. |
| secretKeyRef.key | Key in the Secret data. |
spec.tools[]
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| name | string | yes | — | Tool identifier exposed to the LLM. |
| description | string | no | — | Explains the tool’s purpose to the LLM. |
| url | string | yes | — | HTTP endpoint the agent calls. |
| method | string | no | POST | HTTP method: GET, POST, PUT, PATCH. |
| inputSchema | string | no | — | JSON Schema (raw JSON string) for tool parameters. |
spec.limits
| Field | Type | Default | Description |
|---|---|---|---|
| maxTokensPerCall | int | 8000 | Token budget (input + output) per LLM call. |
| maxConcurrentTasks | int | 5 | Max tasks a single pod processes simultaneously. |
| timeoutSeconds | int | 120 | Per-task deadline in seconds. |
| maxDailyTokens | int64 | 0 (no limit) | Rolling 24-hour token cap. Scales replicas to 0 when reached; auto-resumes when the window rotates. |
spec.livenessProbe
| Field | Type | Default | Description |
|---|---|---|---|
| type | string | — | ping — HTTP reachability only. semantic — enables LLM output validation via /readyz. |
| intervalSeconds | int | 30 | Probe interval in seconds. |
| validatorPrompt | string | (built-in) | Prompt sent during semantic validation. |
status
| Field | Type | Description |
|---|---|---|
| replicas | int32 | Total pods managed by this agent. |
| readyReplicas | int32 | Pods passing liveness and readiness checks. |
| dailyTokenUsage | TokenUsage | Rolling 24-hour token usage (when limits.maxDailyTokens is set). |
| observedGeneration | int64 | The .metadata.generation this status reflects. |
| conditions | []Condition | Available, Progressing, Degraded, BudgetExceeded. |
Provider auto-detection
| Model prefix | Provider |
|---|---|
| claude-* | Anthropic (ANTHROPIC_API_KEY) |
| gpt-*, o1-*, o3-* | OpenAI (OPENAI_API_KEY) |
| anything else | OpenAI-compatible (OPENAI_API_KEY + OPENAI_BASE_URL) |
Override with AGENT_PROVIDER env var.
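A sketch of the prefix rules in the table above, with the AGENT_PROVIDER override applied first. The provider identifier strings returned here are illustrative, not values documented for ark-runtime:

```python
import os

def detect_provider(model: str) -> str:
    """Reproduce the prefix-based auto-detection table; AGENT_PROVIDER,
    when set, overrides the result."""
    override = os.environ.get("AGENT_PROVIDER")
    if override:
        return override
    if model.startswith("claude-"):
        return "anthropic"
    if model.startswith(("gpt-", "o1-", "o3-")):
        return "openai"
    return "openai-compatible"  # anything else: OPENAI_API_KEY + OPENAI_BASE_URL
```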
See also
- ArkSettings — spec.configRef
- ArkMemory — spec.memoryRef
- ArkNotify — spec.notifyRef
- Environment Variables — all agent pod env vars
- Helm Values — agentExtraEnv, apiKeys.*