Agent Control Plane Features¶
Xians is the open source Agent Control Plane (ACP) for production AI agents. It sits alongside your agent framework - not replacing it. It handles everything that becomes painful once agents graduate from demos to production: multi-tenant governance, business process orchestration, scalability, monitoring, and data management.
When you register an agent (built with any framework like Azure AI Projects, LangChain, Semantic Kernel, or OpenAI SDK) with Xians, it gains enterprise-grade control plane capabilities organized around five core functions:
1. Governance & Multi-Tenancy¶
- Agent Registry - Centralized registration, versioning, and lifecycle management
- Multi-Tenancy - Complete tenant isolation with centralized multi-user coordination
- Template-Based Deployment - Rollout agents to multiple tenants from a single control point
2. Business Process Automation¶
- Durable Workflows - Fault-tolerant processes that survive failures and span days, months, or years
- Scheduling - Time-based automation with cron, intervals, and calendar schedules
- Human-in-the-Loop - Automated workflows that pause for human review and approval
- Sub-Workflows - Composable, reusable workflow components
- Webhooks - Event-driven triggers for business process automation
3. Knowledge & Data Management¶
- Prompt Management - Centralized prompt and knowledge storage accessible from code and UI
- Document Storage - Tenant-scoped persistent storage for agent state and memory
- Conversation History - Hierarchical message organization with complete context preservation
4. Visibility & Monitoring¶
- Observability - Real-time logs, distributed tracing, and audit trails
- Performance Metrics - Response times, throughput, success rates, and bottleneck detection
- Cost Tracking - Token usage and API call monitoring across all agents and tenants
5. Scalability & Resilience¶
- Horizontal Scaling - Add agent workers dynamically with automatic load distribution
- Subnet Isolation - Workers run with no incoming ports, only outbound connections
- Fault Tolerance - Automatic retries, timeouts, and failure recovery
- Framework Agnostic - Mix agents built on different stacks in the same system
No changes to your agent's code. Because Xians is framework-agnostic, you can mix agents built on different stacks in the same system, all governed by a unified control plane.
Agent Registry¶
Part of: Governance & Multi-Tenancy
The control plane's agent registry provides centralized lifecycle management and coordination for your entire fleet.
Build with any framework - Microsoft Agent Framework, LangChain, Semantic Kernel, or raw OpenAI SDK - and register it with Xians. Each agent receives a unique identity and automatic integration with the control plane's governance, monitoring, and orchestration capabilities.
Key capabilities:
- Framework-agnostic: Bring your own agent implementation - the control plane doesn't care
- Centralized registry: Single source of truth for all agents across all tenants
- Version control: Track agent versions, configurations, and deployment history
- Lifecycle management: Publish, deploy, update, rollback, and decommission agents from one control point
Template-Based Deployment¶
Part of: Governance & Multi-Tenancy
Publish agents as reusable templates and rollout to multiple tenants with consistent configurations from a centralized control point.
Key capabilities:
- Agent templates: Define once, deploy to many tenants
- Configuration per tenant: Same agent, different prompts and configurations per tenant
- Centralized updates: Update templates and propagate changes across all deployments
- Deployment tracking: Monitor which tenants have which agent versions
- Rollback capability: Revert to previous versions if issues arise
Multi-Tenancy¶
Part of: Governance & Multi-Tenancy
The control plane provides complete isolation of agents, workflows, data, and conversations across tenants while sharing infrastructure efficiently. Coordinate multi-user operations within tenants and enforce governance across your entire organization. Two deployment models:
| Deployment Model | What It Means | When to Use |
|---|---|---|
| System-Scoped | One runtime serves all tenants | Common agents rolled out to multiple tenants |
| Tenant-Scoped | Dedicated runtime per tenant | Custom logic, dedicated resources, specialized agents |
Key capabilities:
- Tenant isolation: Each tenant's workflows, data, and conversations are completely separated
- Multi-user coordination: Multiple users within a tenant can interact with agents simultaneously
- Resource governance: Implement per-tenant quotas and rate limits
- Custom configurations: Same agent using different prompts or configurations per tenant
- Centralized management: Manage all tenants from a single control plane instance
Deploy system agents for common use cases, create tenant-scoped agents for custom requirements or resource isolation.
Agent-User Collaboration¶
Part of: Knowledge & Data Management
The control plane provides rich, asynchronous messaging between users and agents with persistent conversation history and hierarchical context management.
Messages aren't just text - they're the conversational memory that makes agents intelligent and contextual. Each message thread maintains state across sessions, enabling multi-turn dialogues where agents remember context, preferences, and history.
Xians provides a sophisticated message hierarchy with complete isolation at every level, allowing agents and users to scope conversations for their specific work:
graph TD
T[Tenant] -->|has many| A[Agents]
A -->|has many| W[Workflows]
W -->|has many| TH[Conversation Threads]
TH -->|has many| S[Topics/Scopes]
S -->|has many| M[Messages]
style T fill:#eea52d,stroke:#333,stroke-width:2px,color:#1b1f2f
style A fill:#538cfc,stroke:#333,stroke-width:2px,color:#fff
style W fill:#41c18a,stroke:#333,stroke-width:2px,color:#fff
style TH fill:#9b59b6,stroke:#333,stroke-width:2px,color:#fff
style S fill:#e74c3c,stroke:#333,stroke-width:2px,color:#fff
style M fill:#95a5a6,stroke:#333,stroke-width:2px,color:#fff This hierarchy enables powerful conversation organization: a single tenant can have multiple agents, each with different workflows handling various interactions. Within each workflow, users can maintain separate conversation threads, and even within a thread, organize messages by topic for cleaner context management
Key capabilities:
- Conversation hierachy: Automatic context preservation across sessions
- Multiple transports: WebSocket, Server-Sent Events (SSE), REST APIs
- Rich messages: Text, structured data, and HITL Tasks
- Security: Message encryption, EU AI Act compliant
- Authentication: API keys or OIDC/OAuth 2.0 integration
Users can interact with agents across different conversation threads, with full history and context automatically managed.
Agent-Agent Collaboration¶
Part of: Business Process Automation
The control plane orchestrates multiple agents working together to solve complex business problems through the Agent-to-Agent (A2A) protocol and workflow coordination.
Complex agentic systems often require multiple specialized agents collaborating as a team - one agent conversing with users, another analyzing data, one searching and reading the web, another making decisions. Xians enables sophisticated multi-agent architectures through two key mechanisms:
Multi-Workflow Agents: A single agent can contain multiple specialized workflows, each handling a specific responsibility (conversation, research, analysis, decision-making). These workflows operate as a coordinated team behind a unified agent interface.
Agent-to-Agent Protocol (A2A): Agents communicate with each other using Xians' A2A protocol SDK, which provides in-process communication for speed and resource optimization. Messages are routed, context is shared, and results are aggregated automatically.
graph TB
U[User] -->|Message| A1[Customer Support Agent]
subgraph A1_Team[Customer Support Agent - Multi-Workflow]
A1W1[Conversation Workflow]
A1W2[Analysis Workflow]
A1W3[Action Workflow]
end
A1 --> A1W1
A1W1 -->|A2A Protocol| A1W2
A1W2 -->|A2A Protocol| A1W3
A1W3 -->|A2A Protocol| A2[Knowledge Agent]
A1W3 -->|A2A Protocol| A3[Data Agent]
subgraph A2_Team[Knowledge Agent - Multi-Workflow]
A2W1[Search Workflow]
A2W2[RAG Workflow]
end
A2 --> A2W1
A2W1 --> A2W2
A2W2 -->|Results| A1W3
A3 -->|Results| A1W3
A1W3 --> A1W1
A1W1 -->|Response| U
style A1 fill:#538cfc,stroke:#333,stroke-width:3px,color:#fff
style A2 fill:#41c18a,stroke:#333,stroke-width:3px,color:#fff
style A3 fill:#e74c3c,stroke:#333,stroke-width:3px,color:#fff
style A1W1 fill:#7fb3ff,stroke:#333,stroke-width:1px,color:#000
style A1W2 fill:#7fb3ff,stroke:#333,stroke-width:1px,color:#000
style A1W3 fill:#7fb3ff,stroke:#333,stroke-width:1px,color:#000
style A2W1 fill:#6dd5a5,stroke:#333,stroke-width:1px,color:#000
style A2W2 fill:#6dd5a5,stroke:#333,stroke-width:1px,color:#000 This architecture enables building sophisticated agent teams where each agent specializes in a domain (customer support, data analysis, web research) and each workflow within an agent handles a specific task type.
Human-in-the-Loop¶
Part of: Business Process Automation
The control plane enables automated business workflows that pause for hours, days, or weeks waiting for human input, then automatically resume.
sequenceDiagram
participant W as Agent Workflow
participant T as Task
participant H as Human Reviewer
W->>T: Create Task with Draft
T->>H: Assigned To
Note over W: Workflow pauses<br/>(can wait hours/days/weeks)
H->>T: Review & Edit Draft in Task
H->>T: Approve/Reject Decision
T->>W: Resume with Feedback
Note over W: Workflow continues
W->>W: Process Next Steps Key capabilities:
- Task creation: Agents create tasks requiring human judgment
- Draft review: Human reviewers edit and approve agent outputs
- Flexible timing: Workflows can wait indefinitely for human input
- Approval chains: Multi-step review processes with multiple tasks
- Audit trails: Complete history of reviews and decisions
Agents create tasks, attach drafts, and wait for approval. Humans review, edit, approve, or reject. Workflow continues with the feedback. Everything tracked in the audit trail.
Long-Running Workflows¶
Part of: Business Process Automation
Built on Temporal, the control plane orchestrates fault-tolerant, durable workflows that handle complex business processes spanning days, months, or years with automatic state management and recovery.
Traditional automation breaks on long-running processes. Xians workflows are durable, maintaining state across restarts, failures, and deployments. A customer onboarding workflow can span weeks; an annual compliance workflow runs for months; a scheduled report runs every day for years - all reliably managed.
Key capabilities:
- Fault tolerance: Workflows survive agent failures, infrastructure restarts, and network outages
- Automatic retries: Transient failures are automatically retried with configurable policies
- State persistence: Maintain complete context across distributed operations and long time spans
- Composition: Orchestrate complex multi-step business processes with sub-workflows
- Timeout handling: Configure timeouts at every step to handle stuck processes
Each agent can have multiple workflows for different business processes: conversations, scheduled tasks, event handlers, or custom workflows. The "Default Workflow" gives you all platform functions out-of-the-box.
Fault Tolerance¶
Part of: Scalability & Resilience
The control plane's fault tolerance ensures business processes continue even when individual components fail.
Key capabilities:
- Automatic retries: Failed operations are retried automatically with exponential backoff
- Timeout policies: Prevent workflows from hanging indefinitely with configurable timeouts
- Graceful degradation: Workflows continue even when external services are temporarily unavailable
- State recovery: Workflow state is persisted and recovered automatically after crashes
- Partial failure handling: Handle failures in sub-workflows without failing the entire process
Scheduling¶
Part of: Business Process Automation
The control plane provides time-based business process automation with a modern, fluent API that works the same everywhere - in regular code, agent tools, and even inside workflows themselves.
Time-based automation enables autonomous business processes - generating reports at 9 AM, processing overnight data, sending weekly summaries, running monthly analytics - without manual triggers.
Key capabilities:
- Flexible scheduling: Cron expressions, intervals, daily/weekly/monthly helpers, or one-time calendar schedules
- Timezone support: Schedule in any timezone using IANA timezone database
- Workflow-aware: Same API works both inside and outside workflows with automatic determinism
- Full lifecycle management: Create, pause, resume, trigger, update, and delete schedules programmatically
- Dynamic creation: Agents create schedules based on conversations or business logic
- Multi-tenant isolation: Automatic tenant scoping and isolation
- Durable execution: Schedules survive restarts and system failures
Manage everything programmatically via SDK or through the UI. Built on Temporal's durable execution for reliability.
Webhooks¶
Part of: Business Process Automation
Event-driven triggers enable agents to respond to external events and integrate into broader business process automation.
Key capabilities:
- Inbound webhooks: Receive events from external systems to trigger agent workflows
- Outbound webhooks: Send agent events and results to external systems for process continuation
- Cross-platform integration: Connect agents with Microsoft 365, Kubernetes, third-party tools, and custom applications
- Event routing: Route events to the appropriate agents and workflows based on content and metadata
- Async processing: Webhook requests are queued and processed asynchronously for scalability
Document Storage¶
Part of: Knowledge & Data Management
The control plane provides persistent, tenant-scoped JSON document storage for agent state, memory, and data. Save user preferences, conversation memory, session state, analytics - any data your agents need to remember across workflow executions.
Agents need structured memory beyond conversations. Document storage provides flexible JSON persistence with semantic keys, metadata filtering, and automatic cleanup - without database complexity.
Key capabilities:
- JSON storage: Store any JSON-serializable object with no schema constraints
- Semantic keys: Use meaningful identifiers like "user-123-preferences" instead of random IDs
- Metadata filtering: Query by type, metadata fields, and date ranges
- TTL support: Auto-delete documents after expiration (sessions, caches, temporary data)
- Dual access: Available at agent-level and from within workflow contexts
- Tenant isolation: Automatic data separation per tenant - no cross-tenant data leakage
Common use cases: conversation memory, user preferences, session state, analytics tracking, API response caching, business process state. Simple, fast, and purpose-built for agent workflows.
Configuration Management¶
Part of: Knowledge & Data Management
Centralized prompt and knowledge management through the control plane. Both agents (via code) and humans (via UI) can read and write the same knowledge, enabling no-code updates to agent behavior.
Also known as Knowledge Management or Prompt Management.
Agents need more than just code - they need prompts, instructions, configs, and reference data that can be updated without redeployment. The control plane provides a centralized knowledge store accessible to both code and humans.
Key capabilities:
- Dual access: Agents use SDK methods, humans use UI portal - same data, synchronized
- Automatic scoping: Per-agent and per-tenant isolation
- Multiple content types: AI prompts, instructions, JSON configs, markdown docs, preferences
- Version tracking: Track changes to prompts and configurations over time
- Simple CRUD: Get, update, delete, and list operations via SDK or UI
- Fast retrieval: Automatic caching for performance
- No schema constraints: Store any text content
Common uses: AI prompts editable via UI, user preferences, feature flags, instructions, API configurations, templates, FAQ content. Update agent behavior without redeployment by changing prompts in the UI.
Sub-Workflows¶
Part of: Business Process Automation
The control plane supports composable workflow components that can be reused across different parent workflows to build modular business processes.
Complex business processes benefit from modularity. Sub-workflows are reusable building blocks - a "send email" sub-workflow, a "verify identity" sub-workflow, a "generate report" sub-workflow - that compose into larger processes.
Key capabilities:
- Reusability: Define once, use across multiple parent workflows and business processes
- Composition: Nest workflows for clean, maintainable architecture
- Independent scaling: Sub-workflows can have different worker pools for resource optimization
- Isolated testing: Test sub-workflows independently before composing
- Failure isolation: Sub-workflow failures can be handled without cascading to parent workflow
Build a library of business process components. Compose them into sophisticated multi-step processes. Maintain and test each piece separately.
Observability¶
Part of: Visibility & Monitoring
The control plane provides comprehensive logs, distributed tracing, and audit trails to monitor agent operations and debug issues across your entire fleet.
Key capabilities:
- Structured logs: Auto-captured with stack traces, searchable by agent/tenant/workflow/time in the UI
- Distributed tracing: OpenTelemetry support with correlation IDs to trace requests across agents and workflows
- Workflow history: Complete execution history for every workflow with state transitions
- Audit trails: Immutable execution history - every workflow, action, and decision timestamped
- Real-time monitoring: View logs and traces in real-time as workflows execute
- Integration: Export to Datadog, New Relic, Grafana, and other observability platforms
Debug complex multi-agent workflows by tracing requests across the entire system. View complete execution history to understand what happened and when.
Metrics & Usage Tracking¶
Part of: Visibility & Monitoring
The control plane provides comprehensive metrics tracking for everything your agents do - from LLM token usage to business outcomes - with zero configuration required.
Track agent work across technical (tokens, API calls), business (approvals, documents), and operational (HITL tasks) layers. The metrics system auto-captures context (tenant, user, workflow) so you focus on tracking what matters to your business.
Key capabilities:
- Automatic context: Every metric includes tenant, user, workflow, and agent attribution automatically
- Flexible tracking: Track any metric with any label - tokens, business outcomes, performance data
- Universal API: Same metrics API works in workflows, activities, and message handlers
- Smart routing: A2A-aware and workflow-aware with automatic determinism handling
- Custom correlation: Link metrics to your external systems with custom identifiers
Common patterns:
// Track LLM usage
await context.Metrics
.ForModel("gpt-4")
.WithMetrics(
("tokens", "prompt", 45, "tokens"),
("tokens", "completion", 105, "tokens")
)
.ReportAsync();
// Track business outcomes
await context.Metrics
.WithMetrics(
("approvals", "submitted", 1, "count"),
("documents", "generated", 1, "count"),
("emails", "sent", 3, "count")
)
.ReportAsync();
→ Complete Metrics Guide for detailed usage patterns and advanced features.
Performance Metrics¶
Part of: Visibility & Monitoring
Track operational performance in real-time to identify bottlenecks, optimize resource usage, and ensure SLAs are met.
| What We Track | Why |
|---|---|
| Response times, latency | Find bottlenecks and optimize performance |
| Throughput (workflows/sec) | Monitor load and capacity planning |
| Success/failure rates | Catch issues and track reliability |
| Queue depths | Detect backlog and scaling needs |
| Worker utilization | Optimize worker pool sizes |
Key capabilities:
- Real-time dashboards: Visualize performance across all agents and tenants
- Anomaly detection: Identify performance degradation and failures automatically
- SLA monitoring: Track agent reliability against service level objectives
- Bottleneck identification: Find slow workflows and optimize them
Cost Tracking¶
Part of: Visibility & Monitoring
The control plane tracks resource consumption and costs across your agent fleet in real-time, enabling cost optimization and chargeback.
| What We Track | Why |
|---|---|
| Token usage per agent/tenant | Control LLM costs and attribute to customers |
| API calls and volumes | Monitor consumption patterns |
| Cost per workflow | Identify expensive operations |
| Resource utilization | Optimize infrastructure costs |
Key capabilities:
- Per-tenant tracking: Attribute costs to specific tenants for chargeback and budgeting
- Budget alerts: Set spending limits and receive alerts when exceeded (planned)
- Cost optimization: Identify expensive agents and workflows for optimization
- Usage trends: Analyze cost trends over time to forecast spending
Horizontal Scaling¶
Part of: Scalability & Resilience
The control plane enables horizontal scaling by distributing work across multiple agent worker containers automatically.
Key capabilities:
- Dynamic worker pools: Add or remove worker containers on demand - work is automatically distributed
- Automatic load balancing: Temporal distributes tasks across available workers based on capacity
- No manual configuration: Workers auto-register when they start - no service discovery needed
- Per-tenant scaling: Scale different worker pools for different tenants or agent types
- Resource optimization: Scale workers independently from the control plane server
Simply deploy more worker containers and watch throughput increase linearly. Scale down during off-hours to save costs.
Subnet Isolation¶
Part of: Scalability & Resilience
Agent workers can run in private subnets with no incoming ports, only making outbound connections to pull tasks from the control plane.
Key capabilities:
- No incoming ports: Workers never accept inbound connections - they only pull tasks via outbound connections
- Simplified security: No need to expose agent workers to the internet or configure inbound firewall rules
- Pull-based architecture: Workers poll the Temporal queue for tasks, eliminating the need for service discovery
- Network isolation: Run workers in isolated subnets or VPCs with restricted network access
- Defense in depth: Reduced attack surface since workers can't be directly accessed
This architecture is ideal for security-conscious deployments where agents need to be isolated from external access.
Framework Agnostic¶
Part of: Scalability & Resilience
The control plane works with any agent framework or implementation approach, enabling you to choose the best tool for each job.
Key capabilities:
- No framework lock-in: Use Microsoft Agent Framework, LangChain, Semantic Kernel, OpenAI SDK, or custom implementations
- Mix and match: Run agents built on different frameworks within the same control plane
- Bring your own LLM: Use OpenAI, Azure OpenAI, Anthropic, local models, or any LLM provider
- Language support: Currently supports .NET agents, with Python and TypeScript SDKs planned
- Migration friendly: Migrate from one framework to another without changing the control plane
Build each agent with the framework that makes the most sense, then govern them all through a unified control plane.
The Control Plane Advantage¶
Xians is the Agent Control Plane that transforms AI agents from demos to production-grade systems. Keep your agent code focused on AI logic. Let the control plane handle multi-tenant governance, business process orchestration, horizontal scalability, monitoring, and data management.
Not another agent framework. The open source control plane that provides centralized, proactive control over your entire agent fleet - enabling reliable, scalable, and observable deployment of AI-powered business processes across your organization.