# Chat Completion
The Xians platform provides a simple chat completion capability through the static `SemanticRouterHub.ChatCompletionAsync` method. It lets you send a prompt directly to an LLM, optionally guided by a system instruction, and receive a response without the complexity of chat history or function calling.
## Overview
The `ChatCompletionAsync` method leverages Microsoft's SemanticKernel framework to provide straightforward LLM interactions. It supports an optional system instruction to guide the AI's behavior, but unlike the full routing capabilities it includes no chat history or function calling; it focuses on simple prompt-to-response interactions.
## Usage
```csharp
var response = await SemanticRouterHub.ChatCompletionAsync(prompt, systemInstruction, routerOptions);
```
### Parameters
- `prompt` (string): The direct prompt to send to the LLM.
- `systemInstruction` (string, optional): A system instruction to guide the AI's behavior. Defaults to an empty string.
- `routerOptions` (`RouterOptions`, optional): Configuration options for the completion.
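Putting the pieces together, the method's shape is presumably along these lines. This is a sketch inferred from the parameter list above, not the exact declaration:

```csharp
// Presumed signature, inferred from the documented parameters.
// The return type is assumed to be the response text as a string;
// consult SemanticRouterHub for the actual declaration.
public static Task<string> ChatCompletionAsync(
    string prompt,
    string systemInstruction = "",
    RouterOptions? routerOptions = null);
```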
## Configuration Options
The `RouterOptions` class allows you to configure the LLM behavior:
```csharp
var routerOptions = new RouterOptions
{
    ProviderName = "openai",           // or "azureopenai"
    ModelName = "gpt-4",
    ApiKey = "your-api-key",
    Temperature = 0.3,                 // Controls randomness (default: 0.3)
    MaxTokens = 1000,                  // Maximum response length (default: 10000)
    HTTPTimeoutSeconds = 300,          // HTTP timeout (default: 5 minutes)
    HistorySizeToFetch = 10,           // History size to fetch (default: 10)
    WelcomeMessage = "Hello!",         // Welcome message for empty prompts

    // Token limiting features (prevents context_length_exceeded errors)
    TokenLimit = 80000,                // Trigger reduction at 80k tokens (default: 80000)
    TargetTokenCount = 50000,          // Reduce to 50k tokens (default: 50000)
    MaxTokensPerFunctionResult = 10000 // Limit large function results (default: 10000)
};
```
### Available Options
| Property | Description | Default |
|---|---|---|
| `ProviderName` | LLM provider (`"openai"`, `"azureopenai"`) | From environment/settings |
| `ModelName` | Model to use (e.g., `"gpt-4"`, `"gpt-3.5-turbo"`) | From environment/settings |
| `DeploymentName` | Azure OpenAI deployment name | From environment/settings |
| `Endpoint` | Custom endpoint URL | From environment/settings |
| `ApiKey` | API key for the provider | From environment/settings |
| `Temperature` | Randomness control (0.0-1.0) | 0.3 |
| `MaxTokens` | Maximum tokens in the response | 10000 |
| `HTTPTimeoutSeconds` | HTTP timeout in seconds | 300 (5 minutes) |
| `HistorySizeToFetch` | Number of history messages to fetch | 10 |
| `WelcomeMessage` | Message sent for null/empty prompts | `null` |
| `TokenLimit` | Maximum tokens before history reduction | 80000 |
| `TargetTokenCount` | Target tokens after reduction | 50000 |
| `MaxTokensPerFunctionResult` | Maximum tokens per function result | 10000 |
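For Azure OpenAI specifically, the table suggests combining `ProviderName`, `DeploymentName`, and `Endpoint`. A minimal sketch, in which the deployment name, endpoint, and API key are placeholders:

```csharp
// Hypothetical Azure OpenAI configuration assembled from the options above.
// The deployment name, endpoint, and API key are placeholders.
var azureOptions = new RouterOptions
{
    ProviderName = "azureopenai",
    DeploymentName = "my-gpt4-deployment",              // placeholder
    Endpoint = "https://my-resource.openai.azure.com/", // placeholder
    ApiKey = "your-azure-api-key"                       // placeholder
};

var response = await SemanticRouterHub.ChatCompletionAsync(prompt, systemInstruction, azureOptions);
```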
## Example Usage
Here's a practical example from a chat interceptor that analyzes assistant messages:
```csharp
// Basic usage
string prompt = @"Analyze the following assistant message and determine if it contains
a direct request for contract information: " + userMessage;
var analysis = await SemanticRouterHub.ChatCompletionAsync(prompt);

// With system instruction
string systemInstruction = "You are a helpful assistant that analyzes messages accurately.";
var detailedAnalysis = await SemanticRouterHub.ChatCompletionAsync(prompt, systemInstruction);

// With full options
var routerOptions = new RouterOptions { Temperature = 0.1, MaxTokens = 500 };
var preciseAnalysis = await SemanticRouterHub.ChatCompletionAsync(prompt, systemInstruction, routerOptions);
```
## Key Characteristics
- **Optional System Instructions**: Supports a system instruction to guide the AI's behavior. The parameter defaults to an empty string; when it is null or empty, the underlying implementation falls back to "You are a helpful assistant. Perform the user's request accurately and concisely."
- **No Chat History**: Each call is independent, with no conversation context carried between calls (see the sketch after this list)
- **No Function Calling**: Functions/tools are disabled for simple text responses
- **SemanticKernel Based**: Uses Microsoft's SemanticKernel framework internally
- **Provider Agnostic**: Supports OpenAI and Azure OpenAI providers
- **Token Management**: Built-in token limiting to prevent context length errors
- **Configurable Timeouts**: Adjustable HTTP timeouts for different use cases
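To make the independence of calls concrete, here is a hypothetical two-call sequence; because no conversation context is carried over, the second call cannot see what the first one said:

```csharp
// Hypothetical illustration of statelessness: each call stands alone.
var first = await SemanticRouterHub.ChatCompletionAsync("My name is Ada. Greet me by name.");

// This call has no memory of the previous one, so the model
// has no way of knowing the name "Ada".
var second = await SemanticRouterHub.ChatCompletionAsync("What is my name?");
```

If you need the model to remember earlier turns, use the full routing capabilities with chat history instead.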
## When to Use
Use `ChatCompletionAsync` when you need:

- Simple prompt-to-response interactions
- Text analysis or generation tasks
- Independent LLM calls without conversation context
- Quick LLM evaluations or classifications (see the sketch below)
- System-guided responses without conversation history
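For instance, a quick yes/no classification could be sketched as follows. The prompt, option values, and string handling are illustrative, and the sketch assumes the method returns the response text as a string (`userMessage` is the same contextual variable as in the interceptor example above):

```csharp
// Illustrative classifier sketch (assumes a string response).
// A near-zero Temperature keeps the answer deterministic, and a small
// MaxTokens keeps it short; both values are examples, not requirements.
var options = new RouterOptions { Temperature = 0.0, MaxTokens = 5 };

var verdict = await SemanticRouterHub.ChatCompletionAsync(
    $"Does the following message request contract information? Answer only YES or NO.\n\n{userMessage}",
    "You are a precise classifier. Answer with a single word.",
    options);

bool requestsContractInfo = verdict.Trim().StartsWith("YES", StringComparison.OrdinalIgnoreCase);
```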
For more complex scenarios involving system prompts, chat history, or function calling, use the full routing capabilities instead.