
Unlock the Power of AI: Creating Your First AI Agent Using Semantic Kernel

In my previous blog, we explored the concept of an AI Agent. For more details on what an AI Agent is and how to create your first one, refer to the blog Create Your First AI Agent with Azure AI Service.

We also delved into Multi-Agent systems. To learn more, check out the blog Kickstart Your Journey with Multi-Agent Systems: Build Your First Multi-Agent Using Python and Azure OpenAI

We created a Multi-Agent system and examined various options for building it without an orchestration layer, managing the agents manually.

In this post, we will dive into the Multi-Agent Orchestration Framework.

Multi-agent orchestration frameworks

Azure AI Agent Service works out-of-the-box with multi-agent orchestration frameworks that are wireline compatible with the Assistants API, such as AutoGen, a state-of-the-art research SDK for Python created by Microsoft Research, and Semantic Kernel, an enterprise AI SDK for Python, .NET, and Java.

When building a new multi-agent solution, start by building singleton agents with Azure AI Agent Service to get the most reliable, scalable, and secure agents.

You can then orchestrate these agents together. AutoGen is constantly evolving to find the best collaboration patterns for agents (and humans) to work together. Features that show production value with AutoGen can then be moved into Semantic Kernel if you’re looking for production support and non-breaking changes.


If you want to explore more on this, refer to this post from Microsoft:

Azure AI Agent Service: Revolutionizing AI Agent Development and Deployment

Note: In this post, we will get started with Semantic Kernel.

What is an Orchestration layer in Developing AI solutions?

Application Design for AI Workloads on Azure – Microsoft Azure Well-Architected Framework | Microsoft Learn

Orchestration and agents

For generative AI workloads, adopting an agent-based, or agentic, approach can enhance the extensibility of your orchestration. Agents offer context-specific functionality and share many traits with microservices, performing tasks alongside an orchestrator.

The orchestrator can either advertise tasks to a pool of agents or allow agents to register their capabilities. Both methods enable the orchestrator to dynamically determine how to divide and route queries among agents.

Agentic approaches are particularly effective when you have a common user interface with multiple, evolving features that can be integrated to add more skills and grounding data over time. For complex workloads involving numerous agents, it is more efficient to let agents collaborate dynamically rather than having an orchestrator break up and assign tasks.

Communication between the orchestrator and agents should follow a topic-queue pattern, where agents subscribe to a topic and the orchestrator sends out tasks via a queue. An agentic approach works best with an orchestration pattern.
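To make the topic-queue pattern concrete, here is a minimal sketch in C# using Azure Service Bus topics. The topic, subscription, and message names are hypothetical placeholders for illustration only; they are not part of any framework covered in this post.

using Azure.Messaging.ServiceBus;

// Hypothetical names, for illustration only.
await using var client = new ServiceBusClient("<service-bus-connection-string>");

// Orchestrator side: publish a task to the "agent-tasks" topic.
ServiceBusSender sender = client.CreateSender("agent-tasks");
await sender.SendMessageAsync(new ServiceBusMessage("Suggest a vegetarian meal")
{
    Subject = "meal-suggestion" // agents can filter on this label
});

// Agent side: subscribe to the topic and process matching tasks.
ServiceBusProcessor processor = client.CreateProcessor("agent-tasks", "meal-agent-subscription");
processor.ProcessMessageAsync += async args =>
{
    Console.WriteLine($"Agent received task: {args.Message.Body}");
    await args.CompleteMessageAsync(args.Message);
};
processor.ProcessErrorAsync += _ => Task.CompletedTask; // log errors in real code
await processor.StartProcessingAsync();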

Where does the Orchestration Layer fit into an AI workload?

Now that we know what these orchestration frameworks are and when to use them, let’s understand where they fit into the overall GenAI solution.

Typical architecture pattern and design areas

The diagram below shows a typical architecture pattern with its design areas and illustrates how data flows through the system from initial collection to final user interaction.

The architecture highlights the integration of different components to enable efficient data processing, model optimization, and real-time application deployment in AI-driven solutions. It includes modules such as data sources, data processing, model training, model deployment, and user interfaces.

We will be focusing on the Orchestration layer, but first let’s cover all the components.

To learn more, refer to AI workloads on Azure – Microsoft Azure Well-Architected Framework | Microsoft Learn

We will cover the other aspects of AI workload design and assessment some other day (in another blog, I mean 😊)!

Orchestration Layer Flow

Let’s take an example with the above diagram to understand how the orchestration layer works in a Generative AI (GenAI) solution using Retrieval-Augmented Generation (RAG):

  1. Model Deployment: The model is published to the inference hosting platform in advance.
  2. Client Invocation: The client application invokes the orchestration layer.
  3. Data Augmentation: The orchestration layer retrieves augmentation data to enhance the model’s performance.
  4. Context Retrieval: It gathers context from the user’s interaction history.
  5. External Services: If needed, the orchestration layer calls external services.
  6. Model Invocation: The orchestration layer invokes the model via the gateway.
  7. Request Forwarding: The gateway forwards the request to the model.
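To make these steps concrete, here is a minimal, hypothetical orchestration method in C#. The ISearchClient, IHistoryStore, and IModelGateway interfaces are placeholders invented to illustrate the flow; they do not correspond to a specific SDK.

// Hypothetical interfaces, invented to illustrate the flow only.
public interface ISearchClient { Task<string> RetrieveAsync(string query); }
public interface IHistoryStore { Task<string> GetContextAsync(string userId); }
public interface IModelGateway { Task<string> CompleteAsync(string prompt); }

public class RagOrchestrator
{
    private readonly ISearchClient _search;
    private readonly IHistoryStore _history;
    private readonly IModelGateway _gateway;

    public RagOrchestrator(ISearchClient search, IHistoryStore history, IModelGateway gateway)
        => (_search, _history, _gateway) = (search, history, gateway);

    // Steps 2-7 of the flow above (the model is already deployed, step 1).
    public async Task<string> HandleAsync(string userId, string query)
    {
        // 3. Data augmentation: retrieve grounding data for the query.
        string grounding = await _search.RetrieveAsync(query);

        // 4. Context retrieval: gather the user's interaction history.
        string context = await _history.GetContextAsync(userId);

        // 5. (External services would be called here if needed.)

        // 6./7. Build the prompt and invoke the model via the gateway,
        // which forwards the request to the deployed model.
        string prompt = $"Context:\n{context}\n\nGrounding data:\n{grounding}\n\nQuestion: {query}";
        return await _gateway.CompleteAsync(prompt);
    }
}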

Select resources for generative AI workloads (Scenario: RAG)

Resource selection recommendations for AI workloads on Azure – Cloud Adoption Framework | Microsoft Learn

Now we know where the Orchestration Layer fits into the overall GenAI design. We will explore resource selection recommendations for organizations running AI workloads on Azure (Azure PaaS).

Generative AI requires the combination of different resources to process and generate meaningful outputs based on input data. Proper selection ensures that generative AI applications, such as those using retrieval augmented generation (RAG), deliver accurate results by grounding AI models in relevant data.

  1. UI Layer: The user interface where users interact with the AI solution. Examples: ASP.NET, React, Angular
  2. Orchestration Layer: Manages the workflow and coordinates between the different components. Examples: Semantic Kernel, AutoGen, LangChain
  3. Search and Retrieval: Retrieves relevant data and information to support AI tasks. Examples: Azure AI Search, vector databases
  4. Data Sources: The repositories where data is stored and accessed. Examples: Azure SQL Database, Azure Cosmos DB, Azure Blob Storage
  5. GenAI Platform: The platform where AI models are developed, trained, and deployed. Examples: Azure AI Foundry, Azure OpenAI

What is Semantic Kernel?

Refer to this blog for all the details on what a multi-agent system is, why we need an orchestration layer, and what Semantic Kernel is: Kickstart Your Journey with Multi-Agent Systems: Build Your First Multi-Agent Using Python and Azure OpenAI

Semantic Kernel integrates LLMs like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java.

Using the SDK, developers can create “plugins” to interface with the LLMs and perform various tasks.

The Semantic Kernel SDK acts as a bridge between AI capabilities and traditional code, which helps simplify the process of developing AI-powered applications. Developers can easily utilize LLMs in their own applications without having to learn the intricacies of the model’s API.

The kernel is the central component of Semantic Kernel. It acts as a dependency injection container that manages all of the services and plugins needed to run your AI application, giving developers a centralized place to configure and monitor their AI agents. For example, suppose you invoke a prompt from the kernel. The kernel will perform the following actions:

  1. Select the best AI service to run the prompt.
  2. Build the prompt using the provided prompt template.
  3. Send the prompt to the AI service.
  4. Receive and parse the response.
  5. Return the response from the LLM to your application.

Throughout this entire process, you can create events and middleware that are triggered at any of these steps. This means you can perform actions like logging, providing status updates to users, and implementing responsible AI.
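In the C# SDK, one way to hook into these steps is a function invocation filter. The sketch below assumes the IFunctionInvocationFilter interface available in recent Semantic Kernel releases; treat the exact shape as version-dependent.

// A minimal logging filter, assuming the IFunctionInvocationFilter
// interface from recent Semantic Kernel (C#) releases.
public sealed class LoggingFilter : IFunctionInvocationFilter
{
    public async Task OnFunctionInvocationAsync(
        FunctionInvocationContext context,
        Func<FunctionInvocationContext, Task> next)
    {
        Console.WriteLine($"Invoking function: {context.Function.Name}");
        await next(context); // run the function (and any remaining filters)
        Console.WriteLine($"Function {context.Function.Name} completed");
    }
}

// Registered on the kernel builder, for example:
// builder.Services.AddSingleton<IFunctionInvocationFilter, LoggingFilter>();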


What is semantic kernel – Training | Microsoft Learn

We will focus here on how to get started with SK.

Getting started with Semantic Kernel development

Semantic Kernel SDK

Semantic Kernel is an SDK that integrates Large Language Models (LLMs) like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. Semantic Kernel achieves this by allowing you to define plugins that can be chained together in just a few lines of code.

GitHub url to get started with SK: microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps

Demo Overview

We will start with the Hello World demo and understand how to develop this program using SK. If you want to explore another example, check the Microsoft quickstart guide here: How to quickly start with Semantic Kernel | Microsoft Learn

The second example in this post revisits the scenario from our previous Multi-Agent post: yes, the famous 😊 FoodAgent and MealSuggestionAgent.

Demo1: Hello World using Semantic Kernel

Demo2: Multi-Agent (FoodAgent and MealSuggestionAgent) using Semantic Kernel

Demo1: Hello World using Semantic Kernel

Prerequisites:

  1. VSCode
  2. Create a new .NET Console project using this command:

dotnet new console

  3. Install the Semantic Kernel SDK:

dotnet add package Microsoft.SemanticKernel

Optionally, install Semantic Kernel Tools – Visual Studio Marketplace

Code

Below is the complete code for your first program using SK.

The code is in C# and uses the components listed below.

// 1. Import packages
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

using System.ComponentModel;
using System.Text.Json.Serialization;

using System.Diagnostics;

// 2. Populate values from your Azure OpenAI deployment
var modelId = "gpt-4o";
var endpoint = "https://openai-sk-demo.openai.azure.com/";
var apiKey = "<Your API Key>";

// 3. Create a kernel with Azure OpenAI chat completion
var builder = Kernel.CreateBuilder().AddAzureOpenAIChatCompletion(modelId, endpoint, apiKey);

// 4. Add enterprise components (add telemetry/logging)
builder.Services.AddLogging(services => services.AddConsole().SetMinimumLevel(LogLevel.Trace));

// 5. Build the kernel (the core Kernel object)
Kernel kernel = builder.Build();
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

// 6. Add the plugin for the Hello World demo
kernel.Plugins.AddFromType<HelloWorldChatbotPlugin>("HelloWorldChatbot");

// For Demo 2 (later in this post), also register the custom agent plugins:
//kernel.Plugins.AddFromType<FoodAgent>("FoodAgent");
//kernel.Plugins.AddFromType<MealSuggestionAgent>("MealSuggestionAgent");


// 7. Enable automatic function calling (planning)
var openAIPromptExecutionSettings = new OpenAIPromptExecutionSettings() 
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

// 8. Create a chat history to store the conversation
var history = new ChatHistory();

// Initiate a back-and-forth chat
string? userInput;
do {
    // Collect user input
    Console.Write("User > ");
    userInput = Console.ReadLine();

    // Exit when the user submits an empty message (or input ends)
    if (string.IsNullOrWhiteSpace(userInput)) { break; }

    // Add user input to the chat history
    history.AddUserMessage(userInput);

    // Get the response from the AI
    var result = await chatCompletionService.GetChatMessageContentAsync(
        history,
        executionSettings: openAIPromptExecutionSettings,
        kernel: kernel);

    // Print the results
    Console.WriteLine("Assistant > " + result);

    // Add the message from the agent to the chat history
    history.AddMessage(result.Role, result.Content ?? string.Empty);
} while (true);



public class HelloWorldChatbotPlugin
{
    [KernelFunction("greet")]
    public string Greet()
    {
        return "Hello Rajeev! How can I assist you today?";
    }

    [KernelFunction("bye")]
    public string Bye()
    {
        return "Goodbye Rajeev! Have a great day!";
    }
}

Output

Run the program, try a few prompts such as “Hello” and “Bye”, and observe the output.

Flow of the Program

  1. Initialization: The program initializes the kernel with Azure OpenAI chat completion and adds the custom intents plugin.
  2. User Input: The program enters a loop where it prompts the user for input.
  3. Processing Input: The user input is added to the chat history, and the chat completion service generates a response based on the input and the defined intents.
  4. Generating Response: The response from the AI is printed to the console, and the conversation continues until the user exits by providing no input.

This setup allows the chatbot to respond to specific intents like “Hello” and “Bye” with custom messages, making it interactive and useful.
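If you want to verify the plugin wiring without going through the chat loop, the kernel can also invoke a function directly by plugin and function name. A quick sketch, reusing the kernel built above:

// Invoke the "greet" function of the HelloWorldChatbot plugin directly.
// This bypasses the model entirely; handy for testing plugin registration.
var greeting = await kernel.InvokeAsync("HelloWorldChatbot", "greet");
Console.WriteLine(greeting);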

Explanation of the Code and Flow

1. Setting Up the Project

First, you need to create a new C# console application:

dotnet new console

Add the Semantic Kernel NuGet package to your project:

dotnet add package Microsoft.SemanticKernel

2. Imports and Initialization

The necessary namespaces are imported, and the Semantic Kernel is initialized with Azure OpenAI chat completion:

using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using System.ComponentModel;
using System.Text.Json.Serialization;
using System.Diagnostics;

3. Main Program

The program uses C# top-level statements as its entry point (there is no explicit Main method). It performs the following steps.

First, populate the model deployment name, endpoint, and API key from your Azure OpenAI resource:

var modelId = "<Your model name>";
var endpoint = "<Your endpoint>";
var apiKey = "<Your API Key>";

Create and build the kernel: initialize the kernel with Azure OpenAI chat completion, build it, register the plugin, enable automatic function calling, and run the chat loop.

// 3. Create a kernel with Azure OpenAI chat completion
var builder = Kernel.CreateBuilder().AddAzureOpenAIChatCompletion(modelId, endpoint, apiKey);

// 4. (Enterprise components such as logging are omitted here for brevity)

// 5. Build the kernel (the core Kernel object)
Kernel kernel = builder.Build();
var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

// 6. Add the plugin for the Hello World demo
kernel.Plugins.AddFromType<HelloWorldChatbotPlugin>("HelloWorldChatbot");

// 7. Enable automatic function calling (planning)
var openAIPromptExecutionSettings = new OpenAIPromptExecutionSettings() 
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

// 8. Create a chat history to store the conversation
var history = new ChatHistory();

// Initiate a back-and-forth chat
string? userInput;
do {
    // Collect user input
    Console.Write("User > ");
    userInput = Console.ReadLine();

    // Exit when the user submits an empty message (or input ends)
    if (string.IsNullOrWhiteSpace(userInput)) { break; }

    // Add user input to the chat history
    history.AddUserMessage(userInput);

    // Get the response from the AI
    var result = await chatCompletionService.GetChatMessageContentAsync(
        history,
        executionSettings: openAIPromptExecutionSettings,
        kernel: kernel);

    // Print the results
    Console.WriteLine("Assistant > " + result);

    // Add the message from the agent to the chat history
    history.AddMessage(result.Role, result.Content ?? string.Empty);
} while (true);

4. Custom Intents Plugin

The HelloWorldChatbotPlugin class defines two methods, Greet and Bye, which handle the “Hello” and “Bye” intents, respectively. These methods are annotated with [KernelFunction] to indicate that they can be invoked by the kernel.

public class HelloWorldChatbotPlugin
{
    [KernelFunction("greet")]
    public string Greet()
    {
        return "Hello Rajeev! How can I assist you today?";
    }

    [KernelFunction("bye")]
    public string Bye()
    {
        return "Goodbye Rajeev! Have a great day!";
    }
}

Demo2: Multi-Agent (FoodAgent and MealSuggestionAgent) using Semantic Kernel

Great job on completing your first code with Semantic Kernel (SK)!

Now, let’s dive into another example to explore how SK can be beneficial in enhancing your AI agent’s capabilities.

This will help you understand the practical applications and advantages of using SK in your projects.

Ready to get started?

Code

All the code remains the same; the only change you need to make is shown below:

Add Plugins for custom agents

// Add plugins for custom agents
kernel.Plugins.AddFromType<FoodAgent>("FoodAgent");
kernel.Plugins.AddFromType<MealSuggestionAgent>("MealSuggestionAgent");

Agent Code

public class FoodAgent
{
    [KernelFunction("get_food_info")]
    public string GetFoodInfo(string food)
    {
        // Simulate fetching food information
        return $"The food {food} is rich in vitamins and minerals.";
    }

    [KernelFunction("get_calories")]
    public string GetCalories(string food)
    {
        // Simulate fetching calorie information
        return $"The food {food} contains approximately 200 calories per serving.";
    }
}

public class MealSuggestionAgent
{
    [KernelFunction("suggest_meal")]
    public string SuggestMeal(string preference)
    {
        // Simulate suggesting a meal based on preference
        if (preference.Contains("vegetarian", StringComparison.OrdinalIgnoreCase))
        {
            return "How about a delicious vegetarian stir-fry with tofu and vegetables?";
        }
        else if (preference.Contains("low-carb", StringComparison.OrdinalIgnoreCase))
        {
            return "How about a grilled chicken salad with a variety of fresh greens?";
        }
        else
        {
            return "How about a classic spaghetti Bolognese with a side of garlic bread?";
        }
    }

    [KernelFunction("suggest_snack")]
    public string SuggestSnack()
    {
        // Simulate suggesting a snack
        return "How about some fresh fruit or a handful of nuts for a healthy snack?";
    }
}

Explanation of the Multi-Agent System

1. FoodAgent

The FoodAgent class handles tasks related to food information and calorie content. It has two functions: GetFoodInfo, which returns basic nutritional information for a food, and GetCalories, which returns its approximate calorie content.

2. MealSuggestionAgent

The MealSuggestionAgent class handles tasks related to meal and snack suggestions. It has two functions: SuggestMeal, which suggests a meal based on the user’s stated preference, and SuggestSnack, which suggests a healthy snack.

3. Main Program

The main program remains largely the same, but now it includes the addition of the FoodAgent and MealSuggestionAgent plugins:

kernel.Plugins.AddFromType<FoodAgent>("FoodAgent");

kernel.Plugins.AddFromType<MealSuggestionAgent>("MealSuggestionAgent");
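One practical note: the model chooses which function to invoke based on the function names and any descriptions exposed to it, so annotating your agent functions with [Description] (from System.ComponentModel) can improve routing. A small sketch of what that could look like on FoodAgent:

using System.ComponentModel;

public class FoodAgent
{
    // The description helps the model decide when to call this function.
    [KernelFunction("get_food_info")]
    [Description("Returns basic nutritional information about a given food.")]
    public string GetFoodInfo([Description("The name of the food")] string food)
    {
        return $"The food {food} is rich in vitamins and minerals.";
    }
}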

Behind the scenes:

When you run the program, it will behave as a multi-agent chatbot that can handle different tasks related to food information and meal suggestions. Here’s what you can expect:

  1. Prompt for Input: The program will continuously prompt you to enter a message by displaying “User > ” in the console.
  2. Process Input: After you type a message and press Enter, the program will process your input to determine the appropriate response based on the defined functions in the FoodAgent and MealSuggestionAgent.
  3. Invoke Functions: Depending on your input, the program will invoke the relevant function from the appropriate agent. For example:
    • If you ask for food information, it will use the GetFoodInfo function from the FoodAgent.
    • If you ask for a meal suggestion, it will use the SuggestMeal function from the MealSuggestionAgent.
  4. Respond: The chatbot will respond with the output from the invoked function. For example:
    • If you type “Tell me about apples”, it might respond with “The food apple is rich in vitamins and minerals.”
    • If you type “Suggest a vegetarian meal”, it might respond with “How about a delicious vegetarian stir-fry with tofu and vegetables?”
  5. Loop: This process will repeat, allowing you to continue the conversation and ask different questions until you enter an empty message (just press Enter without typing anything), which will break the loop and end the program.

Sample Interaction

Here’s a sample interaction to illustrate the behavior:

User > Tell me about apples

Assistant > The food apple is rich in vitamins and minerals.

User > How many calories are in a banana?

Assistant > The food banana contains approximately 200 calories per serving.

User > Suggest a vegetarian meal

Assistant > How about a delicious vegetarian stir-fry with tofu and vegetables?

User > Suggest a snack

Assistant > How about some fresh fruit or a handful of nuts for a healthy snack?

User >

In this interaction, the chatbot uses the FoodAgent to provide information about apples and bananas, and the MealSuggestionAgent to suggest a vegetarian meal and a snack.

Feel free to try different inputs and see how the chatbot responds based on the defined functions.

Understanding a specific user query

If you ask the chatbot “What should I eat right now?”, the program will process your input and determine which function to invoke based on the available agents and their functions.

Since this input is related to meal suggestions, the chatbot will likely use the MealSuggestionAgent to provide a response.

Here’s how the program might handle this input:

  1. User Input: “What should I eat right now?”
  2. Determine Intent: The program recognizes that this input is related to meal suggestions.
  3. Invoke Function: The program invokes the SuggestMeal function from the MealSuggestionAgent.
  4. Generate Response: The SuggestMeal function provides a meal suggestion based on the default logic or any specific preferences mentioned in the input.

Sample Response

Given the current implementation, the response might be something like:

User > What should I eat right now?

Assistant > How about a classic spaghetti Bolognese with a side of garlic bread?

The SuggestMeal function already checks for keywords like “vegetarian” and “low-carb” in the user’s input. If you want the chatbot to provide more personalized suggestions, you can extend it to handle additional scenarios, such as the time of day or other dietary preferences.
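For example, here is one possible extension that falls back to a time-of-day default when no preference keyword matches (the cutoff hours and suggested meals are arbitrary assumptions):

[KernelFunction("suggest_meal")]
public string SuggestMeal(string preference)
{
    if (preference.Contains("vegetarian", StringComparison.OrdinalIgnoreCase))
        return "How about a delicious vegetarian stir-fry with tofu and vegetables?";
    if (preference.Contains("low-carb", StringComparison.OrdinalIgnoreCase))
        return "How about a grilled chicken salad with a variety of fresh greens?";

    // Fall back to a time-of-day default (cutoff hours are arbitrary).
    int hour = DateTime.Now.Hour;
    if (hour < 11) return "How about some oatmeal with fresh fruit for breakfast?";
    if (hour < 16) return "How about a hearty sandwich with a side salad for lunch?";
    return "How about a classic spaghetti Bolognese with a side of garlic bread?";
}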

Conclusion

Creating AI agents with the Semantic Kernel SDK is a powerful way to leverage the capabilities of large language models in your applications. By following the steps outlined in this post, you can get started with SK and build reliable, scalable, and secure AI agents that can handle various tasks and provide valuable insights.

The integration of LLMs with traditional programming languages simplifies the development process, making it accessible to developers with different levels of expertise.

Whether you’re building a simple chatbot or a complex multi-agent system, the Semantic Kernel provides the tools and flexibility needed to bring your AI solutions to life.

Call to Action: 

Connect with me on LinkedIn Rajeev Singh | LinkedIn and be sure to like, comment, and share this post to help spread the knowledge!

What’s Next:

Feel free to explore the additional readings and labs listed below. I will be publishing more articles on this topic in the future.

Guided project – Create an AI travel agent – Training | Microsoft Learn

Semantic Kernel Samples:

semantic-kernel/python/samples at main · microsoft/semantic-kernel

Semantic Kernel Demo Applications:

semantic-kernel/python/samples/demos at main · microsoft/semantic-kernel

Guest Blog: Building Multi-Agent Systems with Multi-Models in Semantic Kernel – Part 1 | Semantic Kernel

References:

https://learn.microsoft.com/en-us/azure/well-architected/ai/get-started

https://learn.microsoft.com/en-us/azure/well-architected/ai/application-design

https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/platform/resource-selection

https://techcommunity.microsoft.com/blog/azure-ai-services-blog/introducing-azure-ai-agent-service/4298357

https://learn.microsoft.com/en-us/training/modules/build-your-kernel/2-what-semantic-kernel

https://learn.microsoft.com/en-us/semantic-kernel/get-started/quick-start-guide?pivots=programming-language-csharp
