How AI Agent mims Work
An AI Agent is more than a chatbot. It's an autonomous system that can reason, call functions, and take actions to accomplish tasks. This guide explains the architecture of an AI Agent mim and the concepts that make it work.
Why Build AI Agents as mims?
You can build AI agents with frameworks like LangChain. So why use mimOE + agent-kit SDK?
Dynamic Deployment
mims are deployed programmatically via the MCM API. Update, restart, or replace agents on remote devices without manual intervention.
Automatic Discovery
mims discover other nodes automatically. Your agent can reach MCP servers on other devices without hardcoding URLs. The mesh provides context about what's available.
Cross-Platform, Zero Setup
Can you run a LangChain agent on a mobile phone? Or deploy one seamlessly to an Nvidia Jetson? mimOE runs the same agent code across platforms without environment setup or dependency management.
Decoupled from UI
Most AI agents couple the agent logic with the UI (think CLI-based agents). mims enforce separation:
- A simple UI can discover AI agents on the network
- The UI points its chat endpoint to any agent on any node
- This enables always-on AI agents that multiple clients can connect to
Your agent becomes a service, not an application.
The Big Picture
When you build an AI Agent on mimOE, you're building a mim that orchestrates everything. Here's the architecture:
The AI Agent (your mim) sits at the center. It:
- Receives HTTP requests containing user prompts
- Calls the LLM for reasoning (inference)
- Calls MCP Tools to take actions (function calling)
- Retrieves context from Knowledge Sources (RAG, secured data)
- Returns HTTP responses with the agent's answer
What You Configure
Building an AI Agent mim means implementing four things:
1. API Handler
Your mim exposes HTTP endpoints. Implementing a standard protocol (like OpenAI chat completions or A2A) makes your agent compatible with existing clients and tooling:
// OpenAI-compatible chat completions endpoint
mimik.handle('POST', '/v1/chat/completions', async (request) => {
  const { messages } = request.body;
  const userMessage = messages[messages.length - 1].content;
  const response = await agent.run(userMessage);

  // Return OpenAI-compatible response
  return {
    status: 200,
    body: {
      choices: [{ message: { role: 'assistant', content: response } }]
    }
  };
});
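Because the handler follows the OpenAI chat completions shape, any client that already speaks that format can point at it unchanged. Illustratively, the request and response bodies for the handler above look like this (content abbreviated):

// Request body (OpenAI chat completions format)
{ "messages": [{ "role": "user", "content": "Find devices on my network" }] }

// Response body produced by the handler above (abbreviated)
{ "choices": [{ "message": { "role": "assistant", "content": "I found 2 devices: ..." } }] }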
2. Instructions
Instructions define who the agent is and how it behaves. This is where you encode the agent's logic in natural language:
You are a network assistant for smart home devices.
When users ask about devices:
1. Use discoverLocal to find devices on the network
2. Report what you find clearly
3. Offer to help with device-specific tasks
Be helpful, concise, and technical when needed.
Key insight: Instructions are logic. You don't write code to define behavior. You describe what you want, and the LLM figures out how to execute it using available tools.
3. MCP Endpoints
MCP (Model Context Protocol) endpoints tell the agent where to find tools:
mcpEndpoints: [
  'http://localhost:8080/superdrive/v1/mcp', // File tools
  'http://localhost:8080/network/v1/mcp',    // Network tools
]
The agent connects to these endpoints, discovers available tools, and uses them when needed.
4. Context Retrieval
Context retrieval logic defines how your agent fetches relevant information from knowledge sources and injects it into the instructions:
// Retrieve context from mkb (RAG)
const relevantDocs = await mkb.search(userQuery);

// Retrieve user data from edis
const userData = await edis.get(userId);

// Inject into instructions as external knowledge
const instructions = `You are a network assistant.
## External Knowledge
${relevantDocs}
## User Context
${JSON.stringify(userData)}
Use the knowledge above when answering questions.`;
The retrieved context becomes part of the agent's instructions, giving the LLM access to domain-specific knowledge, user data, or any information not in its training data.
Knowledge Sources available:
- mkb (mimik knowledge base): Lightweight database for RAG (Retrieval Augmented Generation)
- edis: Key-value database for secured data
What Powers It: agent-kit
@mimik/agent-kit provides the agentic loop that powers your AI Agent.
How the Loop Works
- User sends a message via HTTP request
- LLM reasons about the task and available tools
- LLM decides to call a function (tool)
- Function executes and returns results
- Results fed back to the LLM
- Loop continues until the task is complete
- Final response returned via HTTP
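Conceptually, the loop looks something like the sketch below. The helper names callLLM and executeTool are illustrative stand-ins, not agent-kit APIs; agent-kit runs this loop for you inside agent.run().

// Illustrative agentic loop (not agent-kit's actual internals)
async function agenticLoop(messages, tools) {
  while (true) {
    const reply = await callLLM(messages, tools);   // LLM reasons over history + tools
    if (!reply.tool_calls) return reply.content;    // no tool call: task is complete
    messages.push(reply);                           // keep the assistant turn in context
    for (const call of reply.tool_calls) {
      const result = await executeTool(call);       // your application executes the tool
      messages.push({                               // feed the result back to the LLM
        role: 'tool',
        tool_call_id: call.id,
        content: JSON.stringify(result)
      });
    }
  }
}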
This loop is what makes agents powerful. The LLM isn't just generating text; it's actively solving problems through function calling.
Why Loops Matter
Without the loop, an LLM can only respond based on its training data. With the loop:
- Dynamic information: Query APIs, databases, and live systems
- Multi-step tasks: Break complex problems into steps
- Error recovery: Try alternative approaches if something fails
- Real-world actions: Control devices, send messages, create files
Function Calling
Function calling (also called tool use) is the capability that enables LLMs to invoke external functions. This is the foundation of agentic AI.
When the LLM decides to call a function:
- It outputs structured data specifying which function and what arguments
- Your application executes the function
- Results are returned to the LLM
- The LLM continues reasoning with the new information
Example Flow
User: "Find devices on my network"
LLM thinks: "I should use the discoverLocal function"
LLM outputs: { function: "discoverLocal", args: { type: "linkLocal" } }
Function executes → Returns: [{ name: "Living Room Speaker" }, { name: "Kitchen Hub" }]
LLM thinks: "I found 2 devices, let me tell the user"
LLM outputs: "I found 2 devices on your network: Living Room Speaker and Kitchen Hub."
MCP: The Tool Protocol
MCP (Model Context Protocol) is the standard protocol for exposing functions to AI agents. It defines how agents discover and call tools.
How MCP Works
| Operation | Description |
|---|---|
| initialize | Establish connection, exchange capabilities |
| tools/list | Discover available tools and their schemas |
| tools/call | Execute a tool with arguments |
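On the wire, MCP is JSON-RPC 2.0, so these operations are plain JSON messages. A sketch of the exchange (payloads abbreviated):

// tools/list request
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

// tools/list response (abbreviated)
{ "jsonrpc": "2.0", "id": 1, "result": { "tools": [ { "name": "discoverLocal", ... } ] } }

// tools/call request
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": { "name": "discoverLocal", "arguments": { "type": "linkLocal" } }
}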
Tool Definition
Each tool has a name, description, and parameter schema:
{
  "name": "discoverLocal",
  "description": "Discover devices on the local network",
  "inputSchema": {
    "type": "object",
    "properties": {
      "type": {
        "type": "string",
        "description": "Discovery type: linkLocal or account"
      }
    },
    "required": ["type"]
  }
}
The LLM reads these descriptions to understand what tools are available and how to use them.
Multiple MCP Servers
You can connect to multiple MCP servers, each providing different capabilities.
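For example, one agent could draw file tools, network tools, and home-automation tools from three separate servers. The endpoints below are illustrative; the first two mirror those used earlier, and the third is a hypothetical server on another node:

const mcpEndpoints = [
  'http://localhost:8080/superdrive/v1/mcp', // file tools
  'http://localhost:8080/network/v1/mcp',    // network tools
  'http://192.168.1.40:8083/homectl/v1/mcp'  // hypothetical: tools hosted on another device
];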
Putting It Into Code
Here's how all four pieces come together in an AI Agent mim:
const { Agent } = require('@mimik/agent-kit');

// Get runtime info
const { httpPort } = global.context.info;
const { INFERENCE_API_KEY } = global.context.env;

// 2. Instructions - base behavior (will be extended with context)
const baseInstructions = `You are a network assistant.
When users ask about devices, use discoverLocal to find them.
Report findings clearly and offer to help with next steps.`;

// 3. MCP Endpoints - where tools come from (with security)
const mcpEndpoints = [
  // Simple endpoint - all tools allowed
  'http://localhost:8080/network/v1/mcp',

  // Endpoint with tool whitelist - only allow safe tools
  {
    url: 'http://localhost:8080/superdrive/v1/mcp',
    options: {
      toolWhitelist: ['readFile', 'listDirectory'],
      whitelistMode: 'include' // Only these tools allowed
    }
  }
];

// 1. API Handler - process HTTP request/response
mimik.handle('POST', '/chat', async (request) => {
  try {
    const { prompt, userContext } = request.body;

    // 4. Context Retrieval - inject dynamic context into instructions
    const instructions = `${baseInstructions}
## User Context
${JSON.stringify(userContext || {})}
Use the context above when answering questions.`;

    // Create agent with dynamic instructions
    const agent = new Agent({
      instructions,
      mcpEndpoints,
      llm: {
        endpoint: `http://127.0.0.1:${httpPort}/mimik-ai/openai/v1/chat/completions`,
        apiKey: `Bearer ${INFERENCE_API_KEY}`,
        model: 'qwen3-1.7b',
      },
      httpClient: global.http
    });

    // Run with runtime approval callback for additional security
    const stream = await agent.run(prompt, {
      toolApproval: async (toolCalls) => ({
        stopAfterExecution: false,
        approvals: toolCalls.map(tool => {
          // Block any destructive operations at runtime
          if (tool.function.name.includes('delete')) {
            return { approve: false, reason: 'Destructive operations require manual approval' };
          }
          return true;
        })
      })
    });

    let response = '';
    for await (const event of stream) {
      if (event.type === 'content_delta') {
        response += event.data.content;
      }
    }

    return { status: 200, body: { message: response } };
  } catch (error) {
    return { status: 500, body: { error: error.message } };
  }
});
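Once deployed, any HTTP client can drive this agent. Here is a hedged client-side sketch; the host, port, and base path are assumptions that depend on where the mim is deployed:

// Hypothetical client call; substitute your node's address and the mim's base path
const res = await fetch('http://localhost:8080/my-agent/v1/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: 'Find devices on my network',
    userContext: { name: 'Alex' }
  })
});
const { message } = await res.json();
console.log(message);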
What agent-kit Handles
| Concept | agent-kit Implementation |
|---|---|
| Instructions | instructions config option |
| Context | Automatic (manages history and tool results) |
| MCP | Connects to mcpEndpoints, discovers tools |
| Agentic Loop | agent.run() loops until complete |
| Streaming | Real-time events for responsive UX |
| Security | Tool whitelisting, approval callbacks |
agent-kit does not maintain conversation history across run() calls. For multi-turn conversations, store the message history externally and pass it to run() as an array of OpenAI-format messages:
// Store conversation history externally
const history = [];

// Add the user message
history.push({ role: 'user', content: userMessage });

// Pass the full history to the agent
const stream = await agent.run(history);
// ...collect `response` from the stream as shown earlier...

// Add the assistant response to the history
history.push({ role: 'assistant', content: response });
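Where that history lives is up to you. One hedged option is to persist it per conversation in edis; note that edis.set is an assumption here, mirroring the edis.get call shown earlier:

// Sketch: persist history in edis (assumes a set() counterpart to edis.get())
const key = `chat-history:${conversationId}`;
const history = (await edis.get(key)) || [];

history.push({ role: 'user', content: userMessage });
const stream = await agent.run(history);
// ...collect `response` from the stream as shown earlier...
history.push({ role: 'assistant', content: response });

await edis.set(key, history);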
Summary
| Component | What It Is | Your Responsibility |
|---|---|---|
| AI Agent | The mim you build | Configure the four pieces below |
| API Handler | HTTP request/response | Parse prompts, return responses |
| Instructions | System prompt | Define agent behavior in natural language |
| MCP Endpoints | Tool sources | Point to MCP servers with tools |
| Context Retrieval | Knowledge access logic | Fetch from mkb (RAG) or edis (data) |
| agent-kit | Agentic loop library | Just configure and call agent.run() |
| LLM | Reasoning engine | Configure endpoint and model |
| MCP Tools | Functions to call | Build or use existing MCP servers |
Next Steps
- Create an AI Agent: Build your first AI Agent mim step-by-step
- Agent Kit Reference: Full API documentation
- Multi-Agent Systems: Coordinate AI agents across devices using Mesh Foundation (tutorial coming soon)