How AI Agent mims Work

An AI Agent is more than a chatbot. It's an autonomous system that can reason, call functions, and take actions to accomplish tasks. This guide explains the architecture of an AI Agent mim and the concepts that make it work.


Why Build AI Agents as mims?

You can build AI agents with frameworks like LangChain. So why use mimOE + agent-kit SDK?

Dynamic Deployment

mims deploy programmatically via the MCM API. Update, restart, or replace agents on remote devices without manual intervention.

Automatic Discovery

mims discover other nodes automatically. Your agent can reach MCP servers on other devices without hardcoding URLs. The mesh provides context about what's available.

Cross-Platform, Zero Setup

Can you run a LangChain agent on a mobile phone? Or deploy seamlessly to an Nvidia Jetson? mimOE runs the same agent code across platforms without environment setup or dependency management.

Decoupled from UI

Most AI agents couple the agent logic with the UI (think CLI-based agents). mims enforce a clean separation:

  • A simple UI can discover AI agents on the network
  • The UI points its chat endpoint to any agent on any node
  • This enables always-on AI agents that multiple clients can connect to

Your agent becomes a service, not an application.

AI Agent mims as discoverable services - any UI can discover and connect


The Big Picture

When you build an AI Agent on mimOE, you're building a mim that orchestrates everything. Here's the architecture:

AI Agent mim architecture showing HTTP requests, LLM inference, MCP tools, and Knowledge Source

The AI Agent (your mim) sits at the center. It:

  • Receives HTTP requests containing user prompts
  • Calls the LLM for reasoning (inference)
  • Calls MCP Tools to take actions (function calling)
  • Retrieves context from Knowledge Sources (RAG, secured data)
  • Returns HTTP responses with the agent's answer

What You Configure

Building an AI Agent mim means implementing four things:

1. API Handler

Your mim exposes HTTP endpoints. Implementing a standard protocol (like OpenAI chat completions or A2A) makes your agent compatible with existing clients and tooling:

// OpenAI-compatible chat completions endpoint
mimik.handle('POST', '/v1/chat/completions', async (request) => {
  const { messages } = request.body;
  const userMessage = messages[messages.length - 1].content;

  const response = await agent.run(userMessage);

  // Return OpenAI-compatible response
  return {
    status: 200,
    body: {
      choices: [{ message: { role: 'assistant', content: response } }]
    }
  };
});
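
Because the endpoint follows the chat completions shape, any OpenAI-style client can call it. A minimal client-side sketch, assuming the mim is reachable at a placeholder base URL (substitute your node's actual address and mim path):

// Hypothetical client call to the agent's OpenAI-compatible endpoint.
// The base URL and path are placeholders, not the actual deployment address.
async function askAgent(prompt) {
  const res = await fetch('http://localhost:8083/my-agent/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages: [{ role: 'user', content: prompt }] })
  });
  const data = await res.json();
  return data.choices[0].message.content; // The agent's answer
}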

2. Instructions

Instructions define who the agent is and how it behaves. This is where you encode the agent's logic in natural language:

You are a network assistant for smart home devices.

When users ask about devices:
1. Use discoverLocal to find devices on the network
2. Report what you find clearly
3. Offer to help with device-specific tasks

Be helpful, concise, and technical when needed.

Key insight: Instructions are logic. You don't write code to define behavior. You describe what you want, and the LLM figures out how to execute it using available tools.

3. MCP Endpoints

MCP (Model Context Protocol) endpoints tell the agent where to find tools:

mcpEndpoints: [
  'http://localhost:8080/superdrive/v1/mcp', // File tools
  'http://localhost:8080/network/v1/mcp',    // Network tools
]

The agent connects to these endpoints, discovers available tools, and uses them when needed.

4. Context Retrieval

Context retrieval logic defines how your agent fetches relevant information from knowledge sources and injects it into the instructions:

// Retrieve context from mkb (RAG)
const relevantDocs = await mkb.search(userQuery);

// Retrieve user data from edis
const userData = await edis.get(userId);

// Inject into instructions as external knowledge
const instructions = `You are a network assistant.

## External Knowledge
${relevantDocs}

## User Context
${JSON.stringify(userData)}

Use the knowledge above when answering questions.`;

The retrieved context becomes part of the agent's instructions, giving the LLM access to domain-specific knowledge, user data, or any information not in its training data.

Knowledge Sources available:

  • mkb (mimik knowledge base): Lightweight database for RAG (Retrieval Augmented Generation)
  • edis: Key-value database for secured data

What Powers It: agent-kit

@mimik/agent-kit provides the agentic loop that powers your AI Agent:

Agentic Loop

How the Loop Works

  1. User sends a message via HTTP request
  2. LLM reasons about the task and available tools
  3. LLM decides to call a function (tool)
  4. Function executes and returns results
  5. Results fed back to the LLM
  6. Loop continues until the task is complete
  7. Final response returned via HTTP

This loop is what makes agents powerful. The LLM isn't just generating text; it's actively solving problems through function calling.
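
Conceptually, the loop looks something like the sketch below. This is not agent-kit's actual implementation, just an illustration of the control flow; callLLM and executeTool are hypothetical helpers standing in for the inference call and tool execution.

// Illustrative sketch of an agentic loop (not agent-kit's internals).
// callLLM and executeTool are hypothetical helpers.
async function agenticLoop(userMessage, tools) {
  const messages = [{ role: 'user', content: userMessage }];

  while (true) {
    const reply = await callLLM(messages, tools);  // 2. LLM reasons about the task
    messages.push(reply);

    if (!reply.tool_calls) {
      return reply.content;                        // 7. No tool needed: final answer
    }

    for (const call of reply.tool_calls) {         // 3. LLM decided to call functions
      const result = await executeTool(call);      // 4. Execute the tool
      messages.push({                              // 5. Feed results back to the LLM
        role: 'tool',
        tool_call_id: call.id,
        content: JSON.stringify(result)
      });
    }
    // 6. Loop continues until the LLM stops requesting tools
  }
}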

Why Loops Matter

Without the loop, an LLM can only respond based on its training data. With the loop:

  • Dynamic information: Query APIs, databases, and live systems
  • Multi-step tasks: Break complex problems into steps
  • Error recovery: Try alternative approaches if something fails
  • Real-world actions: Control devices, send messages, create files

Function Calling

Function calling (also called tool use) is the capability that enables LLMs to invoke external functions. This is the foundation of agentic AI.

When the LLM decides to call a function:

  1. It outputs structured data specifying which function and what arguments
  2. Your application executes the function
  3. Results are returned to the LLM
  4. The LLM continues reasoning with the new information

Example Flow

User: "Find devices on my network"

LLM thinks: "I should use the discoverLocal function"
LLM outputs: { function: "discoverLocal", args: { type: "linkLocal" } }

Function executes → Returns: [{ name: "Living Room Speaker" }, { name: "Kitchen Hub" }]

LLM thinks: "I found 2 devices, let me tell the user"
LLM outputs: "I found 2 devices on your network: Living Room Speaker and Kitchen Hub."
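
With an OpenAI-style chat completions API, that structured output arrives as tool_calls on the assistant message, with the arguments encoded as a JSON string. A minimal sketch of reading such a call (completion is assumed to be the result of a prior chat completions request; agent-kit's internals may differ):

// Sketch: reading an OpenAI-style tool call from a chat completion (illustrative only).
const message = completion.choices[0].message;

for (const call of message.tool_calls ?? []) {
  const name = call.function.name;                  // e.g. "discoverLocal"
  const args = JSON.parse(call.function.arguments); // e.g. { type: "linkLocal" }

  // Your code executes the matching function, then returns the result to the LLM
  // as a message with role "tool" and the call's id.
}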

MCP: The Tool Protocol

MCP (Model Context Protocol) is the standard protocol for exposing functions to AI agents. It defines how agents discover and call tools.

Model Context Protocol

How MCP Works

Operation    Description
initialize   Establish connection, exchange capabilities
tools/list   Discover available tools and their schemas
tools/call   Execute a tool with arguments
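
On the wire, these operations are JSON-RPC 2.0 messages. Here is a sketch of what a tools/call exchange might look like; the field names follow the MCP specification, but exact transport details depend on the server:

// Sketch of MCP JSON-RPC messages (shapes per the MCP spec; transport varies by server).

// Request: execute the discoverLocal tool with arguments
const toolCallRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'discoverLocal',
    arguments: { type: 'linkLocal' }
  }
};

// Response: tool output returned as content blocks
const toolCallResponse = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    content: [{ type: 'text', text: '[{"name":"Living Room Speaker"},{"name":"Kitchen Hub"}]' }]
  }
};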

Tool Definition

Each tool has a name, description, and parameter schema:

{
  "name": "discoverLocal",
  "description": "Discover devices on the local network",
  "inputSchema": {
    "type": "object",
    "properties": {
      "type": {
        "type": "string",
        "description": "Discovery type: linkLocal or account"
      }
    },
    "required": ["type"]
  }
}

The LLM reads these descriptions to understand what tools are available and how to use them.
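
A definition like this maps naturally onto the function-calling format the LLM expects. For an OpenAI-style chat completions API, the translated entry would look roughly like the sketch below (agent-kit handles the discovery and wiring for you; the exact format it uses internally may differ):

// Sketch: how an MCP tool definition could map onto an OpenAI-style `tools` entry.
const tools = [{
  type: 'function',
  function: {
    name: 'discoverLocal',
    description: 'Discover devices on the local network',
    parameters: {
      type: 'object',
      properties: {
        type: { type: 'string', description: 'Discovery type: linkLocal or account' }
      },
      required: ['type']
    }
  }
}];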

Multiple MCP Servers

You can connect to multiple MCP servers, each providing different capabilities:

Multiple MCP Servers


Putting It Into Code

Here's how all four pieces come together in an AI Agent mim:

const { Agent } = require('@mimik/agent-kit');

// Get runtime info
const { httpPort } = global.context.info;
const { INFERENCE_API_KEY } = global.context.env;

// 2. Instructions - base behavior (will be extended with context)
const baseInstructions = `You are a network assistant.
When users ask about devices, use discoverLocal to find them.
Report findings clearly and offer to help with next steps.`;

// 3. MCP Endpoints - where tools come from (with security)
const mcpEndpoints = [
  // Simple endpoint - all tools allowed
  'http://localhost:8080/network/v1/mcp',
  // Endpoint with tool whitelist - only allow safe tools
  {
    url: 'http://localhost:8080/superdrive/v1/mcp',
    options: {
      toolWhitelist: ['readFile', 'listDirectory'],
      whitelistMode: 'include' // Only these tools allowed
    }
  }
];

// 1. API Handler - process HTTP request/response
mimik.handle('POST', '/chat', async (request) => {
  try {
    const { prompt, userContext } = request.body;

    // 4. Context Retrieval - inject dynamic context into instructions
    const instructions = `${baseInstructions}

## User Context
${JSON.stringify(userContext || {})}

Use the context above when answering questions.`;

    // Create agent with dynamic instructions
    const agent = new Agent({
      instructions,
      mcpEndpoints,
      llm: {
        endpoint: `http://127.0.0.1:${httpPort}/mimik-ai/openai/v1/chat/completions`,
        apiKey: `Bearer ${INFERENCE_API_KEY}`,
        model: 'qwen3-1.7b',
      },
      httpClient: global.http
    });

    // Run with runtime approval callback for additional security
    const stream = await agent.run(prompt, {
      toolApproval: async (toolCalls) => ({
        stopAfterExecution: false,
        approvals: toolCalls.map(tool => {
          // Block any destructive operations at runtime
          if (tool.function.name.includes('delete')) {
            return { approve: false, reason: 'Destructive operations require manual approval' };
          }
          return true;
        })
      })
    });

    let response = '';
    for await (const event of stream) {
      if (event.type === 'content_delta') {
        response += event.data.content;
      }
    }

    return { status: 200, body: { message: response } };
  } catch (error) {
    return { status: 500, body: { error: error.message } };
  }
});

What agent-kit Handles

Concept        agent-kit Implementation
Instructions   instructions config option
Context        Automatic (manages history and tool results)
MCP            Connects to mcpEndpoints, discovers tools
Agentic Loop   agent.run() loops until complete
Streaming      Real-time events for responsive UX
Security       Tool whitelisting, approval callbacks

Multi-Turn Conversations

agent-kit does not maintain conversation history across run() calls. For multi-turn conversations, store the message history externally and pass it to run() as an array of OpenAI-format messages:

// Store conversation history externally
const history = [];

// Add user message
history.push({ role: 'user', content: userMessage });

// Pass full history to agent
const stream = await agent.run(history);

// Add assistant response to history
history.push({ role: 'assistant', content: response });
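
In a request handler, this usually means keying the history by a conversation or session ID. A minimal in-memory sketch (where you persist the history, for example in edis, is up to you; the helper below is illustrative):

// Sketch: per-conversation history kept in memory (persistence layer is up to you).
const conversations = new Map(); // conversationId -> array of OpenAI-format messages

async function handleTurn(conversationId, userMessage, agent) {
  const history = conversations.get(conversationId) || [];
  history.push({ role: 'user', content: userMessage });

  // Pass the full history to the agent
  const stream = await agent.run(history);

  let response = '';
  for await (const event of stream) {
    if (event.type === 'content_delta') response += event.data.content;
  }

  history.push({ role: 'assistant', content: response });
  conversations.set(conversationId, history);
  return response;
}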

Summary

Component           What It Is               Your Responsibility
AI Agent            The mim you build        Configure the four pieces below
API Handler         HTTP request/response    Parse prompts, return responses
Instructions        System prompt            Define agent behavior in natural language
MCP Endpoints       Tool sources             Point to MCP servers with tools
Context Retrieval   Knowledge access logic   Fetch from mkb (RAG) or edis (data)
agent-kit           Agentic loop library     Just configure and call agent.run()
LLM                 Reasoning engine         Configure endpoint and model
MCP Tools           Functions to call        Build or use existing MCP servers

Next Steps

  • Create an AI Agent: Build your first AI Agent mim step-by-step
  • Agent Kit Reference: Full API documentation
  • Multi-Agent Systems: Coordinate AI agents across devices using Mesh Foundation (tutorial coming soon)