How AI Agent mims Work
An AI Agent is more than a chatbot. It's an autonomous system that can reason, call functions, and take actions to accomplish tasks. This guide explains the architecture of an AI Agent mim and the concepts that make it work.
Why Build AI Agents as mims?
You can build AI agents with frameworks like LangChain. So why use mimOE + agent-kit SDK?
Dynamic Deployment
mims are deployed programmatically via the MCM API. Update, restart, or replace agents on remote devices without manual intervention.
Automatic Discovery
mims discover other nodes automatically. Your agent can reach MCP servers on other devices without hardcoding URLs. The mesh provides context about what's available.
Cross-Platform, Zero Setup
Can you run a LangChain agent on a mobile phone? Or deploy one seamlessly to an Nvidia Jetson? mimOE runs the same agent code across platforms without environment setup or dependency management.
Decoupled from UI
Most AI agents couple the agent logic with the UI (think CLI-based agents). mims enforce separation:
- A simple UI can discover AI agents on the network
- The UI points its chat endpoint to any agent on any node
- This enables always-on AI agents that multiple clients can connect to
Your agent becomes a service, not an application.
The Big Picture
When you build an AI Agent on mimOE, you're building a mim that orchestrates everything. Here's the architecture:
The AI Agent (your mim) sits at the center. It:
- Receives HTTP requests containing user prompts
- Calls the LLM for reasoning (inference)
- Calls MCP Tools to take actions (function calling)
- Retrieves context from Knowledge Sources (RAG, secured data)
- Returns HTTP responses with the agent's answer
What You Configure
Building an AI Agent mim means implementing four things:
1. API Handler
Your mim exposes HTTP endpoints. Implementing a standard protocol (like OpenAI chat completions or A2A) makes your agent compatible with existing clients and tooling:
// OpenAI-compatible chat completions endpoint
mimik.handle('POST', '/v1/chat/completions', async (request) => {
  const { messages } = request.body;
  const userMessage = messages[messages.length - 1].content;
  const response = await agent.run(userMessage);

  // Return OpenAI-compatible response
  return {
    status: 200,
    body: {
      choices: [{ message: { role: 'assistant', content: response } }]
    }
  };
});
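Because the handler follows the OpenAI chat completions shape, any client that already speaks that format can point at it unchanged. Illustratively, the request and response bodies for the handler above look like this (content abbreviated):

// Request body (OpenAI chat completions format)
{ "messages": [{ "role": "user", "content": "Find devices on my network" }] }

// Response body produced by the handler above (abbreviated)
{ "choices": [{ "message": { "role": "assistant", "content": "I found 2 devices: ..." } }] }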
2. Instructions
Instructions define who the agent is and how it behaves. This is where you encode the agent's logic in natural language:
You are a network assistant for smart home devices.
When users ask about devices:
1. Use discoverLocal to find devices on the network
2. Report what you find clearly
3. Offer to help with device-specific tasks
Be helpful, concise, and technical when needed.
Key insight: Instructions are logic. You don't write code to define behavior. You describe what you want, and the LLM figures out how to execute it using available tools.
3. MCP Endpoints
MCP (Model Context Protocol) endpoints tell the agent where to find tools:
mcpEndpoints: [
  'http://localhost:8080/superdrive/v1/mcp', // File tools
  'http://localhost:8080/network/v1/mcp',    // Network tools
]
The agent connects to these endpoints, discovers available tools, and uses them when needed.
4. Context Retrieval
Context retrieval logic defines how your agent fetches relevant information from knowledge sources and injects it into the instructions:
// Retrieve context from mkb (RAG)
const relevantDocs = await mkb.search(userQuery);

// Retrieve user data from edis
const userData = await edis.get(userId);

// Inject into instructions as external knowledge
const instructions = `You are a network assistant.
## External Knowledge
${relevantDocs}
## User Context
${JSON.stringify(userData)}
Use the knowledge above when answering questions.`;
The retrieved context becomes part of the agent's instructions, giving the LLM access to domain-specific knowledge, user data, or any information not in its training data.
Knowledge Sources available:
- mkb (mimik knowledge base): Lightweight database for RAG (Retrieval Augmented Generation)
- edis: Key-value database for secured data
What Powers It: agent-kit
@mimik/agent-kit provides the agentic loop that powers your AI Agent.
How the Loop Works
- User sends a message via HTTP request
- LLM reasons about the task and available tools
- LLM decides to call a function (tool)
- Function executes and returns results
- Results fed back to the LLM
- Loop continues until the task is complete
- Final response returned via HTTP
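Conceptually, the loop looks something like the sketch below. The helper names callLLM and executeTool are illustrative stand-ins, not agent-kit APIs; agent-kit runs this loop for you inside agent.run().

// Illustrative agentic loop (not agent-kit's actual internals)
async function agenticLoop(messages, tools) {
  while (true) {
    const reply = await callLLM(messages, tools);   // LLM reasons over history + tools
    if (!reply.tool_calls) return reply.content;    // no tool call: task is complete
    messages.push(reply);                           // keep the assistant turn in context
    for (const call of reply.tool_calls) {
      const result = await executeTool(call);       // your application executes the tool
      messages.push({                               // feed the result back to the LLM
        role: 'tool',
        tool_call_id: call.id,
        content: JSON.stringify(result)
      });
    }
  }
}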
This loop is what makes agents powerful. The LLM isn't just generating text; it's actively solving problems through function calling.
Why Loops Matter
Without the loop, an LLM can only respond based on its training data. With the loop:
- Dynamic information: Query APIs, databases, and live systems
- Multi-step tasks: Break complex problems into steps
- Error recovery: Try alternative approaches if something fails
- Real-world actions: Control devices, send messages, create files
Function Calling
Function calling (also called tool use) is the capability that enables LLMs to invoke external functions. This is the foundation of agentic AI.
When the LLM decides to call a function:
- It outputs structured data specifying which function and what arguments
- Your application executes the function
- Results are returned to the LLM
- The LLM continues reasoning with the new information
Example Flow
User: "Find devices on my network"
LLM thinks: "I should use the discoverLocal function"
LLM outputs: { function: "discoverLocal", args: { type: "linkLocal" } }
Function executes → Returns: [{ name: "Living Room Speaker" }, { name: "Kitchen Hub" }]
LLM thinks: "I found 2 devices, let me tell the user"
LLM outputs: "I found 2 devices on your network: Living Room Speaker and Kitchen Hub."
MCP: The Tool Protocol
MCP (Model Context Protocol) is the standard protocol for exposing functions to AI agents. It defines how agents discover and call tools.
How MCP Works
| Operation | Description |
|---|---|
| initialize | Establish connection, exchange capabilities |
| tools/list | Discover available tools and their schemas |
| tools/call | Execute a tool with arguments |
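On the wire, MCP is JSON-RPC 2.0, so these operations are plain JSON messages. A sketch of the exchange (payloads abbreviated):

// tools/list request
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

// tools/list response (abbreviated)
{ "jsonrpc": "2.0", "id": 1, "result": { "tools": [ { "name": "discoverLocal", ... } ] } }

// tools/call request
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": { "name": "discoverLocal", "arguments": { "type": "linkLocal" } }
}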
Tool Definition
Each tool has a name, description, and parameter schema:
{
  "name": "discoverLocal",
  "description": "Discover devices on the local network",
  "inputSchema": {
    "type": "object",
    "properties": {
      "type": {
        "type": "string",
        "description": "Discovery type: linkLocal or account"
      }
    },
    "required": ["type"]
  }
}
The LLM reads these descriptions to understand what tools are available and how to use them.
Multiple MCP Servers
You can connect to multiple MCP servers, each providing different capabilities.
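For example, one agent could draw file tools, network tools, and home-automation tools from three separate servers. The endpoints below are illustrative; the first two mirror those used earlier, and the third is a hypothetical server on another node:

const mcpEndpoints = [
  'http://localhost:8080/superdrive/v1/mcp', // file tools
  'http://localhost:8080/network/v1/mcp',    // network tools
  'http://192.168.1.40:8083/homectl/v1/mcp'  // hypothetical: tools hosted on another device
];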
Putting It Into Code
Here's how all four pieces come together in an AI Agent mim:
const { Agent } = require('@mimik/agent-kit');

// Get runtime info
const { httpPort } = global.context.info;
const { INFERENCE_API_KEY } = global.context.env;

// 2. Instructions - base behavior (will be extended with context)
const baseInstructions = `You are a network assistant.
When users ask about devices, use discoverLocal to find them.
Report findings clearly and offer to help with next steps.`;

// 3. MCP Endpoints - where tools come from (with security)
const mcpEndpoints = [
  // Simple endpoint - all tools allowed
  'http://localhost:8080/network/v1/mcp',

  // Endpoint with tool whitelist - only allow safe tools
  {
    url: 'http://localhost:8080/superdrive/v1/mcp',
    options: {
      toolWhitelist: ['readFile', 'listDirectory'],
      whitelistMode: 'include' // Only these tools allowed
    }
  }
];

// 1. API Handler - process HTTP request/response
mimik.handle('POST', '/chat', async (request) => {
  try {
    const { prompt, userContext } = request.body;

    // 4. Context Retrieval - inject dynamic context into instructions
    const instructions = `${baseInstructions}
## User Context
${JSON.stringify(userContext || {})}
Use the context above when answering questions.`;

    // Create agent with dynamic instructions
    const agent = new Agent({
      instructions,
      mcpEndpoints,
      llm: {
        endpoint: `http://127.0.0.1:${httpPort}/mimik-ai/openai/v1/chat/completions`,
        apiKey: `Bearer ${INFERENCE_API_KEY}`,
        model: 'qwen3-1.7b',
      },
      httpClient: global.http
    });

    // Run with runtime approval callback for additional security
    const stream = await agent.run(prompt, {
      toolApproval: async (toolCalls) => ({
        stopAfterExecution: false,
        approvals: toolCalls.map(tool => {
          // Block any destructive operations at runtime
          if (tool.function.name.includes('delete')) {
            return { approve: false, reason: 'Destructive operations require manual approval' };
          }
          return true;
        })
      })
    });

    let response = '';
    for await (const event of stream) {
      if (event.type === 'content_delta') {
        response += event.data.content;
      }
    }

    return { status: 200, body: { message: response } };
  } catch (error) {
    return { status: 500, body: { error: error.message } };
  }
});
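Once deployed, any HTTP client can drive this agent. Here is a hedged client-side sketch; the host, port, and base path are assumptions that depend on where the mim is deployed:

// Hypothetical client call; substitute your node's address and the mim's base path
const res = await fetch('http://localhost:8080/my-agent/v1/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: 'Find devices on my network',
    userContext: { name: 'Alex' }
  })
});
const { message } = await res.json();
console.log(message);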
What agent-kit Handles
| Concept | agent-kit Implementation |
|---|---|
| Instructions | instructions config option |
| Context | Automatic (manages history and tool results) |
| MCP | Connects to mcpEndpoints, discovers tools |
| Agentic Loop | agent.run() loops until complete |
| Streaming | Real-time events for responsive UX |
| Security | Tool whitelisting, approval callbacks |
agent-kit does not maintain conversation history across run() calls. For multi-turn conversations, store the message history externally and pass it to run() as an array of OpenAI-format messages:
// Store conversation history externally
const history = [];

// Add the user message
history.push({ role: 'user', content: userMessage });

// Pass the full history to the agent
const stream = await agent.run(history);
// ...collect `response` from the stream as shown earlier...

// Add the assistant response to the history
history.push({ role: 'assistant', content: response });
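Where that history lives is up to you. One hedged option is to persist it per conversation in edis; note that edis.set is an assumption here, mirroring the edis.get call shown earlier:

// Sketch: persist history in edis (assumes a set() counterpart to edis.get())
const key = `chat-history:${conversationId}`;
const history = (await edis.get(key)) || [];

history.push({ role: 'user', content: userMessage });
const stream = await agent.run(history);
// ...collect `response` from the stream as shown earlier...
history.push({ role: 'assistant', content: response });

await edis.set(key, history);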
Summary
| Component | What It Is | Your Responsibility |
|---|---|---|
| AI Agent | The mim you build | Configure the four pieces below |
| API Handler | HTTP request/response | Parse prompts, return responses |
| Instructions | System prompt | Define agent behavior in natural language |
| MCP Endpoints | Tool sources | Point to MCP servers with tools |
| Context Retrieval | Knowledge access logic | Fetch from mkb (RAG) or edis (data) |
| agent-kit | Agentic loop library | Just configure and call agent.run() |
| LLM | Reasoning engine | Configure endpoint and model |
| MCP Tools | Functions to call | Build or use existing MCP servers |
Next Steps
- Create an AI Agent: Build your first AI Agent mim step-by-step
- Agent Kit Reference: Full API documentation
- Multi-Agent Systems: Coordinate AI agents across devices using Mesh Foundation (tutorial coming soon)