AI Foundation Quick Start

Get mimOE running and make your first AI inference in minutes. One command installs everything and provisions a model, so you're ready to call the inference API.

System Requirements

macOS:

  • macOS 10.15 (Catalina) or later
  • Apple Silicon processor

Linux:

  • Ubuntu 22.04+ or equivalent distribution
  • x86_64 or ARM64 architecture
  • glibc 2.31 or later

Resource Requirements

mimOE itself is lightweight. RAM and disk requirements depend on the AI models you run. See Finding Models for model size guidelines. mimOE automatically uses available hardware acceleration (Metal on macOS, CUDA on NVIDIA GPUs, AVX2 on modern CPUs).

Step 1: Install

Run this one-liner in your terminal:

curl -L https://raw.githubusercontent.com/mimik-mimOE/mimOE-SE/main/install-mimOE-ai.sh | bash

This downloads the AI Foundation Package, installs it in the current directory, and starts mimOE automatically.

The script:

  • Downloads and installs the AI Foundation Package
  • Starts mimOE in the background (logs: logs/mimoe.log)
  • Provisions a default model (SmolLM2-360M, ~386MB)

Once complete, the API is available at http://localhost:8083 and you can run inference immediately.

Manual Install (Alternative)

If you prefer to install manually without the script:

  1. Download the mimOE runtime from mimOE-SE Releases
  2. Extract the archive
  3. Download the AI Foundation addon from mimOE-addon-ai-foundation Releases
  4. Place the .addon file in the addon/ directory
  5. Run the start script:
./start.sh
  6. Provision a model (see Upload Model Guide)

Step 2: Run Inference

Call the OpenAI-compatible chat completions API:

curl -X POST "http://localhost:8083/mimik-ai/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 1234" \
  -d '{
    "model": "smollm2-360m",
    "messages": [
      {
        "role": "user",
        "content": "Explain what edge computing is in one sentence."
      }
    ]
  }'

Expected response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1702742400,
  "model": "smollm2-360m",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Edge computing processes data locally on devices rather than sending it to centralized cloud servers."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 18,
    "total_tokens": 33
  }
}
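In application code you usually want just the assistant text and the token usage. A minimal sketch (plain Python, no SDK) pulling those fields out of a response shaped like the one above; the sample payload below is abbreviated, not a live server response:

```python
import json

# Sample response matching the shape shown above (abbreviated)
response_json = '''{
  "model": "smollm2-360m",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant",
                  "content": "Edge computing processes data locally."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 15, "completion_tokens": 18, "total_tokens": 33}
}'''

data = json.loads(response_json)
answer = data["choices"][0]["message"]["content"]   # the generated text
tokens = data["usage"]["total_tokens"]              # prompt + completion tokens
print(answer)
print(f"total tokens: {tokens}")
```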

Congratulations! You've just run AI inference completely on your device.

Step 3: Check Your Model

See what's loaded and how it's performing:

curl "http://localhost:8083/mimik-ai/openai/v1/models" \
  -H "Authorization: Bearer 1234"

Expected response:

{
  "data": [
    {
      "id": "smollm2-360m",
      "object": "model",
      "created": 1769534258,
      "owned_by": "mimik",
      "info": {
        "kind": "llm",
        "n_params": 361821120,
        "max_context": 2048,
        "n_embd": 960,
        "model_size": 384618240
      },
      "metrics": {
        "inference_count": 1,
        "last_used": 1769534258,
        "loaded_at": 1769534258,
        "tokens_per_second": 227.43,
        "avg_tokens_per_second": 227.43
      }
    }
  ],
  "object": "list"
}

The info object describes the model architecture and the metrics object tracks runtime performance. See the Inference API reference for the full field breakdown.
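For a quick sanity check, the raw info fields convert into human-readable figures. A small sketch using the sample values from the response above (conversions are decimal millions/MB, an assumption, not a documented unit):

```python
# Field values copied from the sample /models response above
info = {"n_params": 361821120, "max_context": 2048, "model_size": 384618240}

params_m = info["n_params"] / 1e6    # parameters, in millions
size_mb = info["model_size"] / 1e6   # on-disk size, in decimal MB

print(f"{params_m:.1f}M params, {size_mb:.1f} MB, "
      f"context window {info['max_context']} tokens")
```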

What Just Happened?

The install script automated everything:

  1. Downloaded mimOE: The runtime is running in the background on port 8083
  2. Installed AI addon: Two API services are now available:
     • Model Registry: http://localhost:8083/mimik-ai/store/v1
     • Inference: http://localhost:8083/mimik-ai/openai/v1
  3. Provisioned a model: SmolLM2-360M was downloaded and registered
  4. Ran inference: Your request was processed entirely on-device
  5. Checked your model: GET /models returned the loaded model with architecture details (info) and runtime metrics (metrics)

No cloud API calls, no external services, no data leaving your device.

Observability

The metrics object returned by GET /models gives you built-in observability for every loaded model. Track tokens_per_second, avg_tokens_per_second, and inference_count to monitor throughput and usage over time without any external tooling.
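If you sample GET /models periodically, you can summarize throughput yourself. A minimal sketch over a few sampled metrics objects (the first sample's values come from the response above; the later samples are illustrative, not real readings):

```python
# Successive metrics samples from GET /models (first value from the doc's
# sample response; the rest are hypothetical readings for illustration)
samples = [
    {"inference_count": 1, "tokens_per_second": 227.43},
    {"inference_count": 2, "tokens_per_second": 231.10},
    {"inference_count": 3, "tokens_per_second": 225.87},
]

rates = [s["tokens_per_second"] for s in samples]
avg = sum(rates) / len(rates)
print(f"requests: {samples[-1]['inference_count']}, "
      f"min {min(rates):.2f} / avg {avg:.2f} / max {max(rates):.2f} tok/s")
```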

Authentication

API requests require a bearer token. The default API key is 1234:

Authorization: Bearer 1234

Changing the API Key

To change the API key, edit the .ini file in the addon/ directory and restart mimOE. See Addon Configuration for details.

Try More Examples

Use the OpenAI SDK:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8083/mimik-ai/openai/v1",
    api_key="1234"
)

response = client.chat.completions.create(
    model="smollm2-360m",
    messages=[
        {"role": "user", "content": "Complete this sentence: AI is like a"}
    ]
)

print(response.choices[0].message.content)

Enable streaming:

curl -X POST "http://localhost:8083/mimik-ai/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 1234" \
  -d '{
    "model": "smollm2-360m",
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": true
  }'
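With stream: true, the response arrives as Server-Sent Events, each data: line carrying a JSON chunk with a delta fragment, ending with a [DONE] sentinel. A sketch of assembling the text from such lines, assuming the standard OpenAI streaming chunk shape (the sample lines below are illustrative, not captured output):

```python
import json

# Sample SSE lines in the OpenAI streaming format (assumed shape)
sse_lines = [
    'data: {"choices":[{"delta":{"content":"Once"}}]}',
    'data: {"choices":[{"delta":{"content":" upon"}}]}',
    'data: [DONE]',
]

pieces = []
for line in sse_lines:
    if not line.startswith("data: "):
        continue                       # skip comments / blank keep-alives
    payload = line[len("data: "):]
    if payload == "[DONE]":            # end-of-stream sentinel
        break
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:             # role-only deltas have no content
        pieces.append(delta["content"])

text = "".join(pieces)
print(text)  # → Once upon
```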

Managing mimOE

mimOE runs in the background. Use these commands to manage it:

# View logs
tail -f logs/mimoe.log

# Stop mimOE
pkill -f mimoe

# Restart mimOE
./start.sh > logs/mimoe.log 2>&1 &

Uninstallation

To uninstall, simply delete the installation directory. No system files are installed outside the package directory.

Troubleshooting

Permission Denied (macOS/Linux)

Symptom: "Permission denied" when running ./start.sh

Solution:

chmod +x start.sh

Port Already in Use

Symptom: "Port 8083 is already in use"

Solution: Stop the process using port 8083, or specify a different port when starting mimOE.
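To check whether something is already listening, a quick sketch using only the standard library (binding to the port fails with OSError if it is taken):

```python
import socket

def port_in_use(port: int) -> bool:
    """Return True if the port is already taken on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("127.0.0.1", port))
            return False   # bind succeeded: port was free
        except OSError:
            return True    # bind refused: something is listening

print("8083 in use" if port_in_use(8083) else "8083 is free")
```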

Missing Dependencies (Linux)

Symptom: Error about missing shared libraries

Solution:

# Ubuntu/Debian
sudo apt-get update && sudo apt-get install -y libgomp1 libstdc++6

# RHEL/CentOS
sudo yum install -y libgomp libstdc++

Windows Defender Alert

Symptom: Windows Defender blocks the executable

Solution: Add an exception:

  1. Open Windows Security
  2. Go to Virus & threat protection → Manage settings
  3. Add mimoe.exe to Exclusions

Model Download Fails

Symptom: Download returns an error or times out

Solution:

  1. Verify your internet connection
  2. Check that the model URL is correct
  3. Ensure you have enough disk space (~500MB free)

Slow First Inference

Symptom: First inference request takes 30+ seconds

Cause: Model loading into memory on first use

Solution: This is expected. Subsequent requests are faster. You can pre-load models to avoid user-facing latency.

Next Steps

API Reference

Custom Development

To create your own mims (microservices/AI agents) and bundles, see the Platform Guide.