Model Registry API
The Model Registry API (mModelStore) provides model storage and management capabilities for the AI Foundation Package. This service allows you to register, upload, download, and manage AI models in GGUF and ONNX formats.
Base URL
http://localhost:8083/mimik-ai/store/v1
The Model Registry service runs as part of the AI Foundation Package. The default port is 8083.
Authentication
Mutating operations (POST, PUT, DELETE) require a Bearer token in the Authorization header:
Authorization: Bearer 1234
The default API key is 1234, configured in the [mmodelstore-v1] section of the addon .ini file. See Addon Configuration for details.
Read operations (GET) do not require authentication.
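For example, a requests session can carry the token for all mutating calls while plain GETs skip it (a minimal sketch; the model id in the DELETE call is a placeholder):

import requests

BASE_URL = "http://localhost:8083/mimik-ai/store/v1"

session = requests.Session()
session.headers.update({"Authorization": "Bearer 1234"})  # needed for POST/PUT/DELETE

# Read operations work without the header
print(requests.get(f"{BASE_URL}/models").status_code)

# Mutating operations go through the authenticated session
print(session.delete(f"{BASE_URL}/models/example-model").status_code)  # placeholder id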
Quick Reference
| Method | Endpoint | Description |
|---|---|---|
| GET | /models | List all models |
| POST | /models | Create model metadata |
| PUT | /models | Update model configuration |
| GET | /models/{id} | Get model details |
| DELETE | /models/{id} | Delete model |
| POST | /models/{id}/upload | Upload model file |
| POST | /models/{id}/download | Download model from URL |
Two-Step Provisioning
Model provisioning follows a two-step process:
- Create metadata: register the model with its configuration
- Provision the file: either upload it directly or have the service download it from a URL
This separation allows you to configure the model before the file transfer, and supports both local uploads and remote downloads.
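As a rough end-to-end sketch in Python (reusing the model id and file name from the examples later on this page), the two steps look like this:

import requests

BASE_URL = "http://localhost:8083/mimik-ai/store/v1"
AUTH = {"Authorization": "Bearer 1234"}

# Step 1: create the metadata entry
requests.post(
    f"{BASE_URL}/models",
    headers={**AUTH, "Content-Type": "application/json"},
    json={
        "id": "smollm2-360m",
        "version": "1.0.0",
        "kind": "llm",
        "gguf": {"chatTemplateHint": "chatml", "initContextSize": 2048},
    },
).raise_for_status()

# Step 2: provision the file (direct upload shown; download-from-URL works too)
with open("SmolLM2-360M-Instruct-Q8_0.gguf", "rb") as f:
    model = requests.post(
        f"{BASE_URL}/models/smollm2-360m/upload",
        headers=AUTH,
        files={"file": f},
    ).json()

print(model["readyToUse"])  # true once the file is provisioned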
Model Kinds
The Model Registry supports four model kinds:
| Kind | Description | File Format | Use Case |
|---|---|---|---|
| llm | Large Language Model | GGUF | Text generation, chat, reasoning |
| vlm | Vision Language Model | GGUF + mmproj | Multimodal (text + images) |
| embed | Embedding Model | GGUF | Text embeddings, semantic search |
| onnx | ONNX Model | ONNX | Image classification, predictive AI |
Endpoints
Create Model
Create a new model entry with metadata. The model file is provisioned separately.
Request
POST /models
Headers
| Header | Required | Value |
|---|---|---|
| Content-Type | Yes | application/json |
| Authorization | Yes | Bearer <token> |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Unique model identifier |
| version | string | Yes | Model version (metadata only) |
| kind | string | Yes | Model type: llm, vlm, embed, onnx |
| gguf | object | No | GGUF configuration (for llm, vlm, embed) |
| onnx | object | No | ONNX configuration (for onnx kind) |
GGUF Configuration
| Field | Type | Description |
|---|---|---|
| chatTemplateHint | string | Chat template format (see supported values below) |
| initContextSize | integer | Context window size for model initialization |
| initGpuLayerSize | integer | GPU layers to offload during initialization |
ONNX Configuration
| Field | Type | Description |
|---|---|---|
| executionProvider | string | Execution provider: cpu, cuda, coreml, tensorrt |
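The example below creates an LLM; for the onnx kind, the request body carries the onnx object instead of gguf. A sketch of such a body (the id and provider values are illustrative):

# Hypothetical request body for an ONNX model entry
onnx_model_body = {
    "id": "mobilenet-v2",  # placeholder identifier
    "version": "1.0.0",
    "kind": "onnx",
    "onnx": {"executionProvider": "cpu"},
}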
Example: Create LLM
- cURL
- JavaScript
- Python
curl -X POST "http://localhost:8083/mimik-ai/store/v1/models" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer 1234" \
-d '{
"id": "smollm2-360m",
"version": "1.0.0",
"kind": "llm",
"gguf": {
"chatTemplateHint": "chatml",
"initContextSize": 2048,
"initGpuLayerSize": 99
}
}'
const response = await fetch('http://localhost:8083/mimik-ai/store/v1/models', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer 1234'
},
body: JSON.stringify({
id: 'smollm2-360m',
version: '1.0.0',
kind: 'llm',
gguf: {
chatTemplateHint: 'chatml',
initContextSize: 2048,
initGpuLayerSize: 99
}
})
});
const model = await response.json();
console.log('Model created:', model.id);
import requests
response = requests.post(
"http://localhost:8083/mimik-ai/store/v1/models",
headers={
"Content-Type": "application/json",
"Authorization": "Bearer 1234"
},
json={
"id": "smollm2-360m",
"version": "1.0.0",
"kind": "llm",
"gguf": {
"chatTemplateHint": "chatml",
"initContextSize": 2048,
"initGpuLayerSize": 99
}
}
)
model = response.json()
print(f"Model created: {model['id']}")
Response (201 Created)
{
"id": "smollm2-360m",
"version": "1.0.0",
"kind": "llm",
"readyToUse": false,
"createdAt": 1729591200000,
"gguf": {
"chatTemplateHint": "chatml",
"initContextSize": 2048,
"initGpuLayerSize": 99
}
}
Creating a model that already exists updates the metadata (returns 200 OK instead of 201 Created).
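A client that needs to distinguish the two outcomes can branch on the status code; a small sketch (omitting the optional gguf block):

import requests

response = requests.post(
    "http://localhost:8083/mimik-ai/store/v1/models",
    headers={"Content-Type": "application/json", "Authorization": "Bearer 1234"},
    json={"id": "smollm2-360m", "version": "1.0.0", "kind": "llm"},
)
if response.status_code == 201:
    print("Model created")
elif response.status_code == 200:
    print("Existing model metadata updated")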
Upload Model File
Upload a model file directly via multipart form data. This is Step 2a of provisioning.
Request
POST /models/{id}/upload
Headers
| Header | Required | Value |
|---|---|---|
| Content-Type | Yes | multipart/form-data |
| Authorization | Yes | Bearer <token> |
Form Data
| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | The model file (.gguf or .onnx) |
Example
- cURL
- JavaScript
- Python
curl -X POST "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/upload" \
-H "Authorization: Bearer 1234" \
-F "file=@SmolLM2-360M-Instruct-Q8_0.gguf"
const formData = new FormData();
formData.append('file', fileInput.files[0]);
const response = await fetch('http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/upload', {
method: 'POST',
headers: {
'Authorization': 'Bearer 1234'
},
body: formData
});
const model = await response.json();
console.log('Model uploaded, ready:', model.readyToUse);
import requests
with open("SmolLM2-360M-Instruct-Q8_0.gguf", "rb") as f:
    response = requests.post(
        "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/upload",
        headers={"Authorization": "Bearer 1234"},
        files={"file": f}
    )
model = response.json()
print(f"Model uploaded, ready: {model['readyToUse']}")
Response (200 OK)
{
"id": "smollm2-360m",
"version": "1.0.0",
"kind": "llm",
"readyToUse": true,
"totalSize": 386000000,
"createdAt": 1729591200000,
"gguf": {
"chatTemplateHint": "chatml",
"initContextSize": 2048,
"initGpuLayerSize": 99
}
}
Download Model File
Download a model file from a URL. This is Step 2b of provisioning. Returns a Server-Sent Events stream with download progress.
Request
POST /models/{id}/download
Headers
| Header | Required | Value |
|---|---|---|
| Content-Type | Yes | application/json |
| Authorization | Yes | Bearer <token> |
| Accept | No | text/event-stream (for SSE progress) |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to download the model file |
| mmprojUrl | string | No | URL for VLM multimodal projection file |
Example: Download LLM
- cURL
- JavaScript
curl -X POST "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/download" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer 1234" \
-d '{
"url": "https://huggingface.co/lmstudio-community/SmolLM2-360M-Instruct-GGUF/resolve/main/SmolLM2-360M-Instruct-Q8_0.gguf?download=true"
}'
const response = await fetch('http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/download', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer 1234'
},
body: JSON.stringify({
url: 'https://huggingface.co/lmstudio-community/SmolLM2-360M-Instruct-GGUF/resolve/main/SmolLM2-360M-Instruct-Q8_0.gguf?download=true'
})
});
// Handle SSE stream for progress
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value);
const lines = chunk.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = JSON.parse(line.slice(6));
if (data.done) {
console.log('Download complete:', data.model);
} else {
const percent = ((data.size / data.totalSize) * 100).toFixed(1);
console.log(`Progress: ${percent}%`);
}
}
}
}
SSE Response Stream
data: {"size": 100000000, "totalSize": 386000000}
data: {"size": 250000000, "totalSize": 386000000}
data: {"size": 386000000, "totalSize": 386000000}
data: {"done": true, "model": {"id": "smollm2-360m", "readyToUse": true, ...}}
Example: Download VLM with mmproj
curl -X POST "http://localhost:8083/mimik-ai/store/v1/models/llava-1.6/download" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer 1234" \
-d '{
"url": "https://example.com/llava-model.gguf",
"mmprojUrl": "https://example.com/llava-mmproj.gguf"
}'
VLM downloads include mmproj progress:
data: {"size": 1048576000, "totalSize": 1048576000}
data: {"mmproj": {"size": 26214400, "totalSize": 52428800}}
data: {"done": true, "model": {...}}
To cancel a download in progress, disconnect from the SSE stream. The server detects disconnection, stops the download, and removes partial files.
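With the Python sketch above, for example, closing the streaming response is enough to drop the SSE connection. The snippet below is illustrative only and uses a placeholder download URL; it cancels after the first progress event:

import requests

response = requests.post(
    "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/download",
    headers={"Content-Type": "application/json", "Authorization": "Bearer 1234"},
    json={"url": "https://example.com/model.gguf"},  # placeholder URL
    stream=True,
)

# Disconnect after the first progress event to demonstrate cancellation
for line in response.iter_lines(decode_unicode=True):
    if line and line.startswith("data: "):
        response.close()  # the server stops the download and removes partial files
        break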
List Models
Retrieve all models in the registry.
Request
GET /models
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| kind | string | Filter by model type: llm, vlm, embed, onnx |
| ready | boolean | Filter by ready status: true or false |
Example
- cURL
- JavaScript
- Python
# List all models
curl "http://localhost:8083/mimik-ai/store/v1/models"
# List only LLM models
curl "http://localhost:8083/mimik-ai/store/v1/models?kind=llm"
# List only ready models
curl "http://localhost:8083/mimik-ai/store/v1/models?ready=true"
# Combine filters
curl "http://localhost:8083/mimik-ai/store/v1/models?kind=llm&ready=true"
// List all models
const response = await fetch('http://localhost:8083/mimik-ai/store/v1/models');
const data = await response.json();
console.log(`Found ${data.data.length} models`);
data.data.forEach(model => {
console.log(`${model.id} (${model.kind}): ready=${model.readyToUse}`);
});
// Filter by kind
const llmResponse = await fetch('http://localhost:8083/mimik-ai/store/v1/models?kind=llm');
const llmModels = await llmResponse.json();
import requests
# List all models
response = requests.get("http://localhost:8083/mimik-ai/store/v1/models")
data = response.json()
print(f"Found {len(data['data'])} models")
for model in data['data']:
    print(f"{model['id']} ({model['kind']}): ready={model['readyToUse']}")
Response (200 OK)
{
"data": [
{
"id": "smollm2-360m",
"version": "1.0.0",
"kind": "llm",
"readyToUse": true,
"totalSize": 386000000,
"createdAt": 1729591200000,
"gguf": {
"chatTemplateHint": "chatml",
"initContextSize": 2048,
"initGpuLayerSize": 99
}
},
{
"id": "nomic-embed-text",
"version": "1.0.0",
"kind": "embed",
"readyToUse": true,
"totalSize": 274000000,
"createdAt": 1729591300000,
"gguf": {
"initContextSize": 8192
}
}
]
}
Get Model Details
Retrieve detailed information about a specific model.
Request
GET /models/{id}
Query Parameters
| Parameter | Value | Description |
|---|---|---|
| alt | media | Download the model file instead of metadata |
Example
- cURL
- JavaScript
- Python
# Get model metadata
curl "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m"
# Download model file
curl "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m?alt=media" -o model.gguf
const response = await fetch('http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m');
if (response.ok) {
const model = await response.json();
console.log(`Model: ${model.id}`);
console.log(`Kind: ${model.kind}`);
console.log(`Ready: ${model.readyToUse}`);
console.log(`Size: ${(model.totalSize / 1024 / 1024).toFixed(0)} MB`);
} else {
console.error('Model not found');
}
import requests
response = requests.get("http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m")
if response.status_code == 200:
    model = response.json()
    print(f"Model: {model['id']}")
    print(f"Kind: {model['kind']}")
    print(f"Ready: {model['readyToUse']}")
    print(f"Size: {model['totalSize'] / 1024 / 1024:.0f} MB")
else:
    print("Model not found")
Response (200 OK)
{
"id": "smollm2-360m",
"version": "1.0.0",
"kind": "llm",
"readyToUse": true,
"totalSize": 386000000,
"createdAt": 1729591200000,
"gguf": {
"chatTemplateHint": "chatml",
"initContextSize": 2048,
"initGpuLayerSize": 99
}
}
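The cURL tab above fetches the file with alt=media; a comparable Python sketch that streams it to disk (the output file name is arbitrary):

import requests

with requests.get(
    "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m?alt=media",
    stream=True,
) as response:
    response.raise_for_status()
    with open("model.gguf", "wb") as f:
        for chunk in response.iter_content(chunk_size=1024 * 1024):
            f.write(chunk)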
Update Model
Update model configuration (metadata fields only).
Request
PUT /models
Headers
| Header | Required | Value |
|---|---|---|
Content-Type | Yes | application/json |
Authorization | Yes | Bearer <token> |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Model identifier to update |
| action | string | Yes | Must be "update" |
| gguf | object | No | Updated GGUF configuration |
| onnx | object | No | Updated ONNX configuration |
Example
curl -X PUT "http://localhost:8083/mimik-ai/store/v1/models" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer 1234" \
-d '{
"id": "smollm2-360m",
"action": "update",
"gguf": {
"initContextSize": 4096,
"initGpuLayerSize": 99
}
}'
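Only a cURL example is shown for this endpoint; an equivalent request in Python might look like:

import requests

response = requests.put(
    "http://localhost:8083/mimik-ai/store/v1/models",
    headers={"Content-Type": "application/json", "Authorization": "Bearer 1234"},
    json={
        "id": "smollm2-360m",
        "action": "update",
        "gguf": {"initContextSize": 4096, "initGpuLayerSize": 99},
    },
)
print(response.status_code)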
Delete Model
Remove a model from the registry. This deletes the model entry and associated files.
Request
DELETE /models/{id}
Headers
| Header | Required | Value |
|---|---|---|
| Authorization | Yes | Bearer <token> |
Example
- cURL
- JavaScript
- Python
curl -X DELETE "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m" \
-H "Authorization: Bearer 1234"
const response = await fetch('http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m', {
method: 'DELETE',
headers: {
'Authorization': 'Bearer 1234'
}
});
if (response.ok) {
console.log('Model deleted successfully');
}
import requests
response = requests.delete(
"http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m",
headers={"Authorization": "Bearer 1234"}
)
if response.ok:
    print("Model deleted successfully")
Deleting a model removes all associated files and cancels any in-progress downloads. This cannot be undone.
Model Schema
Full Model Object
{
"id": "smollm2-360m",
"version": "1.0.0",
"kind": "llm",
"readyToUse": true,
"totalSize": 386000000,
"createdAt": 1729591200000,
"gguf": {
"chatTemplateHint": "chatml",
"initContextSize": 2048,
"initGpuLayerSize": 99
}
}
Field Descriptions
| Field | Type | Description |
|---|---|---|
| id | string | Unique model identifier |
| version | string | Model version (metadata only, not part of unique key) |
| kind | string | Model type: llm, vlm, embed, onnx |
| readyToUse | boolean | Whether the model file is provisioned and ready |
| totalSize | integer | File size in bytes (set by system after upload/download) |
| createdAt | integer | Creation timestamp in milliseconds (set by system) |
| gguf | object | GGUF configuration (for llm, vlm, embed kinds) |
| onnx | object | ONNX configuration (for onnx kind) |
ID Format Rules
- Allowed characters: alphanumeric, dash, underscore, dot
- Pattern: ^[a-zA-Z0-9._-]+$
- Maximum length: 255 characters
- Cannot start with . or -
- Cannot contain ..
| Example | Valid | Notes |
|---|---|---|
| smollm2-360m | Yes | Standard format |
| llama-3.2-1b | Yes | With dots |
| my_model_v1 | Yes | With underscores |
| ../etc/passwd | No | Path traversal blocked |
| .hidden | No | Cannot start with dot |
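Clients can pre-validate identifiers before calling the API; the sketch below mirrors the rules listed above (the service itself remains the source of truth):

import re

ID_PATTERN = re.compile(r"^[a-zA-Z0-9._-]+$")

def is_valid_model_id(model_id: str) -> bool:
    # Mirror the documented ID rules client-side
    return (
        bool(ID_PATTERN.match(model_id))
        and len(model_id) <= 255
        and not model_id.startswith((".", "-"))
        and ".." not in model_id
    )

print(is_valid_model_id("smollm2-360m"))   # True
print(is_valid_model_id("../etc/passwd"))  # False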
Supported Chat Template Hints
For GGUF models (llm, vlm, embed), the chatTemplateHint field specifies the chat format:
| Value | Models |
|---|---|
| chatml | Many fine-tuned models |
| llama2 | Llama 2 family |
| llama3 | Llama 3 family |
| phi3 | Phi-3 family |
| mistral-v1, mistral-v3, mistral-v7 | Mistral family |
| gemma | Gemma family |
| deepseek, deepseek2, deepseek3 | DeepSeek family |
| command-r | Cohere Command-R |
| falcon3 | Falcon 3 |
| zephyr | Zephyr models |
| vicuna, vicuna-orca | Vicuna family |
| openchat | OpenChat models |
Match the chatTemplateHint to your model. Using the wrong template may result in poor-quality responses or formatting issues.
Error Responses
| Code | Description |
|---|---|
| 400 | Bad request (invalid input parameter) |
| 401 | Unauthorized (missing authentication) |
| 403 | Forbidden (invalid API key) |
| 404 | Not found (model does not exist) |
| 500 | Internal server error |
Error Format
{
"error": {
"code": 404,
"message": "Model 'unknown-model' not found"
}
}
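A client can surface these errors uniformly by reading the code and message fields of the error body shown above; a small sketch using a deliberately unknown model id:

import requests

response = requests.get("http://localhost:8083/mimik-ai/store/v1/models/unknown-model")
if not response.ok:
    error = response.json().get("error", {})
    print(f"Request failed ({error.get('code')}): {error.get('message')}")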