
Model Registry API

The Model Registry API (mModelStore) provides model storage and management capabilities for the AI Foundation Package. This service allows you to register, upload, download, and manage AI models in GGUF and ONNX formats.

Base URL

http://localhost:8083/mimik-ai/store/v1
Info: The Model Registry service runs as part of the AI Foundation Package. The default port is 8083.

Authentication

Mutating operations (POST, PUT, DELETE) require a Bearer token in the Authorization header:

Authorization: Bearer 1234

The default API key is 1234, configured in the [mmodelstore-v1] section of the addon .ini file. See Addon Configuration for details.

Read operations (GET) do not require authentication.
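
In client code, this rule can be captured in a single helper. A minimal sketch (the `auth_headers` name is hypothetical; `1234` is the documented default key):

```python
# Bearer auth is required only for mutating Model Registry operations.
MUTATING_METHODS = {"POST", "PUT", "DELETE"}

def auth_headers(method: str, api_key: str = "1234") -> dict:
    """Headers for a Model Registry request: Bearer token only for mutating methods."""
    if method.upper() in MUTATING_METHODS:
        return {"Authorization": f"Bearer {api_key}"}
    return {}
```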

Quick Reference

| Method | Endpoint | Description |
|---|---|---|
| GET | /models | List all models |
| POST | /models | Create model metadata |
| PUT | /models | Update model configuration |
| GET | /models/{id} | Get model details |
| DELETE | /models/{id} | Delete model |
| POST | /models/{id}/upload | Upload model file |
| POST | /models/{id}/download | Download model from URL |

Two-Step Provisioning

Model provisioning follows a two-step process:

  1. Create metadata: Register the model with its configuration
  2. Provision file: Either upload directly or download from URL

This separation allows you to configure the model before the file transfer, and supports both local uploads and remote downloads.


Model Kinds

The Model Registry supports four model kinds:

| Kind | Description | File Format | Use Case |
|---|---|---|---|
| llm | Large Language Model | GGUF | Text generation, chat, reasoning |
| vlm | Vision Language Model | GGUF + mmproj | Multimodal (text + images) |
| embed | Embedding Model | GGUF | Text embeddings, semantic search |
| onnx | ONNX Model | ONNX | Image classification, predictive AI |

Endpoints

Create Model

Create a new model entry with metadata. The model file is provisioned separately.

Request

POST /models

Headers

| Header | Required | Value |
|---|---|---|
| Content-Type | Yes | application/json |
| Authorization | Yes | Bearer <token> |

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Unique model identifier |
| version | string | Yes | Model version (metadata only) |
| kind | string | Yes | Model type: llm, vlm, embed, onnx |
| gguf | object | No | GGUF configuration (for llm, vlm, embed) |
| onnx | object | No | ONNX configuration (for onnx kind) |

GGUF Configuration

| Field | Type | Description |
|---|---|---|
| chatTemplateHint | string | Chat template format (see supported values below) |
| initContextSize | integer | Context window size for model initialization |
| initGpuLayerSize | integer | GPU layers to offload during initialization |

ONNX Configuration

| Field | Type | Description |
|---|---|---|
| executionProvider | string | Execution provider: cpu, cuda, coreml, tensorrt |

Example: Create LLM

curl -X POST "http://localhost:8083/mimik-ai/store/v1/models" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 1234" \
  -d '{
    "id": "smollm2-360m",
    "version": "1.0.0",
    "kind": "llm",
    "gguf": {
      "chatTemplateHint": "chatml",
      "initContextSize": 2048,
      "initGpuLayerSize": 99
    }
  }'

Response (201 Created)

{
  "id": "smollm2-360m",
  "version": "1.0.0",
  "kind": "llm",
  "readyToUse": false,
  "createdAt": 1729591200000,
  "gguf": {
    "chatTemplateHint": "chatml",
    "initContextSize": 2048,
    "initGpuLayerSize": 99
  }
}

Idempotent Operation: Creating a model that already exists updates its metadata and returns 200 OK instead of 201 Created.


Upload Model File

Upload a model file directly via multipart form data. This is Step 2a of provisioning.

Request

POST /models/{id}/upload

Headers

| Header | Required | Value |
|---|---|---|
| Content-Type | Yes | multipart/form-data |
| Authorization | Yes | Bearer <token> |

Form Data

| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | The model file (.gguf or .onnx) |

Example

curl -X POST "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/upload" \
  -H "Authorization: Bearer 1234" \
  -F "file=@SmolLM2-360M-Instruct-Q8_0.gguf"

Response (200 OK)

{
  "id": "smollm2-360m",
  "version": "1.0.0",
  "kind": "llm",
  "readyToUse": true,
  "totalSize": 386000000,
  "createdAt": 1729591200000,
  "gguf": {
    "chatTemplateHint": "chatml",
    "initContextSize": 2048,
    "initGpuLayerSize": 99
  }
}
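
If you are scripting the upload without curl, the multipart body can be assembled with the standard library alone. A minimal sketch (the boundary string and the `build_multipart` helper are illustrative, not prescribed by the API):

```python
import io

def build_multipart(field: str, filename: str, data: bytes,
                    boundary: str = "mimik-upload-boundary") -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    buf = io.BytesIO()
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(f'Content-Disposition: form-data; name="{field}"; '
              f'filename="{filename}"\r\n'.encode())
    buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
    buf.write(data)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"
```

The returned content type (including the boundary) would be sent as the `Content-Type` header of the POST.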

Download Model File

Download a model file from a URL. This is Step 2b of provisioning. Returns a Server-Sent Events stream with download progress.

Request

POST /models/{id}/download

Headers

| Header | Required | Value |
|---|---|---|
| Content-Type | Yes | application/json |
| Authorization | Yes | Bearer <token> |
| Accept | No | text/event-stream (for SSE progress) |

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to download the model file |
| mmprojUrl | string | No | URL for VLM multimodal projection file |

Example: Download LLM

curl -X POST "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m/download" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 1234" \
  -d '{
    "url": "https://huggingface.co/lmstudio-community/SmolLM2-360M-Instruct-GGUF/resolve/main/SmolLM2-360M-Instruct-Q8_0.gguf?download=true"
  }'

SSE Response Stream

data: {"size": 100000000, "totalSize": 386000000}

data: {"size": 250000000, "totalSize": 386000000}

data: {"size": 386000000, "totalSize": 386000000}

data: {"done": true, "model": {"id": "smollm2-360m", "readyToUse": true, ...}}
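
A client can track progress by parsing each `data:` line as JSON. A minimal sketch (the `parse_progress` helper is hypothetical; it skips events it does not recognize, such as VLM `mmproj` progress):

```python
import json

def parse_progress(sse_line: str):
    """Parse one 'data: {...}' SSE line into (percent_complete, done), or None."""
    if not sse_line.startswith("data: "):
        return None
    event = json.loads(sse_line[len("data: "):])
    if event.get("done"):
        return (100.0, True)
    if "size" in event and "totalSize" in event:
        pct = 100.0 * event["size"] / event["totalSize"]
        return (round(pct, 1), False)
    return None  # e.g. a VLM "mmproj" progress event; not handled here
```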

Example: Download VLM with mmproj

curl -X POST "http://localhost:8083/mimik-ai/store/v1/models/llava-1.6/download" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 1234" \
  -d '{
    "url": "https://example.com/llava-model.gguf",
    "mmprojUrl": "https://example.com/llava-mmproj.gguf"
  }'

VLM downloads include mmproj progress:

data: {"size": 1048576000, "totalSize": 1048576000}
data: {"mmproj": {"size": 26214400, "totalSize": 52428800}}
data: {"done": true, "model": {...}}

Cancel Download: To cancel a download in progress, disconnect from the SSE stream. The server detects the disconnection, stops the download, and removes partial files.


List Models

Retrieve all models in the registry.

Request

GET /models

Query Parameters

| Parameter | Type | Description |
|---|---|---|
| kind | string | Filter by model type: llm, vlm, embed, onnx |
| ready | boolean | Filter by ready status: true or false |

Example

# List all models
curl "http://localhost:8083/mimik-ai/store/v1/models"

# List only LLM models
curl "http://localhost:8083/mimik-ai/store/v1/models?kind=llm"

# List only ready models
curl "http://localhost:8083/mimik-ai/store/v1/models?ready=true"

# Combine filters
curl "http://localhost:8083/mimik-ai/store/v1/models?kind=llm&ready=true"

Response (200 OK)

{
  "data": [
    {
      "id": "smollm2-360m",
      "version": "1.0.0",
      "kind": "llm",
      "readyToUse": true,
      "totalSize": 386000000,
      "createdAt": 1729591200000,
      "gguf": {
        "chatTemplateHint": "chatml",
        "initContextSize": 2048,
        "initGpuLayerSize": 99
      }
    },
    {
      "id": "nomic-embed-text",
      "version": "1.0.0",
      "kind": "embed",
      "readyToUse": true,
      "totalSize": 274000000,
      "createdAt": 1729591300000,
      "gguf": {
        "initContextSize": 8192
      }
    }
  ]
}

Get Model Details

Retrieve detailed information about a specific model.

Request

GET /models/{id}

Query Parameters

| Parameter | Value | Description |
|---|---|---|
| alt | media | Download the model file instead of metadata |

Example

# Get model metadata
curl "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m"

# Download model file
curl "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m?alt=media" -o model.gguf

Response (200 OK)

{
  "id": "smollm2-360m",
  "version": "1.0.0",
  "kind": "llm",
  "readyToUse": true,
  "totalSize": 386000000,
  "createdAt": 1729591200000,
  "gguf": {
    "chatTemplateHint": "chatml",
    "initContextSize": 2048,
    "initGpuLayerSize": 99
  }
}

Update Model

Update model configuration (metadata fields only).

Request

PUT /models

Headers

| Header | Required | Value |
|---|---|---|
| Content-Type | Yes | application/json |
| Authorization | Yes | Bearer <token> |

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Model identifier to update |
| action | string | Yes | Must be "update" |
| gguf | object | No | Updated GGUF configuration |
| onnx | object | No | Updated ONNX configuration |

Example

curl -X PUT "http://localhost:8083/mimik-ai/store/v1/models" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 1234" \
  -d '{
    "id": "smollm2-360m",
    "action": "update",
    "gguf": {
      "initContextSize": 4096,
      "initGpuLayerSize": 99
    }
  }'

Delete Model

Remove a model from the registry. This deletes the model entry and associated files.

Request

DELETE /models/{id}

Headers

| Header | Required | Value |
|---|---|---|
| Authorization | Yes | Bearer <token> |

Example

curl -X DELETE "http://localhost:8083/mimik-ai/store/v1/models/smollm2-360m" \
  -H "Authorization: Bearer 1234"

Permanent Deletion: Deleting a model removes all associated files and cancels any in-progress downloads. This cannot be undone.


Model Schema

Full Model Object

{
  "id": "smollm2-360m",
  "version": "1.0.0",
  "kind": "llm",
  "readyToUse": true,
  "totalSize": 386000000,
  "createdAt": 1729591200000,
  "gguf": {
    "chatTemplateHint": "chatml",
    "initContextSize": 2048,
    "initGpuLayerSize": 99
  }
}

Field Descriptions

| Field | Type | Description |
|---|---|---|
| id | string | Unique model identifier |
| version | string | Model version (metadata only, not part of unique key) |
| kind | string | Model type: llm, vlm, embed, onnx |
| readyToUse | boolean | Whether the model file is provisioned and ready |
| totalSize | integer | File size in bytes (set by system after upload/download) |
| createdAt | integer | Creation timestamp in milliseconds (set by system) |
| gguf | object | GGUF configuration (for llm, vlm, embed kinds) |
| onnx | object | ONNX configuration (for onnx kind) |

ID Format Rules

  • Allowed characters: alphanumeric, dash, underscore, dot
  • Pattern: ^[a-zA-Z0-9._-]+$
  • Maximum length: 255 characters
  • Cannot start with . or -
  • Cannot contain ..

| Example | Valid | Notes |
|---|---|---|
| smollm2-360m | Yes | Standard format |
| llama-3.2-1b | Yes | With dots |
| my_model_v1 | Yes | With underscores |
| ../etc/passwd | No | Path traversal blocked |
| .hidden | No | Cannot start with dot |
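
The format rules can be checked client-side before calling the API. A sketch (the `is_valid_model_id` helper is hypothetical):

```python
import re

# Allowed characters per the documented pattern
ID_PATTERN = re.compile(r"^[a-zA-Z0-9._-]+$")

def is_valid_model_id(model_id: str) -> bool:
    """Check a model id against the documented format rules."""
    return (
        0 < len(model_id) <= 255
        and ID_PATTERN.fullmatch(model_id) is not None
        and not model_id.startswith((".", "-"))
        and ".." not in model_id
    )
```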

Supported Chat Template Hints

For GGUF models (llm, vlm, embed), the chatTemplateHint field specifies the chat format:

| Value | Models |
|---|---|
| chatml | Many fine-tuned models |
| llama2 | Llama 2 family |
| llama3 | Llama 3 family |
| phi3 | Phi-3 family |
| mistral-v1, mistral-v3, mistral-v7 | Mistral family |
| gemma | Gemma family |
| deepseek, deepseek2, deepseek3 | DeepSeek family |
| command-r | Cohere Command-R |
| falcon3 | Falcon 3 |
| zephyr | Zephyr models |
| vicuna, vicuna-orca | Vicuna family |
| openchat | OpenChat models |

Choosing Chat Template: Match chatTemplateHint to your model. Using the wrong template may result in poor-quality responses or formatting issues.


Error Responses

| Code | Description |
|---|---|
| 400 | Bad request (invalid input parameter) |
| 401 | Unauthorized (missing authentication) |
| 403 | Forbidden (invalid API key) |
| 404 | Not found (model does not exist) |
| 500 | Internal server error |

Error Format

{
  "error": {
    "code": 404,
    "message": "Model 'unknown-model' not found"
  }
}
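
Since every error follows this shape, a client can unwrap it uniformly. A sketch (the `parse_error` helper is hypothetical):

```python
import json

def parse_error(body: str) -> tuple[int, str]:
    """Extract (code, message) from a Model Registry error response body."""
    err = json.loads(body)["error"]
    return err["code"], err["message"]
```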