LLiMa CLI

Use the llima CLI on Modalix to manage precompiled models and do simple runtime testing. It is useful for checking that a model loads, accepts prompts, and produces output before you integrate it with Neat Framework direct APIs or the Neat GenAI server endpoints.

Model Manager

LLiMa includes a model manager through the llima CLI. It lets you search, download, list, remove, and run precompiled models directly from the command line. Models are stored under /media/nvme/llima/models by default. Set LLIMA_MODELS_PATH to use a different models directory.
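As a minimal sketch of the directory rule above, a script can resolve the active models directory with the same fallback. The helper name llima_models_dir is hypothetical and is not part of the llima CLI. The sketch also assumes that an empty LLIMA_MODELS_PATH behaves like an unset one, which the text above does not confirm.

```shell
#!/bin/sh
# Hypothetical helper (not part of llima): print the models directory,
# using LLIMA_MODELS_PATH when it is set and the documented default otherwise.
# Assumption: an empty LLIMA_MODELS_PATH is treated the same as an unset one.
llima_models_dir() {
    printf '%s\n' "${LLIMA_MODELS_PATH:-/media/nvme/llima/models}"
}

# Show which directory applies in the current shell session.
echo "Models directory: $(llima_models_dir)"
```

To point llima at a different directory, export LLIMA_MODELS_PATH in the same shell before running any llima command.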

Browse available models:

modalix:~$ llima search
modalix:~$ llima search qwen

Download a model by name, without the simaai/ organization prefix:

modalix:~$ llima pull Qwen3-VL-4B-Instruct-GPTQ-a16w4
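If a model reference was copied with the organization prefix, it can be trimmed before it is passed to llima pull. The sketch below uses standard shell parameter expansion. The function name strip_org_prefix is hypothetical, and the script only prints the resulting command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of llima): drop a leading "simaai/" so the
# remaining text is the bare model name that llima pull expects.
# Names without the prefix are returned unchanged.
strip_org_prefix() {
    printf '%s\n' "${1#simaai/}"
}

model_ref="simaai/Qwen3-VL-4B-Instruct-GPTQ-a16w4"
model_name="$(strip_org_prefix "$model_ref")"

# Print the command instead of running it, so the sketch works anywhere.
echo "llima pull $model_name"
```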

List and remove locally installed models:

modalix:~$ llima list
modalix:~$ llima rm Qwen3-VL-4B-Instruct-GPTQ-a16w4

Running LLiMa

Use llima run as a simple runtime for initial model validation on Modalix.

modalix:~$ llima run <model> [options]

Argument	Description
`model`	Model ID or path (e.g., `Qwen3-VL-8B-Instruct-a16w4`).
`--stt_model_path`	Path to the ELF files for a Speech-to-Text model (optional).

For all available options, run llima run -h.

Examples

modalix:~$ llima run Qwen3-VL-4B-Instruct-GPTQ-a16w4
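For repeated validation, the pull and run steps can be wrapped in a small script. This is a sketch under stated assumptions, not a documented workflow: it assumes each installed model is stored in a subdirectory named after the model inside the models directory, and that llima pull exits with a nonzero status on failure. Neither is confirmed here, so check the layout against the output of llima list first. The function names model_installed and validate_model are hypothetical.

```shell
#!/bin/sh
# Assumption: an installed model is a subdirectory, named after the model,
# inside the models directory. Confirm this on your system with `llima list`.
model_installed() {
    models_dir="${LLIMA_MODELS_PATH:-/media/nvme/llima/models}"
    [ -d "$models_dir/$1" ]
}

# Pull the model only when it appears to be missing, then start llima run.
# Assumption: `llima pull` returns a nonzero exit status when it fails.
validate_model() {
    model="$1"
    if ! model_installed "$model"; then
        llima pull "$model" || return 1
    fi
    llima run "$model"
}
```

After sourcing the script on Modalix, calling validate_model Qwen3-VL-4B-Instruct-GPTQ-a16w4 would download the model if needed and then open the interactive prompt.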

Interactive Commands

Once llima run starts in CLI mode, use these commands at the prompt:

Command	Description
`add image <file>`	Add an image to the current prompt context.
`clear image`	Clear all images.
`set system <prompt>`	Set the system prompt.
`clear system`	Clear the system prompt, chat history, and images.
`clear history`	Clear chat history and images.
`print history`	Print chat history.
`set audio <file>`	Set the audio file to transcribe as the query.
`set language <lang>`	Set the language string used for transcription.
`set lora <name>`	Use LoRA weights from an `npy_files` folder.
`unset lora`	Drop the LoRA weights and revert to the baseline model.
`quit`	Quit.
`list command`	Print available commands.
`help`	Print available commands.
