Using the LM Studio Command Line Interface (CLI)

While the LM Studio UI application is convenient for chatting, using LM Studio as a RAG system, and so on, the command line interface (CLI) is worth learning because it is often a much faster way to get work done.

You can refer to the official documentation at https://lmstudio.ai/docs/cli. Here we will look at a few examples:

lms ls

$ lms ls

You have 6 models, taking up 42.23 GB of disk space.

LLMs (Large Language Models)        PARAMS ARCHITECTURE  SIZE
qwen3moe                                                 13.29 GB
qwen3-30b-a3b-instruct-2507-mlx            qwen3_moe     17.19 GB
qwen/qwen3-30b-a3b-2507                    qwen3_moe     17.19 GB
google/gemma-3n-e4b                        gemma3n        5.86 GB  ✓ LOADED
liquid/lfm2-1.2b                           lfm2           1.25 GB

Embedding Models                   PARAMS      ARCHITECTURE          SIZE
text-embedding-nomic-embed-text-v1.5           Nomic BERT        84.11 MB
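Output like this is easy to post-process when scripting against the CLI. As a minimal sketch (the exact column layout may vary between lms versions, so treat the parsing rules as assumptions), the model keys can be pulled out of the listing with a few lines of Python:

```python
def parse_model_keys(ls_output: str) -> list[str]:
    """Extract the first column (the model key) from each model line of `lms ls` output."""
    keys = []
    for line in ls_output.splitlines():
        line = line.strip()
        # Skip blank lines, the summary line, and the section header rows.
        if not line or line.startswith("You have") or "ARCHITECTURE" in line:
            continue
        keys.append(line.split()[0])
    return keys

# A trimmed copy of the lms ls output shown above:
sample = """\
You have 6 models, taking up 42.23 GB of disk space.

LLMs (Large Language Models)        PARAMS ARCHITECTURE  SIZE
google/gemma-3n-e4b                        gemma3n        5.86 GB
liquid/lfm2-1.2b                           lfm2           1.25 GB
"""
print(parse_model_keys(sample))  # ['google/gemma-3n-e4b', 'liquid/lfm2-1.2b']
```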

lms load <model_key>

A model key is the first item displayed on each output line when you run lms ls.

$ lms load google/gemma-3n-e4b

Loading model "google/gemma-3n-e4b"...
Model loaded successfully in 13.59s. (5.86 GB)
To use the model in the API/SDK, use the identifier "google/gemma-3n-e4b:2".
To set a custom identifier, use the --identifier <identifier> option.
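Once a model is loaded, the identifier reported by lms load can be used against LM Studio's OpenAI-compatible local server (by default on port 1234). Here is a minimal sketch, assuming the server is running and a model is loaded; the model string should match the identifier lms load printed for your session:

```python
import json
import urllib.request

LMS_BASE_URL = "http://localhost:1234/v1"  # default lms server address

def build_chat_payload(prompt: str, model: str = "google/gemma-3n-e4b") -> dict:
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,
    }

def chat(prompt: str, model: str = "google/gemma-3n-e4b") -> str:
    """POST a chat completion request to the local LM Studio server, return the reply text."""
    req = urllib.request.Request(
        f"{LMS_BASE_URL}/chat/completions",
        data=json.dumps(build_chat_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# With the server running: print(chat("Say hello in five words."))
```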

lms unload

lms unload takes an optional <model_key>. If you don't specify a model key, you are shown a list of loaded models and can unload one interactively:

$ lms unload

! Use the arrow keys to navigate, type to filter, and press enter to select.
! To unload all models, use the --all flag.

? Select a model to unload | Type to filter...
   qwen3-30b-a3b-instruct-2507-mlx
❯  google/gemma-3n-e4b

lms get

lms get supports searching for models on Hugging Face by name and interactively downloading them. Here is an example:

$ lms get llama-3.2 --mlx --gguf --limit 6
Searching for models with the term llama-3.2
No exact match found. Please choose a model from the list below.

! Use the arrow keys to navigate, and press enter to select.

? Select a model to download (Use arrow keys)
❯ [Staff Pick] Hermes 3 Llama 3.2 3B
  [Staff Pick] Llama 3.2 1B Instruct 4bit
  [Staff Pick] Llama 3.2 3B Instruct 4bit
  [Staff Pick] Llama 3.2 1B
  [Staff Pick] Llama 3.2 3B
  DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF

Server Status and Control

$ lms server status
The server is running on port 1234.
$ lms server stop
Stopped the server on port 1234.
$ lms server start
Starting server...
Success! Server is now running on port 1234
$ lms ps

   LOADED MODELS

Identifier: google/gemma-3n-e4b
  • Type:  LLM
  • Path: google/gemma-3n-e4b
  • Size: 5.86 GB
  • Architecture: gemma3n