Using Ollama From the Command Line

Working with Ollama from the command line is a straightforward and efficient way to interact with large language models locally. The basic command is ollama run modelname, where modelname is a model like 'llama3', 'mistral', or 'codellama'. You can pass a prompt directly as a command line argument, and the --verbose flag shows token usage and generation metrics. If you want a consistent system prompt applied across multiple interactions, define it in a Modelfile and build a custom model with ollama create -f, as shown below.
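Here is a minimal sketch of that workflow; the file name Modelfile is conventional, while the model name coder-helper and the system prompt text are arbitrary choices for illustration:

$ cat Modelfile
FROM llama3
SYSTEM "You are a concise assistant. Prefer short answers with examples."
$ ollama create coder-helper -f Modelfile
$ ollama run coder-helper --verbose "How do I reverse a list in Python?"

Once created, coder-helper behaves like any other local model: it shows up in ollama list and always starts with the system prompt baked in.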

One powerful technique is using Ollama's model tags to maintain different versions or configurations of the same base model. For any model on the Ollama website, you can view all available tags, for example: https://ollama.com/library/llama2/tags.
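As a quick illustration, tags let you pull and run a specific variant instead of the default latest tag (llama2:13b is one of the tags listed on that page):

$ ollama pull llama2:13b
$ ollama run llama2:13b "Explain the difference between a process and a thread."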

The ollama list command helps you track installed models, and ollama rm modelname removes models you no longer need, keeping your system clean. For development work, the --format json flag outputs responses in JSON format, making them easier to parse in scripts or applications; for example:

Using JSON Format

$ ollama run qwq:latest --format json
>>> What are the capitals of Germany and France?
{
  "Germany": {
    "Capital": "Berlin",
    "Population": "83.2 million",
    "Area": "137,847 square miles"
  },
  "France": {
    "Capital": "Paris",
    "Population": "67.4 million",
    "Area": "248,573 square miles"
  }
}

>>> /bye
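Because the response is valid JSON, it is easy to post-process in a script. A small sketch, assuming the jq utility is installed and that the model returns the same structure shown above (the exact keys are not guaranteed from run to run):

$ ollama run qwq:latest --format json \
    "What are the capitals of Germany and France?" | jq -r '.Germany.Capital'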

Analysis of Images

Advanced users can leverage Ollama's multimodal capabilities and streaming options. For models like llava, you can provide image files via standard input or by file path. For example:

$ ollama run llava:7b "Describe this image" markcarol.jpg
The image is a photograph featuring a man and a woman looking off-camera, towards the left side of the frame. In the background, there are indistinct objects that give the impression of an outdoor setting, possibly on a patio or deck.

The focus and composition suggest that the photo was taken during the day in natural light.
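Ollama can also pick up an image path written directly inside the prompt text, which is convenient in scripts; using the same image file as above:

$ ollama run llava:7b "What is in this image? ./markcarol.jpg"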

While I only cover command line use in this one short chapter, I use Ollama in command line mode for several hours a week for software development, usually using a Qwen coding LLM:

$ ollama run qwen2.5-coder:14b
>>> Send a message (/? for help)

I find that the qwen2.5-coder:14b model performs well for my most often used programming languages: Python, Common Lisp, Racket Scheme, and Haskell.

I also enjoy experimenting with the QwQ reasoning model even though it is so large that it barely runs on my 32GB M2 Pro system:

$ ollama run qwq:latest
>>>

Analysis of Source Code Files

Here, assuming we are in the main directory for the GitHub repository for this book, we can ask for an analysis of the tool for using SQLite databases (most output is not shown):

$ ollama run qwen2.5-coder:14b < tool_sqlite.py
This code defines a Python application that interacts with an SQLite database using SQL queries generated by the Ollama language model. The application is structured around two main classes:

1. **SQLiteTool**: Manages interactions with an SQLite database.
   - Handles creating sample data, managing database connections, and executing SQL queries.
   - Provides methods to list tables in the database, get table schemas, and execute arbitrary SQL queries.

2. **OllamaFunctionCaller**: Acts as a bridge between user inputs and the SQLite database through the Ollama model.
   - Defines functions that can be called by the Ollama model (e.g., querying the database or listing tables).
   - Generates prompts for the Ollama model based on user input, parses the response to identify which function should be executed, and then calls the appropriate method in `SQLiteTool`.

...

Unfortunately, when using the command ollama run qwen2.5-coder:14b < tool_sqlite.py, Ollama processes the input from the file and then exits the REPL. There is no built-in way to stay in the REPL after redirecting input from a file. However, if you want to analyze code and then interactively chat about the code, ask for code modifications, etc., you can try the following (a one-shot alternative using the shell is sketched after this list):

  • Start Ollama: ollama run qwen2.5-coder:14b
  • Paste the source code of tool_sqlite.py into the Ollama REPL
  • Ask for advice, for example: “Please add code to print out the number of input and output tokens that are used by Ollama when calling function_caller.process_request(query)”
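If a single response is sufficient, you can instead combine the source file and a question in one prompt using shell command substitution; this sketch assumes a bash or zsh shell, and the session still exits after the answer is printed:

$ ollama run qwen2.5-coder:14b \
    "$(cat tool_sqlite.py) Please suggest code to print the number of input and output tokens used per request."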