Table of Contents
- Preface
- Setting Up Your Computing Environment for Using Ollama and Using Book Example Programs
- Using Ollama From the Command Line
- Short Examples
- LLM Tool Calling with Ollama
- Automatic Evaluation of LLM Results: More Tool Examples
- Building Agents with Ollama and the Hugging Face Smolagents Library
- Using AG2 Open-Source AgentOS LLM-Based Agent Framework for Generating and Executing Python Code
- Building Agents with Ollama and the Agno Agent Library
- Using the Unsloth Library on Google Colab to FineTune Models for Ollama
- Reasoning with Large Language Models
- Using Property Graph Database with Ollama
- Using the Open Codex Command Line Interface Coding Agent
- Long Term Persistence Using Mem0 and Chroma
- Book Wrap Up
Preface
Ollama is an open-source framework that enables users to run large language models (LLMs) locally on their computers, facilitating tasks like text summarization, chatbot development, and more. It supports various models, including Llama 4, Mistral, and Gemma, and offers flexibility in model sizes and quantization options to balance performance and resource usage. Ollama provides a command-line interface and an HTTP API for seamless integration into applications, making advanced AI capabilities accessible without relying on cloud services. Ollama is available on macOS, Linux, and Windows.
A main theme of this book is the advantage of running models privately on either your personal computer or a computer at work. While many commercial LLM API vendors offer options to not reuse your prompt data and the output generated from your prompts to train their systems, there is no better privacy and security than running open weight models on your own hardware.
This book is about running Large Language Models (LLMs) on your own hardware using Ollama. We will be using both the Ollama Python SDK library’s native support for passing text and images to LLMs as well as Ollama’s OpenAI API compatibility layer that lets you take any of the projects you may already run using OpenAI’s APIs and port them easily to run locally on Ollama.
To be clear, dear reader, although I have a strong preference for running smaller LLMs on my own hardware, I also frequently use commercial LLM API vendors like Anthropic, OpenAI, ABACUS.AI, GROQ, and Google to take advantage of features like more advanced models and scalability on cloud-based hardware.
About the Author
I am an AI practitioner and consultant specializing in large language models, LangChain/Llama-Index integrations, deep learning, and the semantic web. I have authored over 20 books on topics including artificial intelligence, Python, Common Lisp, deep learning, Haskell, Clojure, Java, Ruby, the Hy language, and the semantic web. I have 55 U.S. patents. Please check out my home page and social media: my personal web site https://markwatson.com, X/Twitter, my Blog on Blogspot, and my Blog on Substack.
Requests from the Author
This book will always be available to read free online at https://leanpub.com/ollama/read.
That said, I appreciate it when readers purchase my books because the income enables me to spend more time writing.
Hire the Author as a Consultant
I am available for short consulting projects. Please see https://markwatson.com.
Why Should We Care About Privacy?
Running local models using tools like Ollama can enhance privacy when dealing with sensitive data. Let’s delve into why privacy is crucial and how Ollama contributes to improved security.
Why is privacy important?
Privacy is paramount for several reasons:
- Protection from Data Breaches: When data is processed by third-party services, it becomes vulnerable to potential data breaches. Storing and processing data locally minimizes this risk significantly. This is especially critical for sensitive information like personal details, financial records, or proprietary business data.
- Compliance with Regulations: Many industries are subject to stringent data privacy regulations, such as GDPR, HIPAA, and CCPA. Running models locally can help organizations maintain compliance by ensuring data remains under their control.
- Maintaining Confidentiality: For certain applications, like handling legal documents or medical records, maintaining confidentiality is of utmost importance. Local processing ensures that sensitive data isn’t exposed to external parties.
- Data Ownership and Control: Individuals and organizations have a right to control their own data. Local models empower users to maintain ownership and make informed decisions about how their data is used and shared.
- Preventing Misuse: By keeping data local, you reduce the risk of it being misused by third parties for unintended purposes, such as targeted advertising, profiling, or even malicious activities.
Security Improvements with Ollama
Ollama, as a tool for running large language models (LLMs) locally, offers several security advantages:
- Data Stays Local: Ollama allows you to run models on your own hardware, meaning your data never leaves your local environment. This eliminates the need to send data to external servers for processing.
- Reduced Attack Surface: By avoiding external communication for model inference, you significantly reduce the potential attack surface for malicious actors. There’s no need to worry about vulnerabilities in third-party APIs or network security.
- Control over Model Access: With Ollama, you have complete control over who has access to your models and data. This is crucial for preventing unauthorized access and ensuring data security.
- Transparency and Auditability: Running models locally provides greater transparency into the processing pipeline. You can monitor and audit the model’s behavior more easily, ensuring it operates as intended.
- Customization and Flexibility: Ollama allows you to customize your local environment and security settings according to your specific needs. This level of control is often not possible with cloud-based solutions.
It’s important to note that while Ollama enhances privacy and security, it’s still crucial to follow general security best practices for your local environment. This includes keeping your operating system and software updated, using strong passwords, and implementing appropriate firewall rules.
Setting Up Your Computing Environment for Using Ollama and Using Book Example Programs
There is a GitHub repository that I have prepared for you, dear reader, both to support working through the examples in this book and, hopefully, to provide utilities for your own projects.
You need to git clone the following repository:
https://github.com/mark-watson/Ollama_in_Action_Book, which contains tools I have written in Python that you can use with Ollama, as well as utilities I wrote to avoid repeating code in the book examples. There are also application-level example files that have the string "example" in their file names. Tool library files begin with "tool", and files starting with "Agent" contain one of several approaches to writing agents.
Note: the source code repository changed August 10, 2025. If you cloned the old repo, please archive it and re-clone https://github.com/mark-watson/Ollama_in_Action_Book
Note: Starting August 10, 2025, the GitHub repo https://github.com/mark-watson/Ollama_in_Action_Book contains the book’s manuscript files as well as the source code for the examples.
Python Build Tools
The requirements.txt file contains the library requirements for all code developed in this book. My preference is to use venv and maintain a separate Python environment for each of the few hundred Python projects I have on my laptop. I keep a personal directory ~/bin on my PATH, and I use the following script venv_setup.sh in the ~/bin directory, which uses a requirements.txt file to set up a virtual environment:
1 #!/bin/zsh
2
3 # Check if the directory has a requirements.txt file
4 if [ ! -f "requirements.txt" ]; then
5 echo "No requirements.txt file found in the current directory."
6 exit 1
7 fi
8
9 # Create a virtual environment in the venv directory
10 python3 -m venv venv
11
12 # Activate the virtual environment
13 source venv/bin/activate
14
15 # Upgrade pip to the latest version
16 pip3 install --upgrade pip
17
18 # Install dependencies from requirements.txt
19 pip3 install -r requirements.txt
20
21 # Display installed packages
22 pip3 list
23
24 echo "Virtual environment setup complete. Reactivate it with:"
25 echo "source venv/bin/activate"
26 echo ""
I sometimes like to use the much faster uv build and package management tool:
1 uv venv
2 source .venv/bin/activate
3 uv pip install -r requirements.txt
4 uv run ollama_tools_examples.py
There are many other good options like Anaconda, miniconda, poetry, etc.
Using Ollama From the Command Line
Working with Ollama from the command line provides a straightforward and efficient way to interact with large language models locally. The basic command structure starts with ollama run modelname, where modelname could be a model like 'llama3', 'mistral', or 'codellama'. You can add the --verbose flag to see token usage and generation metrics. If you want a consistent system prompt across multiple interactions, you can define a SYSTEM instruction in a Modelfile and build a custom model with ollama create mymodel -f Modelfile.
One powerful technique is using Ollama’s model tags to maintain different versions or configurations of the same base model. For any model on the Ollama web site, you can view all available model tags, for example: https://ollama.com/library/llama2/tags.
The ollama list command helps you track installed models, and ollama rm modelname keeps your system clean. For development work, the --format json flag outputs responses in JSON format, making it easier to parse in scripts or applications; for example:
Using JSON Format
1 $ ollama run qwq:latest --format json
2 >>> What are the capitals of Germany and France?
3 {
4 "Germany": {
5 "Capital": "Berlin",
6 "Population": "83.2 million",
7 "Area": "137,847 square miles"
8 },
9 "France": {
10 "Capital": "Paris",
11 "Population": "67.4 million",
12 "Area": "248,573 square miles"
13 }
14 }
15
16 >>> /bye
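The --format json flag also makes it convenient to call Ollama from scripts. Below is a minimal sketch (my own illustration, not code from the book repository) that runs a one-shot prompt with Python's subprocess module and parses the result; the model name and prompt are just examples, and since models occasionally emit imperfect JSON, real code should handle parse errors as shown.

import json
import subprocess

# Run a single prompt non-interactively and request JSON output.
# The model name and prompt here are only examples.
result = subprocess.run(
    ["ollama", "run", "llama3.2:latest", "--format", "json",
     "What are the capitals of Germany and France? Answer as JSON."],
    capture_output=True, text=True, check=True,
)

try:
    data = json.loads(result.stdout)
    print(data)
except json.JSONDecodeError:
    # Models sometimes emit extra text around the JSON; fall back to raw output.
    print(result.stdout)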
Analysis of Images
Advanced users can leverage Ollama’s multimodal capabilities and streaming options. For models like llava, you can pipe in image files using standard input or file paths. For example:
1 $ ollama run llava:7b "Describe this image" markcarol.jpg
2 The image is a photograph featuring a man and a woman looking
3 off-camera, towards the left side of the frame. In the background, there are indistinct objects that give the impression of an outdoor setting, possibly on a patio or deck.
4
5 The focus and composition suggest that the photo was taken during the day in natural light.
While I only cover command line use in this one short chapter, I use Ollama in command line mode several hours a week for software development, usually using a Qwen coding LLM:
1 $ ollama run qwen2.5-coder:14b
2 >>> Send a message (/? for help)
I find that the qwen2.5-coder:14b model performs well for my most often used programming languages: Python, Common Lisp, Racket Scheme, and Haskell.
I also enjoy experimenting with the QwQ reasoning model even though it is so large it barely runs on my 32G M2 Pro system:
1 $ ollama run qwq:latest
2 >>>
Analysis of Source Code Files
Here, assuming we are in the main directory of the GitHub repository for this book, we can ask for an analysis of the tool for using SQLite databases (most output is not shown):
1 $ ollama run qwen2.5-coder:14b < tool_sqlite.py
2 This code defines a Python application that interacts with an SQLite database using SQL queries
3 generated by the Ollama language model. The application is structured around two main classes:
4
5 1. **SQLiteTool**: Manages interactions with an SQLite database.
6 - Handles creating sample data, managing database connections, and executing SQL queries.
7 - Provides methods to list tables in the database, get table schemas, and execute arbitrary SQL
8 queries.
9
10 2. **OllamaFunctionCaller**: Acts as a bridge between user inputs and the SQLite database through the
11 Ollama model.
12 - Defines functions that can be called by the Ollama model (e.g., querying the database or listing
13 tables).
14 - Generates prompts for the Ollama model based on user input, parses the response to identify which
15 function should be executed, and then calls the appropriate method in `SQLiteTool`.
16
17 ...
Unfortunately, when using the command ollama run qwen2.5-coder:14b < tool_sqlite.py, Ollama processes the input from the file and then exits the REPL. There’s no built-in way to stay in the Ollama REPL. However, if you want to analyze code and then interactively chat about the code, ask for code modifications, etc., you can try:
- Start an interactive Ollama session
- Paste the source code of tool_sqlite.py into the Ollama REPL
- Ask for advice, for example: “Please add code to print out the number of input and output tokens that are used by Ollama when calling function_caller.process_request(query)”
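Alternatively, this interactive workflow can be scripted with the Ollama Python SDK. Here is a minimal sketch, my own illustration, assuming tool_sqlite.py is in the current directory and a local Ollama server is running; it sends the file contents once and then continues the conversation with a follow-up request:

import ollama

# Read the source file we want to discuss with the model.
with open("tool_sqlite.py") as f:
    source = f.read()

messages = [{
    "role": "user",
    "content": "Here is a Python file:\n\n" + source +
               "\n\nPlease summarize what this code does.",
}]
first = ollama.chat(model="qwen2.5-coder:14b", messages=messages)
print(first.message.content)

# Keep the conversation going by appending to the message list.
messages.append({"role": "assistant", "content": first.message.content})
messages.append({
    "role": "user",
    "content": "Please add code to print the number of input and output "
               "tokens used when calling function_caller.process_request(query).",
})
second = ollama.chat(model="qwen2.5-coder:14b", messages=messages)
print(second.message.content)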
Short Examples
Here we look at a few short examples before moving on to the libraries we develop later and to longer application-style example programs that use Ollama to solve more difficult problems.
Using The Ollama Python SDK with Image and Text Prompts
We saw an example of image processing in the last chapter using Ollama command line mode. Here we do the same thing using a short Python script that you can find in the file short_programs/Ollama_sdk_image_example.py:
1 import ollama
2 import base64
3
4 def analyze_image(image_path: str, prompt: str) -> str:
5 # Read and encode the image
6 with open(image_path, 'rb') as img_file:
7 image_data = base64.b64encode(img_file.read()).decode('utf-8')
8
9 try:
10 # Create a stream of responses using the Ollama SDK
11 stream = ollama.generate(
12 model='llava:7b',
13 prompt=prompt,
14 images=[image_data],
15 stream=True
16 )
17
18 # Accumulate the response
19 full_response = ""
20 for chunk in stream:
21 if 'response' in chunk:
22 full_response += chunk['response']
23
24 return full_response
25
26 except Exception as e:
27 return f"Error processing image: {str(e)}"
28
29 def main():
30 image_path = "data/sample.jpg"
31 prompt = "Please describe this image in detail, focusing on the actions of people in the picture."
32
33 result = analyze_image(image_path, prompt)
34 print("Analysis Result:")
35 print(result)
36
37 if __name__ == "__main__":
38 main()
The output may look like the following when you run this example:
1 Analysis Result:
2 The image appears to be a photograph taken inside a room that serves as a meeting or gaming space and capturing an indoor scene where five individuals are engaged in playing a tabletop card game. In the foreground, there is a table with a green surface and multiple items on it, including what looks like playing cards spread out in front of the people seated around it.
3
4 The room has a comfortable and homely feel, with elements like a potted plant in the background on the left, which suggests that this might be a living room or a similar space repurposed for a group activity.
Using the OpenAI Compatibility APIs with Local Models Running on Ollama
If you frequently use the OpenAI APIs for either your own LLM projects or work projects, you might want to simply use the same SDK library from OpenAI but specify a local Ollama REST endpoint:
1 import openai
2 from typing import List, Dict
3
4 class OllamaClient:
5 def __init__(self, base_url: str = "http://localhost:11434/v1"):
6 self.client = openai.OpenAI(
7 base_url=base_url,
8 api_key="fake-key" # Ollama doesn't require authentication locally
9 )
10
11 def chat_with_context(
12 self,
13 system_context: str,
14 user_prompt: str,
15 model: str = "llama3.2:latest",
16 temperature: float = 0.7
17 ) -> str:
18 try:
19 messages = [
20 {"role": "system", "content": system_context},
21 {"role": "user", "content": user_prompt}
22 ]
23
24 response = self.client.chat.completions.create(
25 model=model,
26 messages=messages,
27 temperature=temperature,
28 stream=False
29 )
30
31 return response.choices[0].message.content
32
33 except Exception as e:
34 return f"Error: {str(e)}"
35
36 def chat_conversation(
37 self,
38 messages: List[Dict[str, str]],
39 model: str = "llama2"
40 ) -> str:
41 try:
42 response = self.client.chat.completions.create(
43 model=model,
44 messages=messages,
45 stream=False
46 )
47
48 return response.choices[0].message.content
49
50 except Exception as e:
51 return f"Error: {str(e)}"
52
53 def main():
54 # Initialize the client
55 client = OllamaClient()
56
57 # Example 1: Single interaction with context
58 system_context = """You are a helpful AI assistant with expertise in
59 programming and technology. Provide clear, concise answers."""
60
61 user_prompt = "Explain the concept of recursion in programming."
62
63 response = client.chat_with_context(
64 system_context=system_context,
65 user_prompt=user_prompt,
66 model="llama3.2:latest",
67 temperature=0.7
68 )
69
70 print("Response with context:")
71 print(response)
72 print("\n" + "="*50 + "\n")
73
74 # Example 2: Multi-turn conversation
75 conversation = [
76 {"role": "system", "content": "You are a helpful AI assistant."},
77 {"role": "user", "content": "What is machine learning?"},
78 {"role": "assistant", "content": "Machine learning is a subset of AI that enables systems to learn from data."},
79 {"role": "user", "content": "Can you give me a simple example?"}
80 ]
81
82 response = client.chat_conversation(
83 messages=conversation,
84 model="llama3.2:latest"
85 )
86
87 print("Conversation response:")
88 print(response)
89
90 if __name__ == "__main__":
91 main()
The output might look like the following (the listing is edited for brevity):
1 Response with context:
2 Recursion is a fundamental concept in programming that allows a function or method to call itself repeatedly until it reaches a base case that stops the recursion.
3
4 **What is Recursion?**
5
6 In simple terms, recursion is a programming technique where a function invokes itself as a sub-procedure, repeating the same steps until it solves a problem ...
7
8 **Key Characteristics of Recursion:**
9
10 1. **Base case**: A trivial case that stops the recursion.
11 2. **Recursive call**: The function calls itself with new input or parameters.
12 3. **Termination condition**: When the base case is reached, the recursion terminates.
13
14 **How Recursion Works:**
15
16 Here's an example to illustrate recursion:
17
18 Imagine you have a recursive function `factorial(n)` that calculates the factorial of a number `n`. The function works as follows:
19
20 1. If `n` is 0 or 1 (base case), return 1.
21 2. Otherwise, call itself with `n-1` as input and multiply the result by `n`.
22 3. Repeat step 2 until `n` reaches 0 or 1.
23
24 Here's a simple recursive implementation in Python ...
25
26 **Benefits of Recursion:**
27
28 Recursion offers several advantages:
29
30 * **Divide and Conquer**: Break down complex problems into smaller, more manageable sub-problems.
31 * **Elegant Code**: Recursive solutions can be concise and easy to understand.
32 * **Efficient**: Recursion can avoid explicit loops and reduce memory usage.
33 ...
34
35 In summary, recursion is a powerful technique that allows functions to call themselves repeatedly until they solve a problem. By understanding the basics of recursion and its applications, you can write more efficient and elegant code for complex problems.
36
37 ==================================================
38
39 Conversation response:
40 A simple example of machine learning is a spam filter.
41
42 Imagine we have a system that scans emails and identifies whether they are spam or not. The system learns to classify these emails as spam or not based on the following steps:
43
44 1. Initially, it receives a large number of labeled data points (e.g., 1000 emails), where some emails are marked as "spam" and others as "not spam".
45 2. The system analyzes these examples to identify patterns and features that distinguish spam emails from non-spam messages.
46 3. Once the patterns are identified, the system can use them to classify new, unseen email data (e.g., a new email) as either spam or not spam.
47
48 Over time, the system becomes increasingly accurate in its classification because it has learned from the examples and improvements have been made. This is essentially an example of supervised machine learning, where the system learns by being trained on labeled data.
In the next chapter we start developing tools that can be used for “function calling” with Ollama.
LLM Tool Calling with Ollama
There are several example Python tool utilities in the source-code directory of the GitHub repository https://github.com/mark-watson/Ollama_in_Action_Book that we will use for function calling. Their file names start with the "tool" prefix:
1 https://github.com/mark-watson/Ollama_in_Action_Book/source-code $ ls tool*
2 tool_file_contents.py tool_llm_eval.py tool_web_search.py
3 tool_file_dir.py tool_sqlite.py
4 tool_judge_results.py tool_summarize_text.py
We postpone using the tools tool_llm_eval.py and tool_judge_results.py until the next chapter, Automatic Evaluation of LLM Results.
If you have not yet done so, please clone the repository for my Ollama book examples using:
1 git clone https://github.com/mark-watson/Ollama_in_Action_Book.git
Use of Python docstrings at runtime:
The Ollama Python SDK leverages docstrings as a crucial part of its runtime function calling mechanism. When defining functions that will be called by the LLM, the docstrings serve as structured metadata that gets parsed and converted into a JSON schema format. This schema describes the function’s parameters, their types, and expected behavior, which is then used by the model to understand how to properly invoke the function. The docstrings follow a specific format that includes parameter descriptions, type hints, and return value specifications, allowing the SDK to automatically generate the necessary function signatures that the LLM can understand and work with.
During runtime execution, when the LLM determines it needs to call a function, it first reads these docstring-derived schemas to understand the function’s interface. The SDK parses these docstrings using Python’s introspection capabilities (through the inspect module) and matches the LLM’s intended function call with the appropriate implementation. This system allows for a clean separation between the function’s implementation and its interface description, while maintaining human-readable documentation that serves as both API documentation and runtime function calling specifications. The docstring parsing is done lazily at runtime when the function is first accessed, and the resulting schema is typically cached to improve performance in subsequent calls.
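To make this concrete, here is a simplified sketch of how a function's signature and docstring can be turned into a schema-like description using the inspect module. This is my own illustration of the general idea, not the Ollama SDK's actual implementation:

import inspect
from typing import get_type_hints

def schema_from_function(fn) -> dict:
    # Build a minimal, schema-like description of fn from its signature
    # and docstring (a simplified illustration, not the SDK's real code).
    hints = get_type_hints(fn)
    parameters = {
        name: {"type": hints.get(name, str).__name__}
        for name in inspect.signature(fn).parameters
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": parameters,
    }

def read_file_contents(file_path: str, encoding: str = "utf-8") -> str:
    """Reads contents from a file and returns the text"""
    return ""

print(schema_from_function(read_file_contents))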
Example Showing the Use of Tools Developed Later in this Chapter
The source file ollama_tools_examples.py contains simple examples of using the tools we develop later in this chapter. We will look at example code using the tools, then at the implementation of the tools. In this example source file we first import these tools:
1 from tool_file_dir import list_directory
2 from tool_file_contents import read_file_contents
3 from tool_web_search import uri_to_markdown
4
5 import ollama
6
7 # Map function names to function objects
8 available_functions = {
9 'list_directory': list_directory,
10 'read_file_contents': read_file_contents,
11 'uri_to_markdown': uri_to_markdown,
12 }
13
14 # User prompt
15 user_prompt = "Please list the contents of the current directory, read the 'requirements.txt' file, and convert 'https://markwatson.com' to markdown."
16
17 # Initiate chat with the model
18 response = ollama.chat(
19 model='llama3.1',
20 messages=[{'role': 'user', 'content': user_prompt}],
21 tools=[list_directory, read_file_contents, uri_to_markdown],
22 )
23
24 # Process the model's response
25 for tool_call in response.message.tool_calls or []:
26 function_to_call = available_functions.get(tool_call.function.name)
27 print(f"{function_to_call=}")
28 if function_to_call:
29 result = function_to_call(**tool_call.function.arguments)
30 print(f"Output of {tool_call.function.name}: {result}")
31 else:
32 print(f"Function {tool_call.function.name} not found.")
This code demonstrates the integration of a local LLM with custom tool functions for file system operations and web content processing. It imports three utility functions for listing directories, reading file contents, and converting URLs to markdown, then maps them to a dictionary for easy access.
The main execution flow involves sending a user prompt to the Ollama-hosted model (here we are using the llama3.1 model), which requests directory listing, file reading, and URL conversion operations. The code then processes the model’s response by iterating through any tool calls returned, executing the corresponding functions, and printing their results. Error handling is included for cases where requested functions aren’t found in the available tools dictionary.
Here is sample output from using these three tools (most output removed for brevity and blank lines added for clarity):
1 $ python ollama_tools_examples.py
2
3 function_to_call=<function read_file_contents at 0x104fac9a0>
4
5 Output of read_file_contents: {'content': 'git+https://github.com/mark-watson/Ollama_Tools.git\nrequests\nbeautifulsoup4\naisuite[ollama]\n\n', 'size': 93, 'exists': True, 'error': None}
6
7 function_to_call=<function list_directory at 0x1050389a0>
8 Output of list_directory: {'files': ['.git', '.gitignore', 'LICENSE', 'Makefile', 'README.md', 'ollama_tools_examples.py', 'requirements.txt', 'venv'], 'count': 8, 'current_dir': '/Users/markw/GITHUB/Ollama-book-examples', 'error': None}
9
10 function_to_call=<function uri_to_markdown at 0x105038c20>
11
12 Output of uri_to_markdown: {'content': 'Read My Blog on Blogspot\n\nRead My Blog on Substack\n\nConsulting\n\nFree Mentoring\n\nFun stuff\n\nMy Books\n\nOpen Source\n\n Privacy Policy\n\n# Mark Watson AI Practitioner and Consultant Specializing in Large Language Models, LangChain/Llama-Index Integrations, Deep Learning, and the Semantic Web\n\n# I am the author of 20+ books on Artificial Intelligence, Python, Common Lisp, Deep Learning, Haskell, Clojure, Java, Ruby, Hy language, and the Semantic Web. I have 55 US Patents.\n\nMy customer list includes: Google, Capital One, Babylist, Olive AI, CompassLabs, Mind AI, Disney, SAIC, Americast, PacBell, CastTV, Lutris Technology, Arctan Group, Sitescout.com, Embed.ly, and Webmind Corporation.
13
14 ...
15
16 # Fun stuff\n\nIn addition to programming and writing my hobbies are cooking,\n photography, hiking, travel, and playing the following musical instruments: guitar, didgeridoo, and American Indian flute:\n\nMy guitar playing: a boogie riff\n\nMy didgeridoo playing\n\nMy Spanish guitar riff\n\nPlaying with George (didgeridoo), Carol and Crystal (drums and percussion) and Mark (Indian flute)\n\n# Open Source\n\nMy Open Source projects are hosted on my github account so please check that out!
17
18 ...
19
20 Hosted on Cloudflare Pages\n\n © Mark Watson 1994-2024\n\nPrivacy Policy', 'title': 'Mark Watson: AI Practitioner and Author of 20+ AI Books | Mark Watson', 'error': None}
Please note that the text extracted from a web page is mostly plain text. Section headings are maintained, but they are converted to markdown format. In the last listing (edited for brevity), the HTML H1 element with the text "Fun stuff" is converted to markdown:
1 # Fun stuff
2
3 In addition to programming and writing my hobbies are cooking,
4 photography, hiking, travel, and playing the following musical
5 instruments: guitar, didgeridoo, and American Indian flute ...
You have now seen example tool use. We will now implement several tools in this chapter and the next. We will look at the first tool, for reading and writing files, in fine detail and then more briefly discuss the other tools in the source-code directory of the https://github.com/mark-watson/Ollama_in_Action_Book repository.
Tool for Reading and Writing File Contents
This tool is meant to be combined with other tools; for example, a file reading tool and a summarization tool might be chained to handle a user prompt that asks to summarize a specific local file on your laptop.
Here are the contents of the tool utility tool_file_contents.py:
1 """
2 Provides functions for reading and writing file contents with proper error handling
3 """
4
5 from pathlib import Path
6 import json
7
8
9 def read_file_contents(file_path: str, encoding: str = "utf-8") -> str:
10 """
11 Reads contents from a file and returns the text
12
13 Args:
14 file_path (str): Path to the file to read
15 encoding (str): File encoding to use (default: utf-8)
16
17 Returns:
18 Contents of the file as a string
19 """
20 try:
21 path = Path(file_path)
22 if not path.exists():
23 return f"File not found: {file_path}"
24
25 with path.open("r", encoding=encoding) as f:
26 content = f.read()
27 return f"Contents of file '{file_path}' is:\n{content}\n"
28
29 except Exception as e:
30 return f"Error reading file '{file_path}' is: {str(e)}"
31
32
33 def write_file_contents(
34 file_path: str, content: str,
35 encoding: str = "utf-8",
36 mode: str = "w") -> str:
37 """
38 Writes content to a file and returns operation status
39
40 Args:
41 file_path (str): Path to the file to write
42 content (str): Content to write to the file
43 encoding (str): File encoding to use (default: utf-8)
44 mode (str): Write mode ('w' for write, 'a' for append)
45
46 Returns:
47 a message string
48 """
49 try:
50 path = Path(file_path)
51
52 # Create parent directories if they don't exist
53 path.parent.mkdir(parents=True, exist_ok=True)
54
55 with path.open(mode, encoding=encoding) as f:
56 bytes_written = f.write(content)
57
58 return f"File '{file_path}' written OK."
59
60 except Exception as e:
61 return f"Error writing file '{file_path}': {str(e)}"
62
63
64 # Function metadata for Ollama integration
65 read_file_contents.metadata = {
66 "name": "read_file_contents",
67 "description": "Reads contents from a file and returns the content as a string",
68 "parameters": {"file_path": "Path to the file to read"},
69 }
70
71 write_file_contents.metadata = {
72 "name": "write_file_contents",
73 "description": "Writes content to a file and returns operation status",
74 "parameters": {
75 "file_path": "Path to the file to write",
76 "content": "Content to write to the file",
77 "encoding": "File encoding (default: utf-8)",
78 "mode": 'Write mode ("w" for write, "a" for append)',
79 },
80 }
81
82 # Export the functions
83 __all__ = ["read_file_contents", "write_file_contents"]
read_file_contents
This function provides file reading capabilities with robust error handling. Its parameters are:
- file_path (str): Path to the file to read
- encoding (str, optional): File encoding (defaults to “utf-8”)
Features:
- Uses pathlib.Path for cross-platform path handling
- Checks file existence before attempting to read
- Returns file contents with descriptive message
- Comprehensive error handling
LLM Integration:
- Includes metadata for function discovery
- Returns descriptive string responses instead of raising exceptions
write_file_contents
This function handles file writing operations with built-in safety features. The parameters are:
- file_path (str): Path to the file to write
- content (str): Content to write to the file
- encoding (str, optional): File encoding (defaults to “utf-8”)
- mode (str, optional): Write mode (‘w’ for write, ‘a’ for append)
Features:
- Automatically creates parent directories
- Supports write and append modes
- Uses context managers for safe file handling
- Returns operation status messages
LLM Integration:
- Includes detailed metadata for function calling
- Provides clear feedback about operations
Common Features of both functions:
- Type hints for better code clarity
- Detailed docstrings that are used at runtime by the tool/function calling code. The text in the docstrings is supplied as context to the LLM currently in use.
- Proper error handling
- UTF-8 default encoding
- Context managers for file operations
- Metadata for LLM function discovery
Design Benefits for LLM Integration: the utilities are optimized for LLM function calling by:
- Returning descriptive string responses
- Including metadata for function discovery
- Handling errors gracefully
- Providing clear operation feedback
- Using consistent parameter patterns
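Here is a short usage sketch for these two functions, both called directly and registered as Ollama tools, following the same pattern as ollama_tools_examples.py. The file path and model name are only example assumptions:

import ollama
from tool_file_contents import read_file_contents, write_file_contents

# Direct calls (the file path is only an example):
print(write_file_contents("data/notes.txt", "Ollama tool test\n"))
print(read_file_contents("data/notes.txt"))

# Registered as tools, letting the model decide which function to call:
available_functions = {
    "read_file_contents": read_file_contents,
    "write_file_contents": write_file_contents,
}
response = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user",
               "content": "Read the file 'data/notes.txt' and repeat its contents."}],
    tools=[read_file_contents, write_file_contents],
)
for tool_call in response.message.tool_calls or []:
    function_to_call = available_functions.get(tool_call.function.name)
    if function_to_call:
        print(function_to_call(**tool_call.function.arguments))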
Tool for Getting File Directory Contents
This tool is similar to the previous one, so here we just list the worker function from the file tool_file_dir.py:
1 def list_directory(pattern: str = "*", list_dots=None) -> str:
2 """
3 Lists files and directories in the current working directory
4
5 Args:
6 pattern (str): Glob pattern for filtering files (default: "*")
7 list_dots: include hidden "dot" files when truthy (default: None)
8 Returns:
9 string with directory name, followed by list of files in the directory
10 """
11 try:
12 current_dir = Path.cwd()
13 files = list(current_dir.glob(pattern))
14
15 # Convert Path objects to strings and sort
16 file_list = sorted([str(f.name) for f in files])
17
18 file_list = [file for file in file_list if not file.endswith("~")]
19 if not list_dots:
20 file_list = [file for file in file_list if not file.startswith(".")]
21
22 return f"Contents of current directory: [{', '.join(file_list)}]"
23
24 except Exception as e:
25 return f"Error listing directory: {str(e)}"
Tool for Accessing SQLite Databases Using Natural Language Queries
The example file tool_sqlite.py serves two purposes here:
- Test and example code: the utility function _create_sample_data creates sample data for several database tables, and the function main serves as an example program.
- The Python class definitions SQLiteTool and OllamaFunctionCaller are meant to be copied and used in your applications.
1 import sqlite3
2 import json
3 from typing import Dict, Any, List, Optional
4 import ollama
5 from functools import wraps
6 import re
7 from contextlib import contextmanager
8 from textwrap import dedent # for multi-line string literals
9
10 class DatabaseError(Exception):
11 """Custom exception for database operations"""
12 pass
13
14
15 def _create_sample_data(cursor): # Helper function to create sample data
16 """Create sample data for tables"""
17 sample_data = {
18 'example': [
19 ('Example 1', 10.5),
20 ('Example 2', 25.0)
21 ],
22 'users': [
23 ('Bob', 'bob@example.com'),
24 ('Susan', 'susan@test.net')
25 ],
26 'products': [
27 ('Laptop', 1200.00),
28 ('Keyboard', 75.50)
29 ]
30 }
31
32 for table, data in sample_data.items():
33 for record in data:
34 if table == 'example':
35 cursor.execute(
36 "INSERT INTO example (name, value) VALUES (?, ?) ON CONFLICT DO NOTHING",
37 record
38 )
39 elif table == 'users':
40 cursor.execute(
41 "INSERT INTO users (name, email) VALUES (?, ?) ON CONFLICT DO NOTHING",
42 record
43 )
44 elif table == 'products':
45 cursor.execute(
46 "INSERT INTO products (product_name, price) VALUES (?, ?) ON CONFLICT DO NOTHING",
47 record
48 )
49
50
51 class SQLiteTool:
52 _instance = None
53
54 def __new__(cls, *args, **kwargs):
55 if not isinstance(cls._instance, cls):
56 cls._instance = super(SQLiteTool, cls).__new__(cls)
57 return cls._instance
58
59 def __init__(self, default_db: str = "test.db"):
60 if hasattr(self, 'default_db'): # Skip initialization if already done
61 return
62 self.default_db = default_db
63 self._initialize_database()
64
65 @contextmanager
66 def get_connection(self):
67 """Context manager for database connections"""
68 conn = sqlite3.connect(self.default_db)
69 try:
70 yield conn
71 finally:
72 conn.close()
73
74 def _initialize_database(self):
75 """Initialize database with tables"""
76 tables = {
77 'example': """
78 CREATE TABLE IF NOT EXISTS example (
79 id INTEGER PRIMARY KEY,
80 name TEXT,
81 value REAL
82 );
83 """,
84 'users': """
85 CREATE TABLE IF NOT EXISTS users (
86 id INTEGER PRIMARY KEY,
87 name TEXT,
88 email TEXT UNIQUE
89 );
90 """,
91 'products': """
92 CREATE TABLE IF NOT EXISTS products (
93 id INTEGER PRIMARY KEY,
94 product_name TEXT,
95 price REAL
96 );
97 """
98 }
99
100 with self.get_connection() as conn:
101 cursor = conn.cursor()
102 for table_sql in tables.values():
103 cursor.execute(table_sql)
104 conn.commit()
105 _create_sample_data(cursor)
106 conn.commit()
107
108 def get_tables(self) -> List[str]:
109 """Get list of tables in the database"""
110 with self.get_connection() as conn:
111 cursor = conn.cursor()
112 cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
113 return [table[0] for table in cursor.fetchall()]
114
115 def get_table_schema(self, table_name: str) -> List[tuple]:
116 """Get schema for a specific table"""
117 with self.get_connection() as conn:
118 cursor = conn.cursor()
119 cursor.execute(f"PRAGMA table_info({table_name});")
120 return cursor.fetchall()
121
122 def execute_query(self, query: str) -> List[tuple]:
123 """Execute a SQL query and return results"""
124 with self.get_connection() as conn:
125 cursor = conn.cursor()
126 try:
127 cursor.execute(query)
128 return cursor.fetchall()
129 except sqlite3.Error as e:
130 raise DatabaseError(f"Query execution failed: {str(e)}")
131
132 class OllamaFunctionCaller:
133 def __init__(self, model: str = "llama3.2:latest"):
134 self.model = model
135 self.sqlite_tool = SQLiteTool()
136 self.function_definitions = self._get_function_definitions()
137
138 def _get_function_definitions(self) -> Dict:
139 return {
140 "query_database": {
141 "description": "Execute a SQL query on the database",
142 "parameters": {
143 "type": "object",
144 "properties": {
145 "query": {
146 "type": "string",
147 "description": "The SQL query to execute"
148 }
149 },
150 "required": ["query"]
151 }
152 },
153 "list_tables": {
154 "description": "List all tables in the database",
155 "parameters": {
156 "type": "object",
157 "properties": {}
158 }
159 }
160 }
161
162 def _generate_prompt(self, user_input: str) -> str:
163 prompt = dedent(f"""
164 You are a SQL assistant. Based on the user's request, generate a JSON response that calls the appropriate function.
165 Available functions: {json.dumps(self.function_definitions, indent=2)}
166
167 User request: {user_input}
168
169 Respond with a JSON object containing:
170 - "function": The function name to call
171 - "parameters": The parameters for the function
172
173 Response:
174 """).strip()
175 return prompt
176
177 def _parse_ollama_response(self, response: str) -> Dict[str, Any]:
178 try:
179 json_match = re.search(r'\{.*\}', response, re.DOTALL)
180 if not json_match:
181 raise ValueError("No valid JSON found in response")
182 return json.loads(json_match.group())
183 except json.JSONDecodeError as e:
184 raise ValueError(f"Invalid JSON in response: {str(e)}")
185
186 def process_request(self, user_input: str) -> Any:
187 try:
188 response = ollama.generate(model=self.model, prompt=self._generate_prompt(user_input))
189 function_call = self._parse_ollama_response(response.response)
190
191 if function_call["function"] == "query_database":
192 return self.sqlite_tool.execute_query(function_call["parameters"]["query"])
193 elif function_call["function"] == "list_tables":
194 return self.sqlite_tool.get_tables()
195 else:
196 raise ValueError(f"Unknown function: {function_call['function']}")
197 except Exception as e:
198 raise RuntimeError(f"Request processing failed: {str(e)}")
199
200 def main():
201 function_caller = OllamaFunctionCaller()
202 queries = [
203 "Show me all tables in the database",
204 "Get all users from the users table",
205 "What are the top 5 products by price?"
206 ]
207
208 for query in queries:
209 try:
210 print(f"\nQuery: {query}")
211 result = function_caller.process_request(query)
212 print(f"Result: {result}")
213 except Exception as e:
214 print(f"Error processing query: {str(e)}")
215
216 if __name__ == "__main__":
217 main()
This code provides a natural language interface for interacting with an SQLite database. It uses a combination of Python classes, SQLite, and Ollama for running a language model to interpret user queries and execute corresponding database operations. Below is a breakdown of the code:
- Database Setup and Error Handling: a custom exception class, DatabaseError, is defined to handle database-specific errors. The database is initialized with three tables: example, users, and products. These tables are populated with sample data for demonstration purposes.
- SQLiteTool Class: the SQLiteTool class is a singleton that manages all SQLite database operations. Key features include:
- Singleton Pattern: Ensures only one instance of the class is created.
- Database Initialization: Creates tables (example, users, products) if they do not already exist.
- Sample Data: Populates the tables with predefined sample data.
- Context Manager: Safely manages database connections using a context manager.
Utility Methods:
- get_tables: Retrieves a list of all tables in the database.
- get_table_schema: Retrieves the schema of a specific table.
- execute_query: Executes a given SQL query and returns the results.
Sample Data Creation:
A helper function, _create_sample_data, is used to populate the database with sample data. It inserts records into the example, users, and products tables. This ensures the database has some initial data for testing and demonstration.
OllamaFunctionCaller Class:
The OllamaFunctionCaller class acts as the interface between natural language queries and database operations. Key components include:
- Integration with Ollama LLM: Uses the Ollama language model to interpret natural language queries.
- Function Definitions: Defines two main functions:
- query_database: Executes SQL queries on the database.
- list_tables: Lists all tables in the database.
- Prompt Generation: Converts user input into a structured prompt for the language model.
- Response Parsing: Parses the language model’s response into a JSON object that specifies the function to call and its parameters.
- Request Processing: Executes the appropriate database operation based on the parsed response.
Function Definitions:
The OllamaFunctionCaller class defines two main functions that can be called based on user input:
- query_database: Executes a SQL query provided by the user and returns the results of the query.
- list_tables: Lists all tables in the database and is useful for understanding the database structure.
Request Processing Workflow:
The process_request method handles the entire workflow of processing a user query:
- Input: Takes a natural language query from the user.
- Prompt Generation: Converts the query into a structured prompt for the Ollama language model.
- Response Parsing: Parses the language model’s response into a JSON object.
- Function Execution: Calls the appropriate function (query_database or list_tables) based on the parsed response.
- Output: Returns the results of the database operation.
Main test/example function:
The main function demonstrates how the system works with sample queries. It initializes the OllamaFunctionCaller and processes a list of example queries, such as:
- "Show me all tables in the database."
- "Get all users from the users table."
- "What are the top 5 products by price?"
For each query, the system interprets the natural language input, executes the corresponding database operation, and prints the results.
Summary:
This code creates a natural language interface for interacting with an SQLite database. It works as follows:
- Database Management: The SQLiteTool class handles all database operations, including initialization, querying, and schema inspection.
- Natural Language Processing: The OllamaFunctionCaller uses the Ollama language model to interpret user queries and map them to database functions.
- Execution: The system executes the appropriate database operation and returns the results to the user.
This approach allows users to interact with the database using natural language instead of writing SQL queries directly, making it more user-friendly and accessible.
The output looks like this:
1 python /Users/markw/GITHUB/Ollama_in_Action_Book/source-code/tool_sqlite.py
2
3 Query: Show me all tables in the database
4 Result: ['example', 'users', 'products']
5
6 Query: Get all users from the users table
7 Result: [(1, 'Bob', 'bob@example.com'), (2, 'Susan', 'susan@test.net')]
8
9 Query: What are the top 5 products by price?
10 Result: [(1, 'Laptop', 1200.0), (3, 'Laptop', 1200.0), (2, 'Keyboard', 75.5), (4, 'Keyboard', 75.5)]
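To reuse these classes in your own application, you only need an OllamaFunctionCaller instance and its process_request method. Here is a minimal sketch; the question is just an example, and, as with any LLM-generated SQL, results should be checked before being trusted:

from tool_sqlite import OllamaFunctionCaller

# The constructor also creates and populates test.db via SQLiteTool.
caller = OllamaFunctionCaller(model="llama3.2:latest")

# Natural language in, rows (or a table list) out.
print(caller.process_request("Which products cost more than 100 dollars?"))
print(caller.process_request("Show me all tables in the database"))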
Tool for Summarizing Text
Tools that are used by LLMs can themselves also use other LLMs. The tool defined in the file tool_summarize_text.py might be triggered by a user prompt such as "summarize the text in local file test1.txt" or "summarize text from web page https://markwatson.com", where it is chained with other tools that read a local file's contents, fetch a web page, etc.
We will start by looking at the file tool_summarize_text.py and then look at an example in example_chain_web_summary.py.
1 """
2 Summarize text
3 """
4
5 from ollama import ChatResponse
6 from ollama import chat
7
8
9 def summarize_text(text: str, context: str = "") -> str:
10 """
11 Summarizes text
12
13 Parameters:
14 text (str): text to summarize
15 context (str): another tool's output can be used, at the application layer, to set the context for this tool.
16
17 Returns:
18 a string of summarized text
19
20 """
21 prompt = "Summarize this text (and be concise), returning only the summary with NO OTHER COMMENTS:\n\n"
22 if len(text.strip()) < 50:
23 text = context
24 elif len(context) > 50:
25 prompt = f"Given this context:\n\n{context}\n\n" + prompt
26
27 summary: ChatResponse = chat(
28 model="llama3.2:latest",
29 messages=[
30 {"role": "system", "content": prompt},
31 {"role": "user", "content": text},
32 ],
33 )
34 return summary["message"]["content"]
35
36
37 # Function metadata for Ollama integration
38 summarize_text.metadata = {
39 "name": "summarize_text",
40 "description": "Summarizes input text",
41 "parameters": {"text": "string of text to summarize",
42 "context": "optional context string"},
43 }
44
45 # Export the functions
46 __all__ = ["summarize_text"]
This Python code implements a text summarization tool using the Ollama chat model. The core function summarize_text takes two parameters: the main text to summarize and an optional context string. The function operates by constructing a prompt that instructs the model to provide a concise summary without additional commentary. It includes logic where, if the input text is very short (less than 50 characters), it defaults to using the context parameter instead; additionally, if substantial context is provided (more than 50 characters), it prepends this context to the prompt. The function uses the Ollama chat model "llama3.2:latest" to generate the summary, structuring the request with a system message containing the prompt and a user message containing the text to be summarized. The program includes metadata for Ollama integration, specifying the function name, description, and parameter details, and exports the summarize_text function through __all__.
Here is an example of using this tool that you can find in the file example_chain_web_summary.py. Please note that this example also uses the web search tool that is discussed in the next section.
1 from tool_web_search import uri_to_markdown
2 from tool_summarize_text import summarize_text
3
4 from pprint import pprint
5
6 import ollama
7
8 # Map function names to function objects
9 available_functions = {
10 "uri_to_markdown": uri_to_markdown,
11 "summarize_text": summarize_text,
12 }
13
14 memory_context = ""
15 # User prompt
16 user_prompt = "Get the text of 'https://knowledgebooks.com' and then summarize the text."
17
18 # Initiate chat with the model
19 response = ollama.chat(
20 model='llama3.2:latest',
21 messages=[{"role": "user", "content": user_prompt}],
22 tools=[uri_to_markdown, summarize_text],
23 )
24
25 # Process the model's response
26
27 pprint(response.message.tool_calls)
28
29 for tool_call in response.message.tool_calls or []:
30 function_to_call = available_functions.get(tool_call.function.name)
31 print(
32 f"\n***** {function_to_call=}\n\nmemory_context[:70]:\n\n{memory_context[:70]}\n\n*****\n"
33 )
34 if function_to_call:
35 print()
36 if len(memory_context) > 10:
37 tool_call.function.arguments["context"] = memory_context
38 print("\n* * tool_call.function.arguments:\n")
39 pprint(tool_call.function.arguments)
40 print(f"Arguments for {function_to_call.__name__}: {tool_call.function.arguments}")
41 result = function_to_call(**tool_call.function.arguments) # , memory_context)
42 print(f"\n\n** Output of {tool_call.function.name}: {result}")
43 memory_context = memory_context + "\n\n" + result
44 else:
45 print(f"\n\n** Function {tool_call.function.name} not found.")
Here is the output edited for brevity:
1 python /Users/markw/GITHUB/Ollama_in_Action_Book/source-code/example_chain_web_summary.py
2 [ToolCall(function=Function(name='uri_to_markdown', arguments={'a_uri': 'https://knowledgebooks.com'})),
3 ToolCall(function=Function(name='summarize_text', arguments={'context': '', 'text': 'uri_to_markdown(a_uri = "https://knowledgebooks.com")'}))]
4
5 ***** function_to_call=<function uri_to_markdown at 0x1047da200>
6
7 memory_context[:70]:
8
9
10
11 *****
12
13
14
15 * * tool_call.function.arguments:
16
17 {'a_uri': 'https://knowledgebooks.com'}
18 Arguments for uri_to_markdown: {'a_uri': 'https://knowledgebooks.com'}
19 INFO:httpx:HTTP Request: POST http://127.0.0.1:11434/api/chat "HTTP/1.1 200 OK"
20
21
22 ** Output of uri_to_markdown: Contents of URI https://knowledgebooks.com is:
23 # KnowledgeBooks.com - research on the Knowledge Management, and the Semantic Web
24
25 KnowledgeBooks.com - research on the Knowledge Management, and the Semantic Web
26
27 KnowledgeBooks.com
28
29 Knowledgebooks.com
30 a sole proprietorship company owned by Mark Watson
31 to promote Knowledge Management, Artificial Intelligence (AI), NLP, and Semantic Web technologies.
32
33 Site updated: December 1, 2018
34 With the experience of working on Machine Learning and Knowledge Graph applications for 30 years (at Google,
35 Capital One, SAIC, Compass Labs, etc.) I am now concerned that the leverage of deep learning and knowledge
36 representation technologies are controlled by a few large companies, mostly in China and the USA. I am proud
37 to be involved organizations like Ocean Protocol and Common Crawl that seek tp increase the availability of quality data
38 to individuals and smaller organizations.
39 Traditional knowledge management tools relied on structured data often stored in relational databases. Adding
40 new relations to this data would require changing the schemas used to store data which could negatively
41 impact exisiting systems that used that data. Relationships between data in traditional systems was
42 predefined by the structure/schema of stored data. With RDF and OWL based data modeling, relationships in
43 data are explicitly defined in the data itself. Semantic data is inherently flexible and extensible: adding
44 new data and relationships is less likely to break older systems that relied on the previous verisons of
45 data.
46 A complementary technology for knowledge management is the automated processing of unstructured text data
47 into semantic data using natural language processing (NLP) and statistical-base text analytics.
48 We will help you integrate semantic web and text analytics technologies into your organization by working
49 with your staff in a mentoring role and also help as needed with initial development. All for reasonable consulting rates
50 Knowledgebooks.com Technologies:
51
52 SAAS KnowledgeBooks Semantic NLP Portal (KBSportal.com) used for
53 in-house projects and available as a product to run on your servers.
54 Semantic Web Ontology design and development
55 Semantic Web application design and development using RDF data stores, PostgreSQL, and MongoDB.
56
57 Research
58 Natural Language Processing (NLP) using deep learning
59 Fusion of classic symbolic AI systems with deep learning models
60 Linked data, semantic web, and Ontology's
61 News ontology
62 Note: this ontology was created in 2004 using the Protege modeling tool.
63 About
64 KnowledgeBooks.com is owned as a sole proprietor business by Mark and Carol Watson.
65 Mark Watson is an author of 16 published books and a consultant specializing in the JVM platform
66 (Java, Scala, JRuby, and Clojure), artificial intelligence, and the Semantic Web.
67 Carol Watson helps prepare training data and serves as the editor for Mark's published books.
68 Privacy policy: this site collects no personal data or information on site visitors
69 Hosted on Cloudflare Pages.
70
71
72 ***** function_to_call=<function summarize_text at 0x107519260>
73
74 memory_context[:70]:
75
76
77
78 Contents of URI https://knowledgebooks.com is:
79 # KnowledgeBooks.com
80
81 *****
82
83
84
85 * * tool_call.function.arguments:
86
87 {'context': '\n'
88 '\n'
89 'Contents of URI https://knowledgebooks.com is:\n'
90 '# KnowledgeBooks.com - research on the Knowledge Management, and '
91 'the Semantic Web \n'
92 '\n'
93 'KnowledgeBooks.com - research on the Knowledge Management, and '
94 ...
95 'Carol Watson helps prepare training data and serves as the editor '
96 "for Mark's published books.\n"
97 'Privacy policy: this site collects no personal data or '
98 'information on site visitors\n'
99 'Hosted on Cloudflare Pages.\n',
100 'text': 'uri_to_markdown(a_uri = "https://knowledgebooks.com")'}
101 Arguments for summarize_text: {'context': "\n\nContents of URI https://knowledgebooks.com is:\n# KnowledgeBooks.com - research on the Knowledge Management, and the Semantic Web \n\nKnowledgeBooks.com - research on the Knowledge Management, and the Semantic Web \n\nKnowledgeBooks.com \n\nKnowledgebooks.com \na sole proprietorship company owned by Mark Watson\nto promote Knowledge Management, Artificial Intelligence (AI), NLP, and Semantic Web technologies.
102
103 ...
104
105 \n\nResearch\nNatural Language Processing (NLP) using deep learning\nFusion of classic symbolic AI systems with deep learning models\nLinked data, semantic web, and Ontology's\nNews ontology\nNote: this ontology was created in 2004 using the Protege modeling tool.\nAbout\nKnowledgeBooks.com is owned as a sole proprietor business by Mark and Carol Watson.\nMark Watson is an author of 16 published books and a consultant specializing in the JVM platform\n (Java, Scala, JRuby, and Clojure), artificial intelligence, and the Semantic Web.\nCarol Watson helps prepare training data and serves as the editor for Mark's published books.\nPrivacy policy: this site collects no personal data or information on site visitors\nHosted on Cloudflare Pages.\n", 'text': 'uri_to_markdown(a_uri = "https://knowledgebooks.com")'}
106
107
108 ** Output of summarize_text: # Knowledge Management and Semantic Web Research
109 ## About KnowledgeBooks.com
110 A sole proprietorship company by Mark Watson promoting AI, NLP, and Semantic Web technologies.
111 ### Technologies
112 - **SAAS KnowledgeBooks**: Semantic NLP Portal for in-house projects and product sales.
113 - **Semantic Web Development**: Ontology design and application development using RDF data stores.
114
115 ### Research Areas
116 - Natural Language Processing (NLP) with deep learning
117 - Fusion of symbolic AI systems with deep learning models
118 - Linked data, semantic web, and ontologies
Tool for Web Search and Fetching Web Pages
This code provides a set of functions for web searching and HTML content processing, with the main functions being uri_to_markdown, search_web, brave_search_summaries, and brave_search_text. The uri_to_markdown function fetches content from a given URI and converts HTML to markdown-style text, handling various edge cases and cleaning up the text by removing multiple blank lines and spaces while converting HTML entities. The search_web function is a placeholder that’s meant to be implemented with a preferred search API, while brave_search_summaries implements actual web searching using the Brave Search API, requiring an API key from the environment variables and returning structured results including titles, URLs, and descriptions. The brave_search_text function builds upon brave_search_summaries by fetching search results and then using uri_to_markdown to convert the content of each result URL to text, followed by summarizing the content using a separate summarize_text function. The code also includes utility functions like replace_html_tags_with_text which uses BeautifulSoup to strip HTML tags and return plain text, and includes proper error handling, logging, and type hints throughout. The module is designed to be integrated with Ollama and exports uri_to_markdown and search_web as its primary interfaces.
1 """
2 Provides functions for web searching and HTML to Markdown conversion
3 and for returning the contents of a URI as plain text (with minimal markdown)
4 """
5
6 from typing import Dict, Any
7 import requests
8 from bs4 import BeautifulSoup
9 import re
10 from urllib.parse import urlparse
11 import html
12 from ollama import chat
13 import json
14 from tool_summarize_text import summarize_text
15
16 import requests
17 import os
18 import logging
19 from pprint import pprint
20 from bs4 import BeautifulSoup
21
22 logging.basicConfig(level=logging.INFO)
23
24 api_key = os.environ.get("BRAVE_SEARCH_API_KEY")
25 if not api_key:
26 raise ValueError(
27 "API key not found. Set 'BRAVE_SEARCH_API_KEY' environment variable."
28 )
29
30
31 def replace_html_tags_with_text(html_string):
32 soup = BeautifulSoup(html_string, "html.parser")
33 return soup.get_text()
34
35
36 def uri_to_markdown(a_uri: str) -> str:
37 """
38 Fetches content from a URI and converts HTML to markdown-style text
39
40 Args:
41 a_uri (str): URI to fetch and convert
42
43 Returns:
44 web page text converted to markdown-style content
45 """
46 try:
47 # Validate URI
48 parsed = urlparse(a_uri)
49 if not all([parsed.scheme, parsed.netloc]):
50 return f"Invalid URI: {a_uri}"
51
52 # Fetch content
53 headers = {
54 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
55 }
56 response = requests.get(a_uri, headers=headers, timeout=10)
57 response.raise_for_status()
58
59 # Parse HTML
60 soup = BeautifulSoup(response.text, "html.parser")
61
62 # Get title
63 title = soup.title.string if soup.title else ""
64
65 # Get text and clean up
66 text = soup.get_text()
67
68 # Clean up the text
69 text = re.sub(r"\n\s*\n", "\n\n", text) # Remove multiple blank lines
70 text = re.sub(r" +", " ", text) # Remove multiple spaces
71 text = html.unescape(text) # Convert HTML entities
72 text = text.strip()
73
74 return f"Contents of URI {a_uri} is:\n# {title}\n\n{text}\n"
75
76 except requests.RequestException as e:
77 return f"Network error: {str(e)}"
78
79 except Exception as e:
80 return f"Error processing URI: {str(e)}"
81
82
83 def search_web(query: str, max_results: int = 5) -> Dict[str, Any]:
84 """
85 Performs a web search and returns results
86 Note: This is a placeholder. Implement with your preferred search API.
87
88 Args:
89 query (str): Search query
90 max_results (int): Maximum number of results to return
91
92 Returns:
93 Dict[str, Any]: Dictionary containing:
94 - 'results': List of search results
95 - 'count': Number of results found
96 - 'error': Error message if any, None otherwise
97 """
98
99 # Placeholder for search implementation
100 return {
101 "results": [],
102 "count": 0,
103 "error": "Web search not implemented. Please implement with your preferred search API.",
104 }
105
106
107 def brave_search_summaries(
108 query,
109 num_results=3,
110 url="https://api.search.brave.com/res/v1/web/search",
111 api_key=api_key,
112 ):
113 headers = {"X-Subscription-Token": api_key, "Content-Type": "application/json"}
114 params = {"q": query, "count": num_results}
115
116 response = requests.get(url, headers=headers, params=params)
117 ret = []
118
119 if response.status_code == 200:
120 search_results = response.json()
121 ret = [
122 {
123 "title": result.get("title"),
124 "url": result.get("url"),
125 "description": replace_html_tags_with_text(result.get("description")),
126 }
127 for result in search_results.get("web", {}).get("results", [])
128 ]
129 logging.info("Successfully retrieved results.")
130 else:
131 try:
132 error_info = response.json()
133 logging.error(f"Error {response.status_code}: {error_info.get('message')}")
134 except json.JSONDecodeError:
135 logging.error(f"Error {response.status_code}: {response.text}")
136
137 return ret
138
139 def brave_search_text(query, num_results=3):
140 summaries = brave_search_summaries(query, num_results)
141 ret = ""
142 for s in summaries:
143 url = s["url"]
144 text = uri_to_markdown(url)
145 summary = summarize_text(
146 f"Given the query:\n\n{query}\n\nthen, summarize text removing all material that is not relevant to the query and then be very concise for a very short summary:\n\n{text}\n"
147 )
148 ret += summary
149 print("\n\n-----------------------------------")
150 return ret
151
152 # Function metadata for Ollama integration
153 uri_to_markdown.metadata = {
154 "name": "uri_to_markdown",
155 "description": "Converts web page content to markdown-style text",
156 "parameters": {"a_uri": "URI of the web page to convert"},
157 }
158
159 search_web.metadata = {
160 "name": "search_web",
161 "description": "Performs a web search and returns results",
162 "parameters": {
163 "query": "Search query",
164 "max_results": "Maximum number of results to return",
165 },
166 }
167
168 # Export the functions
169 __all__ = ["uri_to_markdown", "search_web"]
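As a quick usage sketch (assuming this listing is saved as tool_web_search.py, which is how the file appears in the directory listings later in this book, that the BRAVE_SEARCH_API_KEY environment variable is set, and that Ollama is running locally so that summarize_text works), the two main entry points can be exercised like this:

from tool_web_search import uri_to_markdown, brave_search_text

# Convert a single web page to markdown-style text
page_text = uri_to_markdown("https://knowledgebooks.com")
print(page_text[:500])

# Search the web, fetch each result page, and summarize it relative to the query
print(brave_search_text("Ollama local LLM tool calling", num_results=2))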
Tools Wrap Up
We have looked at the implementations and example uses of several tools. In the next chapter we continue our study of tool use with the application of judging the accuracy of output generated by LLMs: basically, LLMs judging the accuracy of other LLMs to reduce hallucinations, inaccurate output, etc.
Automatic Evaluation of LLM Results: More Tool Examples
As Large Language Models (LLMs) become increasingly integrated into production systems and workflows, the ability to systematically evaluate their performance becomes crucial. While qualitative assessment of LLM outputs remains important, organizations need robust, quantitative methods to measure and compare model performance across different prompts, use cases, and deployment scenarios. This has led to the development of specialized tools and frameworks designed specifically for LLM evaluation.
The evaluation of LLM outputs presents unique challenges that set it apart from traditional natural language processing metrics. Unlike straightforward classification or translation tasks, LLM responses often require assessment across multiple dimensions, including factual accuracy, relevance, coherence, creativity, and adherence to specified formats or constraints. Furthermore, the stochastic nature of LLM outputs means that the same prompt can generate different responses across multiple runs, necessitating evaluation methods that can account for this variability.
Modern LLM evaluation tools address these challenges through a combination of automated metrics, human-in-the-loop validation, and specialized frameworks for prompt testing and response analysis. These tools can help developers and researchers understand how well their prompts perform, identify potential failure modes, and optimize prompt engineering strategies. By providing quantitative insights into LLM performance, these evaluation tools enable more informed decisions about model selection, prompt design, and system architecture in LLM-powered applications.
In this chapter we take a simple approach:
- Capture the chat history, including the generated output, for an interaction with an LLM.
- Generate a prompt containing the chat history, model output, and a request to a different LLM to evaluate the output generated by the first LLM. We request that the final output of the second LLM ends with a single-character judgement ('Y' for good or 'N' for bad) of the accuracy of the first LLM’s output.
We look at several examples in this chapter of approaches you might want to experiment with.
Tool For Judging LLM Results
Here we implement our simple approach of using a second LLM to evaluate the output that a first LLM generated in response to user input.
The following listing shows the tool tool_judge_results.py:
1 """
2 Judge results from LLM generation from prompts
3 """
4
5 from typing import Optional, Dict, Any
6 from pathlib import Path
7 import json
8 import re
9 from pprint import pprint
10
11 import ollama
12
13 client = ollama.Client()
14
15 def judge_results(original_prompt: str, llm_gen_results: str) -> Dict[str, str]:
16 """
17 Takes an original prompt to a LLM and the output results
18
19 Args:
20 original_prompt (str): original prompt to a LLM
21 llm_gen_results (str): output from the LLM that this function judges for accuracy
22
23 Returns:
24 result: str: string that is one character with one of these values:
25 - 'N': Bad result
26 - 'Y': A Good result
27 """
28 try:
29 messages = [
30 {"role": "system", "content": "Always judge this output for correctness."},
31 {"role": "user", "content": f"Evaluate this output:\n\n{llm_gen_results}\n\nfor this prompt:\n\n{original_prompt}\n\nDouble check your work and explain your thinking in a few sentences. End your output with a Y or N answer"},
32 ]
33
34 response = client.chat(
35 model="qwen2.5-coder:14b", # "llama3.2:latest",
36 messages=messages,
37 )
38
39 r = response.message.content.strip()
40 print(f"\n\noriginal COT response:\n\n{r}\n\n")
41
42 # look at the end of the response for the Y or N judgement
43 s = r.lower()
44 # remove all non-alphabetic characters:
45 s = re.sub(r'[^a-zA-Z]', '', s).strip()
46
47 return {'judgement': s[-1].upper(), 'reasoning': r}
48
49 except Exception as e:
50 print(f"\n\n***** {e=}\n\n")
51 return {'judgement': 'E', 'reasoning': str(e)} # on any error, assign 'E' result
This Python code defines a function judge_results that takes an original prompt sent to a Large Language Model (LLM) and the generated response from the LLM, then attempts to judge the accuracy of the response.
Here’s a breakdown of the code:
The main function judge_results takes two parameters:
- original_prompt: The initial prompt sent to an LLM
- llm_gen_results: The output from the LLM that needs evaluation
The function judge_results returns a dictionary with two keys:
- judgement: Single character (‘Y’ for a good result, ‘N’ for a bad result, ‘E’ for an error)
- reasoning: Detailed explanation of the judgment
The evaluation process is:
- Creates a conversation with two messages: a system message that sets the context for evaluation, and a user message that combines the original prompt and the generated results to be evaluated
- Uses the Qwen 2.5 Coder (14B parameter) model through Ollama
- Expects a Y/N response at the end of the evaluation
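The driver script example_judge.py is not listed here, but based on the sample output below it calls judge_results with a simple age-difference prompt. A minimal sketch of such a driver (the exact prompts are illustrative):

from tool_judge_results import judge_results

prompt = ("Sally is 55, John is 18, and Mary is 31. What are pairwise "
          "combinations of the absolute value of age differences?")

# A correct answer should be judged 'Y', an incorrect one 'N'
good_output = "Sally and John: 37. Sally and Mary: 24. John and Mary: 13."
bad_output = "Sally and John: 31. Sally and Mary: 24. John and Mary: 10."

print(judge_results(prompt, good_output))
print(judge_results(prompt, bad_output))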
Sample output
1 $ cd OllamaEx
2 $ python example_judge.py
3
4 ==================================================
5 Judge output from a LLM
6 ==================================================
7
8 ==================================================
9 First test: should be Y, or good
10 ==================================================
11
12
13 original COT response:
14
15 The given output correctly calculates the absolute value of age differences for each pair:
16
17 - Sally (55) and John (18): \( |55 - 18| = 37 \)
18 - Sally (55) and Mary (31): \( |55 - 31| = 24 \)
19 - John (18) and Mary (31): \( |31 - 18| = 13 \)
20
21 These calculations are accurate, matching the prompt's requirements. Therefore, the answer is Y.
22
23
24
25 ** JUDGEMENT ***
26
27 judgement={'judgement': 'Y', 'reasoning': "The given output correctly calculates the absolute value of age differences for each pair:\n\n- Sally (55) and John (18): \\( |55 - 18| = 37 \\)\n- Sally (55) and Mary (31): \\( |55 - 31| = 24 \\)\n- John (18) and Mary (31): \\( |31 - 18| = 13 \\)\n\nThese calculations are accurate, matching the prompt's requirements. Therefore, the answer is Y."}
28
29 ==================================================
30 Second test: should be N, or bad
31 ==================================================
32
33
34 original COT response:
35
36 Let's evaluate the given calculations step by step:
37
38 1. Sally (55) - John (18) = 37. The difference is calculated as 55 - 18, which equals 37.
39 2. Sally (55) - Mary (31) = 24. The difference is calculated as 55 - 31, which equals 24.
40 3. John (18) - Mary (31) = -13. However, the absolute value of this difference is |18 - 31| = 13.
41
42 The given output shows:
43 - Sally and John: 55 - 18 = 31. This should be 37.
44 - Sally and Mary: 55 - 31 = 24. This is correct.
45 - John and Mary: 31 - 18 = 10. This should be 13.
46
47 The output contains errors in the first and third calculations. Therefore, the answer is:
48
49 N
50
51 ** JUDGEMENT ***
52
53 judgement={'judgement': 'N', 'reasoning': "et's evaluate the given calculations step by step:\n\n1. Sally (55) - John (18) = 37. The difference is calculated as 55 - 18, which equals 37.\n2. Sally (55) - Mary (31) = 24. The difference is calculated as 55 - 31, which equals 24.\n3. John (18) - Mary (31) = -13. However, the absolute value of this difference is |18 - 31| = 13.\n\nThe given output shows:\n- Sally and John: 55 - 18 = 31. This should be 37.\n- Sally and Mary: 55 - 31 = 24. This is correct.\n- John and Mary: 31 - 18 = 10. This should be 13.\n\nThe output contains errors in the first and third calculations. Therefore, the answer is:\n\nN"}
Evaluating LLM Responses Given a Chat History
Here we try a different approach by asking a second “judge” LLM to evaluate the output of the first LLM based on specific criteria like “Response accuracy”, “Helpfulness”, etc.
The following listing shows the tool utility tool_llm_eval.py:
1 import json
2 from typing import Any, List, Dict, Optional, Iterator
3 import ollama
4 from ollama import GenerateResponse
5
6
7 def clean_json_response(response: str) -> str:
8 """
9 Cleans the response string by removing markdown code blocks and other formatting
10 """
11 # Remove markdown code block indicators
12 response = response.replace("```json", "").replace("```", "")
13 # Strip whitespace
14 response = response.strip()
15 return response
16
17 def evaluate_llm_conversation(
18 chat_history: List[Dict[str, str]],
19 evaluation_criteria: Optional[List[str]] = None,
20 model: str = "llama3.1" # older model that is good at generating JSON
21 ) -> Dict[str, Any]:
22 """
23 Evaluates a chat history using Ollama to run the evaluation model.
24
25 Args:
26 chat_history: List of dictionaries containing the conversation
27 evaluation_criteria: Optional list of specific criteria to evaluate
28 model: Ollama model to use for evaluation
29
30 Returns:
31 Dictionary containing evaluation results
32 """
33 if evaluation_criteria is None:
34 evaluation_criteria = [
35 "Response accuracy",
36 "Coherence and clarity",
37 "Helpfulness",
38 "Task completion",
39 "Natural conversation flow"
40 ]
41
42 # Format chat history for evaluation
43 formatted_chat = "\n".join([
44 f"{'User' if msg['role'] == 'user' else 'Assistant'}: {msg['content']}"
45 for msg in chat_history
46 ])
47
48 # Create evaluation prompt
49 evaluation_prompt = f"""
50 Please evaluate the following conversation between a user and an AI assistant.
51 Focus on these criteria: {', '.join(evaluation_criteria)}
52
53 Conversation:
54 {formatted_chat}
55
56 Provide a structured evaluation with:
57 1. Scores (1-10) for each criterion
58 2. Brief explanation for each score
59 3. Overall assessment
60 4. Suggestions for improvement
61
62 Format your response as JSON.
63 """
64
65 try:
66 # Get evaluation from Ollama
67 response: GenerateResponse | Iterator[GenerateResponse] = ollama.generate(
68 model=model,
69 prompt=evaluation_prompt,
70 system="You are an expert AI evaluator. Provide detailed, objective assessments in JSON format."
71 )
72
73 response_clean: str = clean_json_response(response['response'])
74
75 # Parse the response to extract JSON
76 try:
77 evaluation_result = json.loads(response_clean)
78 except json.JSONDecodeError:
79 # Fallback if response isn't proper JSON
80 evaluation_result = {
81 "error": "Could not parse evaluation as JSON",
82 "raw_response": response_clean
83 }
84
85 return evaluation_result
86
87 except Exception as e:
88 return {
89 "error": f"Evaluation failed: {str(e)}",
90 "status": "failed"
91 }
92
93 # Example usage
94 if __name__ == "__main__":
95 # Sample chat history
96 sample_chat = [
97 {"role": "user", "content": "What's the capital of France?"},
98 {"role": "assistant", "content": "The capital of France is Paris."},
99 {"role": "user", "content": "Tell me more about it."},
100 {"role": "assistant", "content": "Paris is the largest city in France and serves as the country's political, economic, and cultural center. It's known for landmarks like the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral."}
101 ]
102
103 # Run evaluation
104 result = evaluate_llm_conversation(sample_chat)
105 print(json.dumps(result, indent=2))
We will use these five evaluation criteria:
- Response accuracy
- Coherence and clarity
- Helpfulness
- Task completion
- Natural conversation flow
The main function evaluate_llm_conversation uses these steps:
- Receives chat history and optional parameters
- Formats the conversation into a readable string
- Creates a detailed evaluation prompt
- Sends prompt to Ollama for evaluation
- Cleans and parses the response
- Returns structured evaluation results
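Since evaluation_criteria and model are optional keyword arguments, the evaluation can be narrowed to specific criteria or pointed at a different local model. A short sketch (the conversation and criteria below are only illustrative):

from tool_llm_eval import evaluate_llm_conversation

chat = [
    {"role": "user", "content": "What is the boiling point of water at sea level?"},
    {"role": "assistant", "content": "Water boils at 100 degrees Celsius (212 degrees Fahrenheit) at sea level."},
]

result = evaluate_llm_conversation(
    chat,
    evaluation_criteria=["Response accuracy", "Helpfulness"],
    model="llama3.1",
)
print(result)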
Sample Output
1 $ cd OllamaEx
2 $ python tool_llm_eval.py
3 {
4 "evaluation": {
5 "responseAccuracy": {
6 "score": 9,
7 "explanation": "The assistant correctly answered the user's question about the capital of France, and provided accurate information when the user asked for more details."
8 },
9 "coherenceAndClarity": {
10 "score": 8,
11 "explanation": "The assistant's responses were clear and easy to understand. However, there was a slight shift in tone from a simple answer to a more formal description."
12 },
13 "helpfulness": {
14 "score": 9,
15 "explanation": "The assistant provided relevant information that helped the user gain a better understanding of Paris. The response was thorough and answered the user's follow-up question."
16 },
17 "taskCompletion": {
18 "score": 10,
19 "explanation": "The assistant completed both tasks: providing the capital of France and elaborating on it with additional context."
20 },
21 "naturalConversationFlow": {
22 "score": 7,
23 "explanation": "While the responses were clear, they felt a bit abrupt. The assistant could have maintained a more conversational tone or encouraged further discussion."
24 }
25 },
26 "overallAssessment": {
27 "score": 8.5,
28 "explanation": "The assistant demonstrated strong technical knowledge and was able to provide accurate information on demand. However, there were some minor lapses in natural conversation flow and coherence."
29 },
30 "suggestionsForImprovement": [
31 {
32 "improvementArea": "NaturalConversationFlow",
33 "description": "Consider using more conversational language or prompts to engage users further."
34 },
35 {
36 "improvementArea": "CoherenceAndClarity",
37 "description": "Use transitional phrases and maintain a consistent tone throughout the conversation."
38 }
39 ]
40 }
A Tool for Detecting Hallucinations
Here we use a text template file templates/anti_hallucination.txt to define the prompt template for checking a user input, a context, and the resulting output from another LLM (most of the file is not shown for brevity):
1 You are a fair judge and an expert at identifying false hallucinations and you are tasked with evaluating the accuracy of an AI-generated answer to a given context. Analyze the provided INPUT, CONTEXT, and OUTPUT to determine if the OUTPUT contains any hallucinations or false information.
2
3 Guidelines:
4 1. The OUTPUT must not contradict any information given in the CONTEXT.
5 2. The OUTPUT must not introduce new information beyond what's provided in the CONTEXT.
6 3. The OUTPUT should not contradict well-established facts or general knowledge.
7 4. Check that the OUTPUT doesn't oversimplify or generalize information in a way that changes its meaning or accuracy.
8
9 Analyze the text thoroughly and assign a hallucination score between 0 and 1, where:
10 - 0.0: The OUTPUT is unfaithful or incorrect with respect to the CONTEXT and the user's INPUT
11 - 1.0: The OUTPUT is entirely accurate and faithful to the CONTEXT and the user's INPUT
12
13 INPUT:
14 {input}
15
16 CONTEXT:
17 {context}
18
19 OUTPUT:
20 {output}
21
22 Provide your judgement in JSON format:
23 {{
24 "score": <your score between 0.0 and 1.0>,
25 "reason": [
26 <list your reasoning as Python strings>
27 ]
28 }}
Here is the tool tool_anti_hallucination.py that uses this template:
1 """
2 Provides functions detecting hallucinations by other LLMs
3 """
4
5 from typing import Optional, Dict, Any
6 from pathlib import Path
7 from pprint import pprint
8 import json
9 from ollama import ChatResponse
10 from ollama import chat
11
12 def read_anti_hallucination_template() -> str:
13 """
14 Reads the anti-hallucination template file and returns the content
15 """
16 template_path = Path(__file__).parent / "templates" / "anti_hallucination.txt"
17 with template_path.open("r", encoding="utf-8") as f:
18 content = f.read()
19 return content
20
21 TEMPLATE = read_anti_hallucination_template()
22
23 def detect_hallucination(user_input: str, context: str, output: str) -> Dict[str, Any]:
24 """
25 Given user input, context, and LLM output, detect hallucination
26
27 Args:
28 user_input (str): User's input text prompt
29 context (str): Context text for LLM
30 output (str): LLM's output text that is to be evaluated for hallucination
31
32 Returns: JSON data:
33 {
34 "score": <your score between 0.0 and 1.0>,
35 "reason": [
36 <list your reasoning as bullet points>
37 ]
38 }
39 """
40 prompt = TEMPLATE.format(input=user_input, context=context, output=output)
41 response: ChatResponse = chat(
42 model="llama3.2:latest",
43 messages=[
44 {"role": "system", "content": prompt},
45 {"role": "user", "content": output},
46 ],
47 )
48 try:
49 return json.loads(response.message.content)
50 except json.JSONDecodeError:
51 print(f"Error decoding JSON: {response.message.content}")
52 return {"score": 0.0, "reason": ["Error decoding JSON"]}
53
54
55 # Export the functions
56 __all__ = ["detect_hallucination"]
57
58 ## Test only code:
59
60 def main():
61 def separator(title: str):
62 """Prints a section separator"""
63 print(f"\n{'=' * 50}")
64 print(f" {title}")
65 print('=' * 50)
66
67 # Test hallucination detection
68 separator("Detect hallucination from a LLM")
69
70 test_prompt = "Sally is 55, John is 18, and Mary is 31. What are pairwise combinations of the absolute value of age differences?"
71 test_context = "Double check all math results."
72 test_output = "Sally and John: 55 - 18 = 31. Sally and Mary: 55 - 31 = 24. John and Mary: 31 - 18 = 10."
73 judgement = detect_hallucination(test_prompt, test_context, test_output)
74 print(f"\n** JUDGEMENT ***\n")
75 pprint(judgement)
76
77 if __name__ == "__main__":
78 try:
79 main()
80 except Exception as e:
81 print(f"An error occurred: {str(e)}")
This code implements a hallucination detection system for Large Language Models (LLMs) using the Ollama framework. The core functionality revolves around the detect_hallucination function, which takes three parameters: user input, context, and LLM output, and evaluates whether the output contains hallucinated content by utilizing another LLM (llama3.2) as a judge. The system reads a template from a file to structure the evaluation prompt.
The implementation includes type hints and error handling, particularly for JSON parsing of the response. The output is structured as a JSON object containing a hallucination score (between 0.0 and 1.0) and a list of reasoning points. The code also includes a test harness that demonstrates the system’s usage with a mathematical example, checking for accuracy in age difference calculations. The modular design allows for easy integration into larger systems through the explicit export of the detect_hallucination function.
The output looks something like this:
1 python /Users/markw/GITHUB/OllamaExamples/tool_anti_hallucination.py
2
3 ==================================================
4 Detect hallucination from a LLM
5 ==================================================
6
7 ** JUDGEMENT ***
8
9 {'reason': ['The OUTPUT claims that the absolute value of age differences are '
10 '31, 24, and 10 for Sally and John, Sally and Mary, and John and '
11 'Mary respectively. However, this contradicts the CONTEXT, as the '
12 'CONTEXT asks to double-check math results.',
13 'The OUTPUT does not introduce new information, but it provides '
14 'incorrect calculations: Sally and John: 55 - 18 = 37, Sally and '
15 'Mary: 55 - 31 = 24, John and Mary: 31 - 18 = 13. Therefore, the '
16 'actual output should be recalculated to ensure accuracy.',
17 'The OUTPUT oversimplifies the age differences by not considering '
18 "the order of subtraction (i.e., John's age subtracted from "
19 "Sally's or Mary's). However, this is already identified as a "
20 'contradiction in point 1.'],
21 'score': 0.0}
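The test above deliberately feeds the judge incorrect arithmetic. For contrast, here is a sketch of calling detect_hallucination with a corrected output, which should produce a score near 1.0 (actual scores will vary between runs):

from tool_anti_hallucination import detect_hallucination

judgement = detect_hallucination(
    user_input="Sally is 55, John is 18, and Mary is 31. What are pairwise "
               "combinations of the absolute value of age differences?",
    context="Double check all math results.",
    output="Sally and John: 37. Sally and Mary: 24. John and Mary: 13.",
)
print(judgement)  # expect something like {'score': 1.0, 'reason': [...]}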
Wrap Up
Here we looked at several examples for using one LLM to rate the accuracy, usefulness, etc. of another LLM given an input prompt. There are two topics in this book that I spend most of my personal LLM research time on: automatic evaluation of LLM results, and tool using agents (the subject of the next chapter).
Building Agents with Ollama and the Hugging Face Smolagents Library
We have seen a few useful examples of tool use (function calling) and now we will build on tool use to build both single agents and multi-agent systems. There are commercial and open source resources to build agents, CrewAI and LangGraph being popular choices. We will follow a different learning path here, preferring to use the smolagents library. Please bookmark https://github.com/huggingface/smolagents for reference while working through this chapter.
Each example program and utility for this chapter uses the prefix smolagents_ in the Python file name.
Note: We are using the 2 GB model Llama3.2:latest here. Different models support tools and agents differently.
Choosing Specific LLMs for Writing Agents
As agents perform tasks like interpreting user input, carrying out Chain of Thought (CoT) reasoning, observing the output from tool calls, and following plan steps one by one, LLM errors, hallucinations, and inconsistencies accumulate. When using Ollama we prefer using the most powerful models that we can run on our hardware.
Here we use Llama3.2:latest, which is recognized for its function calling capabilities, facilitating seamless integration with various tools.
As you work through the examples here using different local models running on Ollama, you might encounter compounding error problems. When I am experimenting with ideas for implementing agents, I sometimes keep two versions of my code: one for a local model and one using either of the commercial models GPT-4o or Claude Sonnet 3.5. Comparing the same agent setup using different models might provide some insight into whether runtime agent problems stem from your code or from the model you are using.
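One way to avoid maintaining two parallel versions of the code is to choose the model at runtime, for example from an environment variable. The sketch below uses the LiteLLMModel class that appears in the smolagents examples later in this chapter; the AGENT_MODEL_ID variable name is just an illustration:

import os
from smolagents import LiteLLMModel

# Default to a local Ollama model; set AGENT_MODEL_ID="gpt-4o" (or another
# commercial model id) to compare the same agent code against a hosted model.
model_id = os.environ.get("AGENT_MODEL_ID", "ollama_chat/llama3.2:latest")

if model_id.startswith("ollama_chat/"):
    model = LiteLLMModel(model_id=model_id,
                         api_base="http://localhost:11434",
                         api_key="not-used")
else:
    model = LiteLLMModel(model_id=model_id)  # API key is read from the environment by LiteLLM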
Installation notes
As I write this chapter on January 2, 2025, smolagents needs to be run with an older version of Python:
1 python3.11 -m venv venv
2 source venv/bin/activate
3 python3.11 -m pip install -r requirements.txt
4 python3.11 smolagents_test.py
The first two lines of the requirements.txt file specify the smolagents specific requirements:
1 smolagents
2 litellm[proxy]
3 requests
4 beautifulsoup4
5 ollama
6 langchain
7 langchain-community
8 langchain-ollama
9 langgraph
10 rdflib
11 kuzu
12 langchain_openai
13 tabulate
Overview of the Hugging Face Smolagents Library
The smolagents library https://github.com/huggingface/smolagents is built around a minimalist and modular architecture that emphasizes simplicity and composability. The core components are cleanly separated into the file agents.py for agent definitions, tools.py for tool implementations, and related support files. This design philosophy allows developers to easily understand, extend, and customize the components while maintaining a small codebase footprint - true to the “smol” name.
This library implements a tools-first approach where capabilities are encapsulated as discrete tools that agents can use. The tools.py file in the smolagents implementation defines a clean interface for tools with input/output specifications, making it straightforward to add new tools. This tools-based architecture enables agents to have clear, well-defined capabilities while maintaining separation of concerns between the agent logic and the actual implementation of capabilities.
Agents are designed to be lightweight and focused on specific tasks rather than trying to be general-purpose. The BaseAgent class provides core functionality while specific agents like WebAgent extend it for particular use cases. This specialization allows the agents to be more efficient and reliable at their designated tasks rather than attempting to be jack-of-all-trades.
Overview for LLM Agents (optional section)
You might want to skip this section if you want to quickly work through the examples in this chapter and review this material later.
In general, we use the following steps to build agent based systems:
- Define agents (e.g., Researcher, Writer, Editor, Judge outputs of other models and agents).
- Assign tasks (e.g., research, summarize, write, double check the work of other agents).
- Use an orchestration framework to manage task sequencing and collaboration.
Features of Agents:
- Retrieval-Augmented Generation (RAG): Enhance agents’ knowledge by integrating external documents or databases. Example: An agent that retrieves and summarizes medical research papers.
- Memory Management: Enable agents to retain context across interactions. Example: A chatbot that remembers user preferences over time.
- Tool Integration: Equip agents with tools like web search, data scraping, or API calls. Example: An agent that fetches real-time weather data and provides recommendations. We will use tools previously developed in this book.
Examples of Real-World Applications
- Healthcare: Agents that analyze medical records and provide diagnostic suggestions.
- Education: Virtual tutors that explain complex topics using Ollama’s local models.
- Customer Support: Chatbots that handle inquiries without relying on cloud services.
- Content Creation: Agents that generate articles, summaries, or marketing content.
Let’s Write Some Code
I am still experimenting with LLM-based agents. Please accept the following examples as my personal works in progress.
“Hello World” Smolagents Example
Here we look at a simple example taken from the smolagents documentation and converted to run using local models with Ollama. Here is a listing of file smolagents_test.py:
1 """
2 smolagents example program (slightly modified)
3 """
4
5 from smolagents.agents import ToolCallingAgent
6 from smolagents import tool, LiteLLMModel
7 from typing import Optional
8
9 model = LiteLLMModel(
10 model_id="ollama_chat/llama3.2:latest",
11 api_base="http://localhost:11434",
12 api_key="your-api-key" # not used
13 )
14
15 @tool
16 def get_weather(location: str, celsius: Optional[bool] = False) -> str:
17 """
18 Get weather in the next days at given location.
19 Secretly this tool does not care about the location, it hates the weather everywhere.
20
21 Args:
22 location: the location
23 celsius: the temperature
24 """
25 return "The weather is UNGODLY with torrential rains and temperatures below -10°C"
26
27 agent = ToolCallingAgent(tools=[get_weather], model=model)
28
29 print(agent.run("What's the weather like in Paris?"))
Understanding the smolagents and Ollama Example
This code demonstrates a simple integration between smolagents (a tool-calling framework) and Ollama (a local LLM server). Here’s what the code accomplishes:
Core Components
- Utilizes smolagents for creating AI agents with tool capabilities
- Integrates with a local Ollama server running llama3.2
- Implements a basic weather checking tool (though humorously hardcoded)
Model Configuration
The code sets up a LiteLLM model instance that connects to a local Ollama server on port 11434. It’s configured to use the llama3.2 model and supports optional API key authentication.
Weather Tool Implementation
The code defines a weather-checking tool using the @tool decorator. While it accepts a location parameter and an optional celsius flag, this example version playfully returns the same dramatic weather report regardless of the input location.
Agent Setup and Execution
The implementation creates a ToolCallingAgent with the weather tool and the configured model. Users can query the agent about weather conditions in any location, though in this example it always returns the same humorous response about terrible weather conditions.
Key Features
- Demonstrates tool-calling capabilities through smolagents
- Shows local LLM integration using Ollama
- Includes proper type hinting for better code clarity
- Provides an extensible structure for adding more tools
Python Tools Compatible with Smolagents
The tools I developed in previous chapters are not quite compatible with the smolagents library so I wrap a few of the tools I previously wrote in the utility smolagents_tools.py:
1 """
2 Wrapper for book example tools for smolagents compatibility
3 """
4 from pathlib import Path
5
6 from smolagents import tool, LiteLLMModel
7 from typing import Optional
8 from pprint import pprint
9
10 from tool_file_dir import list_directory
11
12 @tool
13 def sa_list_directory(list_dots: Optional[bool]=None) -> str:
14 """
15 Lists files and directories in the current working directory
16
17 Args:
18 list_dots: optional boolean (if true, include dot files)
19
20 Returns:
21 string with directory name, followed by list of files in the directory
22 """
23 lst = list_directory()
24 pprint(lst)
25 return lst
26
27 @tool
28 def read_file_contents(file_path: str) -> str:
29 """
30 Reads contents from a file and returns the text
31
32 Args:
33 file_path: Path to the file to read
34
35 Returns:
36 Contents of the file as a string
37 """
38 try:
39 path = Path(file_path)
40 if not path.exists():
41 return f"File not found: {file_path}"
42
43 with path.open("r", encoding="utf-8") as f:
44 content = f.read()
45 return f"Contents of file '{file_path}' is:\n{content}\n"
46
47 except Exception as e:
48 return f"Error reading file '{file_path}' is: {str(e)}"
49
50 @tool
51 def summarize_directory() -> str:
52 """
53 Summarizes the files and directories in the current working directory
54
55 Returns:
56 string with directory name, followed by summary of files in the directory
57 """
58 lst = list_directory()
59 num_files = len(lst)
60 num_dirs = len([x for x in lst if x[1] == 'directory'])
61 num_files = num_files - num_dirs
62 return f"Current directory contains {num_files} files and {num_dirs} directories."
This code defines a wrapper module containing three tool functions designed for compatibility with the smolagents framework. The module includes sa_list_directory(), which lists files and directories in the current working directory with an optional parameter to include dot files; read_file_contents(), which takes a file path as input and returns the contents of that file as a string while handling potential errors and file encoding; and summarize_directory(), which provides a concise summary of the current directory by counting the total number of files and directories. All functions are decorated with @tool for integration with smolagents, and the code imports necessary modules including pathlib for file operations, typing for type hints, and pprint for formatted output. The functions rely on an external list_directory() function imported from tool_file_dir.py, and they provide clear documentation through docstrings explaining their parameters, functionality, and return values. Error handling is implemented particularly in the file reading function to gracefully handle cases where files don’t exist or cannot be read properly.
A Complete Smolagents Example using Three Tools
This listing shows the script smolagents_agent_test.py:
1 from smolagents.agents import ToolCallingAgent
2 from smolagents import tool, LiteLLMModel
3 from typing import Optional
4
5 from smolagents_tools import sa_list_directory
6 from smolagents_tools import summarize_directory
7 from smolagents_tools import read_file_contents
8
9 model = LiteLLMModel(
10 model_id="ollama_chat/llama3.2:latest",
11 api_base="http://localhost:11434",
12 api_key="your-api-key" # not used
13 )
14
15 agent = ToolCallingAgent(tools=[sa_list_directory,
16 summarize_directory,
17 read_file_contents],
18 model=model)
19
20 print(agent.run("What are the files in the current directory? Describe the current directory"))
21
22 print(agent.run("Which Python scripts evaluate the performance of LLMs?"))
This code demonstrates the creation of an AI agent using the smolagents library, specifically configured to work with file system operations. It imports three specialized tools from smolagents_tools: sa_list_directory for listing directory contents, summarize_directory for providing directory summaries, and read_file_contents for accessing file contents. The code sets up a LiteLLMModel instance that connects to a local Ollama server running the llama3.2 model on port 11434, with provisions for API key authentication if needed. A ToolCallingAgent is then created with these three file-system-related tools, enabling it to interact with and analyze the local file system. The agent is instructed to examine the current directory through a natural language query, asking for both a listing and description of the files present. A second query asks the agent to specifically analyze Python programs in the directory and identify those related to LLM performance evaluation, showing the agent’s potential for more complex file analysis tasks. This setup effectively creates an AI-powered file system navigator that can understand and respond to natural language queries about directory contents and file analysis.
Output from the First Example: “List the Python programs in the current directory, and then tell me which Python programs in the current directory evaluate the performance of LLMs?”
In the following output, please notice that sometimes tool use fails and occasionally wrong assumptions are made, but after a long chain of thought (CoT) process the final result is good.
The output for the query “Which python scripts evaluate the performance of LLMs?” is:
1 python smolagents_agent_test1.py
2 ╭────────────────────────────────── New run ───────────────────────────────────╮
3 │ │
4 │ List the Python programs in the current directory, and then tell me which │
5 │ Python programs in the current directory evaluate the performance of LLMs? │
6 │ │
7 ╰─ LiteLLMModel - ollama_chat/llama3.2:latest ─────────────────────────────────╯
8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
9 ╭──────────────────────────────────────────────────────────────────────────────╮
10 │ Calling tool: 'sa_list_directory' with arguments: {'list_dots': True} │
11 ╰──────────────────────────────────────────────────────────────────────────────╯
12 ('Contents of current directory: [Makefile, README.md, __pycache__, data, '
13 'example_chain_read_summary.py, example_chain_web_summary.py, '
14 'example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py, '
15 'langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt, '
16 'short_programs, smolagents_agent_test1.py, smolagents_test.py, '
17 'smolagents_tools.py, templates, tool_anti_hallucination.py, '
18 'tool_file_contents.py, tool_file_dir.py, tool_judge_results.py, '
19 'tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, '
20 'tool_web_search.py, venv]')
21 Observations: Contents of current directory: [Makefile, README.md, __pycache__,
22 data, example_chain_read_summary.py, example_chain_web_summary.py,
23 example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py,
24 langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt,
25 short_programs, smolagents_agent_test1.py, smolagents_test.py,
26 smolagents_tools.py, templates, tool_anti_hallucination.py,
27 tool_file_contents.py, tool_file_dir.py, tool_judge_results.py,
28 tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, tool_web_search.py,
29 venv]
30 [Step 0: Duration 4.49 seconds| Input tokens: 1,347 | Output tokens: 79]
31
32 ...
33
34 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 3 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
35 ╭──────────────────────────────────────────────────────────────────────────────╮
36 │ Calling tool: 'sa_summarize_directory' with arguments: {} │
37 ╰──────────────────────────────────────────────────────────────────────────────╯
38 lst='Contents of current directory: [Makefile, README.md, __pycache__, data, example_chain_read_summary.py, example_chain_web_summary.py, example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py, langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt, short_programs, smolagents_agent_test1.py, smolagents_test.py, smolagents_tools.py, templates, tool_anti_hallucination.py, tool_file_contents.py, tool_file_dir.py, tool_judge_results.py, tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, tool_web_search.py, venv]'
39 response.message.content="Based on the file names provided, here's a summary of the contents and my educated guesses for their purposes:\n\n1. **Makefile**: A build script used to automate compilation, installation, or other tasks.\n2. **README.md**: A markdown file providing an introduction to the project, its purpose, and how to get started with it.\n3. **__pycache__**: This is a hidden directory generated by Python's bytecode compiler. It likely contains compiled versions of Python code in the current directory.\n4. **data**: A directory containing data used for training or testing models, simulations, or other computational tasks.\n5. **example_chain_read_summary.py**: A script that generates summaries from reading chains (e.g., text from a document). Its purpose is likely related to natural language processing (NLP) or text analysis.\n6. **example_chain_web_summary.py**: Similar to the previous one, but this script seems to be focused on web-based applications or online content summarization.\n7. **example_judge.py**: A script that evaluates judges or other aspects of a system. Its purpose might involve testing, validation, or quality control in a specific domain (e.g., AI, programming).\n8. **graph_kuzu_from_text.py**: A script that generates graphs from text data. This could be related to graph analysis, network science, or text mining.\n9. **graph_kuzu_property_example.py**: Another script related to graph generation, possibly used for demonstrating specific properties or characteristics of the generated graphs.\n10. **langgraph_agent_test.py**: A test script for a language graph agent, which likely involves machine learning, natural language processing, or other AI-related tasks.\n11. **ollama_tools_examples.py**: A file containing examples of OLLAMA (Open Language Learning with Multi-Agent) tools and their usage. This seems to be related to AI-powered language understanding and generation.\n12. **requirements.txt**: A text file specifying the dependencies required to run the project, including Python packages and other software.\n13. **short_programs**: A directory containing short programs or scripts that demonstrate specific techniques, concepts, or examples in the field of AI or programming.\n14. **smolagents_agent_test1.py**: Another test script for a small agent (SMOLAgent), which is likely involved in autonomous decision-making or other complex system simulations.\n15. **smolagents_test.py**: A general-purpose test file for SMOLAgents, used to verify their functionality and performance.\n16. **smolagents_tools.py**: A script containing tools or utilities specifically designed for working with SMOLAgents or related projects.\n17. **templates**: A directory containing templates for generating documents, reports, or other content in a specific format (e.g., LaTeX).\n18. **tool_anti_hallucination.py**: A script that addresses issues of hallucination (i.e., when an AI model provides false information) and aims to mitigate these errors.\n19. **tool_file_contents.py**, **tool_file_dir.py**, **tool_judge_results.py**, and other similar tool scripts: These tools seem to be designed for various computational tasks, such as:\n\t* `tool_file_contents.py`: Analyzing the contents of files.\n\t* `tool_file_dir.py`: Examining or manipulating directory structures.\n\t* `tool_judge_results.py`: Evaluating the performance or outcomes of a system or model.\n\n20. 
**tool_llm_eval.py**: A script for evaluating Large Language Models (LLMs) and their capabilities, likely involving text analysis, sentiment detection, or other NLP tasks.\n21. **tool_sqlite.py**: A tool that interacts with SQLite databases, possibly used for data storage, management, or querying.\n22. **tool_summarize_text.py**: A script designed to summarize long pieces of text into shorter versions, possibly using machine learning algorithms.\n23. **tool_web_search.py**: A tool that performs web searches or retrieves information from online sources, which could involve natural language processing (NLP) and web scraping techniques.\n\n24. **venv**: A directory generated by Python's virtual environment module, used to isolate dependencies and manage a specific Python environment for the project.\n\nKeep in mind that this is an educated guess based on common file name conventions and the context provided. The actual purposes of these files might differ depending on the specific project or domain they are related to."
40 Observations: Summary of directory:Based on the file names provided, here's a
41 summary of the contents and my educated guesses for their purposes:
42
43 1. **Makefile**: A build script used to automate compilation, installation, or
44 other tasks.
45 2. **README.md**: A markdown file providing an introduction to the project, its
46 purpose, and how to get started with it.
47 3. **__pycache__**: This is a hidden directory generated by Python's bytecode
48 compiler. It likely contains compiled versions of Python code in the current
49 directory.
50 4. **data**: A directory containing data used for training or testing models,
51 simulations, or other computational tasks.
52 5. **example_chain_read_summary.py**: A script that generates summaries from
53 reading chains (e.g., text from a document). Its purpose is likely related to
54 natural language processing (NLP) or text analysis.
55 6. **example_chain_web_summary.py**: Similar to the previous one, but this
56 script seems to be focused on web-based applications or online content
57 summarization.
58 7. **example_judge.py**: A script that evaluates judges or other aspects of a
59 system. Its purpose might involve testing, validation, or quality control in a
60 specific domain (e.g., AI, programming).
61 8. **graph_kuzu_from_text.py**: A script that generates graphs from text data.
62 This could be related to graph analysis, network science, or text mining.
63 9. **graph_kuzu_property_example.py**: Another script related to graph
64 generation, possibly used for demonstrating specific properties or
65 characteristics of the generated graphs.
66 10. **langgraph_agent_test.py**: A test script for a language graph agent, which
67 likely involves machine learning, natural language processing, or other
68 AI-related tasks.
69 11. **ollama_tools_examples.py**: A file containing examples of OLLAMA (Open
70 Language Learning with Multi-Agent) tools and their usage. This seems to be
71 related to AI-powered language understanding and generation.
72 12. **requirements.txt**: A text file specifying the dependencies required to
73 run the project, including Python packages and other software.
74 13. **short_programs**: A directory containing short programs or scripts that
75 demonstrate specific techniques, concepts, or examples in the field of AI or
76 programming.
77 14. **smolagents_agent_test1.py**: Another test script for a small agent
78 (SMOLAgent), which is likely involved in autonomous decision-making or other
79 complex system simulations.
80 15. **smolagents_test.py**: A general-purpose test file for SMOLAgents, used to
81 verify their functionality and performance.
82 16. **smolagents_tools.py**: A script containing tools or utilities specifically
83 designed for working with SMOLAgents or related projects.
84 17. **templates**: A directory containing templates for generating documents,
85 reports, or other content in a specific format (e.g., LaTeX).
86 18. **tool_anti_hallucination.py**: A script that addresses issues of
87 hallucination (i.e., when an AI model provides false information) and aims to
88 mitigate these errors.
89 19. **tool_file_contents.py**, **tool_file_dir.py**, **tool_judge_results.py**,
90 and other similar tool scripts: These tools seem to be designed for various
91 computational tasks, such as:
92 * `tool_file_contents.py`: Analyzing the contents of files.
93 * `tool_file_dir.py`: Examining or manipulating directory structures.
94 * `tool_judge_results.py`: Evaluating the performance or outcomes of a
95 system or model.
96
97 20. **tool_llm_eval.py**: A script for evaluating Large Language Models (LLMs)
98 and their capabilities, likely involving text analysis, sentiment detection, or
99 other NLP tasks.
100 21. **tool_sqlite.py**: A tool that interacts with SQLite databases, possibly
101 used for data storage, management, or querying.
102 22. **tool_summarize_text.py**: A script designed to summarize long pieces of
103 text into shorter versions, possibly using machine learning algorithms.
104 23. **tool_web_search.py**: A tool that performs web searches or retrieves
105 information from online sources, which could involve natural language processing
106 (NLP) and web scraping techniques.
107
108 24. **venv**: A directory generated by Python's virtual environment module, used
109 to isolate dependencies and manage a specific Python environment for the
110 project.
111
112 Keep in mind that this is an educated guess based on common file name
113 conventions and the context provided. The actual purposes of these files might
114 differ depending on the specific project or domain they are related to.
115 [Step 3: Duration 21.53 seconds| Input tokens: 6,779 | Output tokens: 133]
116
117 ...
118
119 Reached max iterations.
120 Final answer: [{'id': '9630af1b-498f-4604-ab1b-e4139607cb02', 'type':
121 'function', 'function': {'name': 'sa_list_directory', 'arguments': {'list_dots':
122 True}}}, {'id': '2143745d-5ec1-4711-8813-228398bf36f8', 'type': 'function',
123 'function': {'name': 'sa_summarize_directory', 'arguments': {}}}]
124
125 ...
126
127 The Python programs in the current directory are:
128 1. example_chain_read_summary.py
129 2. example_chain_web_summary.py
130 3. example_judge.py
131 4. graph_kuzu_from_text.py
132 5. graph_kuzu_property_example.py
133 6. langgraph_agent_test.py
134 7. ollama_tools_examples.py
135 8. tool_anti_hallucination.py
136 9. tool_file_contents.py
137 10. tool_file_dir.py
138 11. tool_judge_results.py
139 12. tool_llm_eval.py
140 13. tool_summarize_text.py
141 14. smolagents_agent_test1.py
142 15. smolagents_test.py
143
144 These Python programs evaluate the performance of LLMs:
145 1. tool_anti_hallucination.py
146 2. tool_llm_eval.py
147 3. tool_summarize_text.py
This is a lot of debug output to list in a book, but I want you, dear reader, to get a feeling for how the output generated by tools becomes the data an agent observes before determining the next step in its plan.
This output shows the execution of the example smolagents-based agent that analyzes Python files in a directory, looking for files that contain code to evaluate the output of LLMs. The agent follows a systematic approach by first listing all files using the sa_list_directory tool, then using sa_summarize_directory to provide detailed analysis of the contents.
The agent successfully identified all Python programs in the directory and specifically highlighted three files that evaluate LLM performance: tool_anti_hallucination.py (which checks for false information generation), tool_llm_eval.py (for general LLM evaluation), and tool_summarize_text.py (which likely tests LLM summarization capabilities). The execution includes detailed step-by-step logging, showing input/output tokens and duration for each step, demonstrating the agent’s methodical approach to file analysis and classification.
Output from the Second Example: “What are the files in the current directory? Describe the current directory”
In this section we look at another agent processing cycle. Again, pay attention to the output of tools, and whether the agent can observe tool output and make sense of it (often the agent can’t!).
It is fairly normal for tools to fail with errors and it is important that agents can observe a failure and move on to try something else.
1 python smolagents_agent_test1.py
2 ╭────────────────────────────────── New run ───────────────────────────────────╮
3 │ │
4 │ What are the files in the current directory? Describe the current directory │
5 │ │
6 ╰─ LiteLLMModel - ollama_chat/llama3.2:latest ─────────────────────────────────╯
7 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
8 ╭──────────────────────────────────────────────────────────────────────────────╮
9 │ Calling tool: 'sa_list_directory' with arguments: {'list_dots': True} │
10 ╰──────────────────────────────────────────────────────────────────────────────╯
11 ('Contents of current directory: [Makefile, README.md, __pycache__, data, '
12 'example_chain_read_summary.py, example_chain_web_summary.py, '
13 'example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py, '
14 'langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt, '
15 'short_programs, smolagents_agent_test1.py, smolagents_test.py, '
16 'smolagents_tools.py, templates, tool_anti_hallucination.py, '
17 'tool_file_contents.py, tool_file_dir.py, tool_judge_results.py, '
18 'tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, '
19 'tool_web_search.py, venv]')
20 Observations: Contents of current directory: [Makefile, README.md, __pycache__,
21 data, example_chain_read_summary.py, example_chain_web_summary.py,
22 example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py,
23 langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt,
24 short_programs, smolagents_agent_test1.py, smolagents_test.py,
25 smolagents_tools.py, templates, tool_anti_hallucination.py,
26 tool_file_contents.py, tool_file_dir.py, tool_judge_results.py,
27 tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, tool_web_search.py,
28 venv]
29 [Step 0: Duration 3.06 seconds| Input tokens: 1,332 | Output tokens: 19]
30 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
31 ╭──────────────────────────────────────────────────────────────────────────────╮
32 │ Calling tool: 'sa_list_directory' with arguments: {'list_dots': True} │
33 ╰──────────────────────────────────────────────────────────────────────────────╯
34 ('Contents of current directory: [Makefile, README.md, __pycache__, data, '
35 'example_chain_read_summary.py, example_chain_web_summary.py, '
36 'example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py, '
37 'langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt, '
38 'short_programs, smolagents_agent_test1.py, smolagents_test.py, '
39 'smolagents_tools.py, templates, tool_anti_hallucination.py, '
40 'tool_file_contents.py, tool_file_dir.py, tool_judge_results.py, '
41 'tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, '
42 'tool_web_search.py, venv]')
43 Observations: Contents of current directory: [Makefile, README.md, __pycache__,
44 data, example_chain_read_summary.py, example_chain_web_summary.py,
45 example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py,
46 langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt,
47 short_programs, smolagents_agent_test1.py, smolagents_test.py,
48 smolagents_tools.py, templates, tool_anti_hallucination.py,
49 tool_file_contents.py, tool_file_dir.py, tool_judge_results.py,
50 tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, tool_web_search.py,
51 venv]
52 [Step 1: Duration 1.58 seconds| Input tokens: 2,900 | Output tokens: 38]
53 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
54 ╭──────────────────────────────────────────────────────────────────────────────╮
55 │ Calling tool: 'sa_summarize_directory' with arguments: {} │
56 ╰──────────────────────────────────────────────────────────────────────────────╯
57 lst='Contents of current directory: [Makefile, README.md, __pycache__, data, example_chain_read_summary.py, example_chain_web_summary.py, example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py, langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt, short_programs, smolagents_agent_test1.py, smolagents_test.py, smolagents_tools.py, templates, tool_anti_hallucination.py, tool_file_contents.py, tool_file_dir.py, tool_judge_results.py, tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, tool_web_search.py, venv]'
58 response.message.content="Based on the provided list of files in the current directory, here's a summary and an educated guess about the major purposes of each file:\n\n1. **Makefile**: A build script for automating compilation and other tasks.\n\n2. **README.md**: A document providing information about the project, its purpose, usage, and installation instructions.\n\n3. **__pycache__** (hidden directory): Cache files generated by Python's compiler to speed up importing modules.\n\n4. **data**: Directory containing data used for testing or training models.\n\n5. **example_chain_read_summary.py**, **example_chain_web_summary.py**: Example scripts demonstrating how to summarize text from chain-related input, possibly related to natural language processing (NLP) or machine learning (ML).\n\n6. **example_judge.py**: An example script for evaluating the performance of a model or algorithm.\n\n7. **graph_kuzu_from_text.py**, **graph_kuzu_property_example.py**: Scripts that manipulate graphs generated from text data using the Kuzu graph library, possibly used in NLP or ML applications.\n\n8. **langgraph_agent_test.py**: A test file for a language graph agent, which is likely an AI model designed to process and understand languages.\n\n9. **ollama_tools_examples.py**: An example script showcasing how to use Ollama, a tool for generating text data.\n\n10. **requirements.txt**: A list of dependencies required to run the project, including libraries and tools.\n\n11. **short_programs**: Directory containing short programs or scripts that demonstrate specific tasks or algorithms.\n\n12. **smolagents_agent_test1.py**, **smolagents_test.py**, **smolagents_tools.py**: Test files for a small agents framework, possibly an AI model designed to make decisions in complex environments.\n\n13. **templates**: A directory containing templates used for generating text or code in certain contexts.\n\n14. **tool_anti_hallucination.py**, **tool_file_contents.py**, **tool_file_dir.py**, **tool_judge_results.py**, **tool_llm_eval.py**, **tool_sqlite.py**, **tool_summarize_text.py**, **tool_web_search.py**: Various tool scripts that provide functionality for tasks like:\n - Anti-hallucination (removing fake data from generated text)\n - Evaluating file contents\n - File directory manipulation\n - Judging results\n - LLM (Large Language Model) evaluation\n - SQLite database interactions\n - Text summarization\n - Web search functionality\n\n15. **venv**: A virtual environment script used to create and manage a separate Python environment for the project.\n\nThese are educated guesses based on common naming conventions and directory structures in software development projects, particularly those related to AI, NLP, and machine learning."
59 Observations: Summary of directory:Based on the provided list of files in the
60 current directory, here's a summary and an educated guess about the major
61 purposes of each file:
62
63 1. **Makefile**: A build script for automating compilation and other tasks.
64
65 2. **README.md**: A document providing information about the project, its
66 purpose, usage, and installation instructions.
67
68 3. **__pycache__** (hidden directory): Cache files generated by Python's
69 compiler to speed up importing modules.
70
71 4. **data**: Directory containing data used for testing or training models.
72
73 5. **example_chain_read_summary.py**, **example_chain_web_summary.py**: Example
74 scripts demonstrating how to summarize text from chain-related input, possibly
75 related to natural language processing (NLP) or machine learning (ML).
76
77 6. **example_judge.py**: An example script for evaluating the performance of a
78 model or algorithm.
79
80 7. **graph_kuzu_from_text.py**, **graph_kuzu_property_example.py**: Scripts that
81 manipulate graphs generated from text data using the Kuzu graph library,
82 possibly used in NLP or ML applications.
83
84 8. **langgraph_agent_test.py**: A test file for a language graph agent, which is
85 likely an AI model designed to process and understand languages.
86
87 9. **ollama_tools_examples.py**: An example script showcasing how to use Ollama,
88 a tool for generating text data.
89
90 10. **requirements.txt**: A list of dependencies required to run the project,
91 including libraries and tools.
92
93 11. **short_programs**: Directory containing short programs or scripts that
94 demonstrate specific tasks or algorithms.
95
96 12. **smolagents_agent_test1.py**, **smolagents_test.py**,
97 **smolagents_tools.py**: Test files for a small agents framework, possibly an AI
98 model designed to make decisions in complex environments.
99
100 13. **templates**: A directory containing templates used for generating text or
101 code in certain contexts.
102
103 14. **tool_anti_hallucination.py**, **tool_file_contents.py**,
104 **tool_file_dir.py**, **tool_judge_results.py**, **tool_llm_eval.py**,
105 **tool_sqlite.py**, **tool_summarize_text.py**, **tool_web_search.py**: Various
106 tool scripts that provide functionality for tasks like:
107 - Anti-hallucination (removing fake data from generated text)
108 - Evaluating file contents
109 - File directory manipulation
110 - Judging results
111 - LLM (Large Language Model) evaluation
112 - SQLite database interactions
113 - Text summarization
114 - Web search functionality
115
116 15. **venv**: A virtual environment script used to create and manage a separate
117 Python environment for the project.
118
119 These are educated guesses based on common naming conventions and directory
120 structures in software development projects, particularly those related to AI,
121 NLP, and machine learning.
122 [Step 2: Duration 13.79 seconds| Input tokens: 4,706 | Output tokens: 54]
123
124 ...
125
126 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
127 ╭──────────────────────────────────────────────────────────────────────────────╮
128 │ Calling tool: 'sa_summarize_directory' with arguments: {} │
129 ╰──────────────────────────────────────────────────────────────────────────────╯
130 lst='Contents of current directory: [Makefile, README.md, __pycache__, data, example_chain_read_summary.py, example_chain_web_summary.py, example_judge.py, graph_kuzu_from_text.py, graph_kuzu_property_example.py, langgraph_agent_test.py, ollama_tools_examples.py, requirements.txt, short_programs, smolagents_agent_test1.py, smolagents_test.py, smolagents_tools.py, templates, tool_anti_hallucination.py, tool_file_contents.py, tool_file_dir.py, tool_judge_results.py, tool_llm_eval.py, tool_sqlite.py, tool_summarize_text.py, tool_web_search.py, venv]'
131 response.message.content="Based on the names and locations of the files in the current directory, here's a summary of their contents and a educated guess about their purposes:\n\n1. **Makefile**: A build script for automating tasks such as compiling or running code. It likely contains instructions on how to build and install the project.\n\n2. **README.md**: A Markdown document that serves as an introduction or guide for users of the project. It may include information on how to use the tools, dependencies required, and contributing to the project.\n\n3. **__pycache__**: An empty directory that contains compiled Python files (`.cpyc` and `.pyo`) generated by the PyInstaller build process for a Python application.\n\n4. **data**: A directory containing data used for testing or training purposes. It might include CSV, JSON, or other formats of datasets.\n\n5. **example_chain_read_summary.py** and **example_chain_web_summary.py**: Example scripts demonstrating how to use tools related to text summarization, possibly for natural language processing (NLP) tasks.\n\n6. **example_judge.py**: An example script that likely demonstrates the usage of a judging tool or an evaluation framework for the project.\n\n7. **graph_kuzu_from_text.py** and **graph_kuzu_property_example.py**: Scripts related to graph-based tools, possibly using Kuzu, a library for graph algorithms. These scripts might illustrate how to work with graphs in Python.\n\n8. **langgraph_agent_test.py**: A script that tests the functionality of a language graph agent.\n\n9. **ollama_tools_examples.py**: An example script demonstrating the usage of OLLAMA tools ( likely Open-source Language Model-based Agent).\n\n10. **requirements.txt**: A text file listing the dependencies required to run the project, such as Python packages or other software libraries.\n\n11. **short_programs**: A directory containing short programs for demonstration purposes.\n\n12. **smolagents_agent_test1.py** and **smolagents_test.py**: Scripts related to testing SmoLA (Small Model-based Language Agent), a library that allows the use of small models in language agents.\n\n13. **smolagents_tools.py**: An example script demonstrating the usage of SmoLA tools.\n\n14. **templates**: A directory containing template files used for generating documentation or other text output.\n\n15. **tool_anti_hallucination.py**, **tool_file_contents.py**, **tool_file_dir.py**, **tool_judge_results.py**, and **tool_llm_eval.py**: Scripts related to various tools, possibly used for data analysis, model evaluation, or language understanding tasks.\n\n16. **tool_sqlite.py** and **tool_summarize_text.py**: Scripts that interface with SQLite databases or are used for text summarization.\n\n17. **tool_web_search.py** and **tool_web_search.py** ( likely a duplicate): These scripts might be related to web search-related tools, possibly interfacing with APIs or web scraping techniques.\n\n18. **venv**: A directory containing the virtual environment configuration files created by `python -m venv` command, allowing the project to isolate its dependencies from the system Python environment.\n\nNote: The exact purposes of some files may vary depending on the specific implementation and requirements of the project. This summary is based on common file name conventions and their typical associations with certain tasks or tools in the field of NLP and agent-based systems."
132 Observations: Summary of directory:Based on the names and locations of the files
133 in the current directory, here's a summary of their contents and a educated
134 guess about their purposes:
135
136 1. **Makefile**: A build script for automating tasks such as compiling or
137 running code. It likely contains instructions on how to build and install the
138 project.
139
140 2. **README.md**: A Markdown document that serves as an introduction or guide
141 for users of the project. It may include information on how to use the tools,
142 dependencies required, and contributing to the project.
143
144 3. **__pycache__**: An empty directory that contains compiled Python files
145 (`.cpyc` and `.pyo`) generated by the PyInstaller build process for a Python
146 application.
147
148 4. **data**: A directory containing data used for testing or training purposes.
149 It might include CSV, JSON, or other formats of datasets.
150
151 5. **example_chain_read_summary.py** and **example_chain_web_summary.py**:
152 Example scripts demonstrating how to use tools related to text summarization,
153 possibly for natural language processing (NLP) tasks.
154
155 6. **example_judge.py**: An example script that likely demonstrates the usage of
156 a judging tool or an evaluation framework for the project.
157
158 7. **graph_kuzu_from_text.py** and **graph_kuzu_property_example.py**: Scripts
159 related to graph-based tools, possibly using Kuzu, a library for graph
160 algorithms. These scripts might illustrate how to work with graphs in Python.
161
162 8. **langgraph_agent_test.py**: A script that tests the functionality of a
163 language graph agent.
164
165 9. **ollama_tools_examples.py**: An example script demonstrating the usage of
166 OLLAMA tools ( likely Open-source Language Model-based Agent).
167
168 10. **requirements.txt**: A text file listing the dependencies required to run
169 the project, such as Python packages or other software libraries.
170
171 11. **short_programs**: A directory containing short programs for demonstration
172 purposes.
173
174 12. **smolagents_agent_test1.py** and **smolagents_test.py**: Scripts related to
175 testing SmoLA (Small Model-based Language Agent), a library that allows the use
176 of small models in language agents.
177
178 13. **smolagents_tools.py**: An example script demonstrating the usage of SmoLA
179 tools.
180
181 14. **templates**: A directory containing template files used for generating
182 documentation or other text output.
183
184 15. **tool_anti_hallucination.py**, **tool_file_contents.py**,
185 **tool_file_dir.py**, **tool_judge_results.py**, and **tool_llm_eval.py**:
186 Scripts related to various tools, possibly used for data analysis, model
187 evaluation, or language understanding tasks.
188
189 16. **tool_sqlite.py** and **tool_summarize_text.py**: Scripts that interface
190 with SQLite databases or are used for text summarization.
191
192 17. **tool_web_search.py** and **tool_web_search.py** ( likely a duplicate):
193 These scripts might be related to web search-related tools, possibly interfacing
194 with APIs or web scraping techniques.
195
196 18. **venv**: A directory containing the virtual environment configuration files
197 created by `python -m venv` command, allowing the project to isolate its
198 dependencies from the system Python environment.
199
200 Note: The exact purposes of some files may vary depending on the specific
201 implementation and requirements of the project. This summary is based on common
202 file name conventions and their typical associations with certain tasks or tools
203 in the field of NLP and agent-based systems.
204 [Step 5: Duration 18.89 seconds| Input tokens: 10,726 | Output tokens: 102]
205 Reached max iterations.
206 Final answer: The current directory appears to be a repository for various
207 projects related to natural language processing (NLP) and agent-based systems.
208 The directory contains a mix of files and subdirectories that suggest it is used
209 for building, testing, and demonstrating NLP-related tools and applications.
210
211 Here are some key observations about the files in the current directory:
212
213 1. **Makefile**: A build script for automating tasks such as compiling or
214 running code.
215 2. **README.md**: A Markdown document serving as an introduction to the project.
216 3. **__pycache__**: An empty directory containing compiled Python files
217 generated by the PyInstaller build process.
218
219 4. **data**: A directory containing data used for testing or training purposes,
220 including CSV, JSON, and other formats of datasets.
221
222 5. **Example scripts**: Various example scripts demonstrating how to use tools
223 related to text summarization, judging, graph-based algorithms, language graph
224 agents, and more.
225
226 6. **Requirements file**: A text file listing the dependencies required to run
227 the project.
228
229 7. **Short programs**: A directory containing short programs for demonstration
230 purposes.
231
232 8. **Testing scripts**: Scripts related to testing various NLP-related tools and
233 libraries, including SmoLA and OLLAMA.
234
235 9. **Tool scripts**: Various scripts related to data analysis, model evaluation,
236 language understanding, web search, and other tasks.
237
238 10. **Virtual environment**: A directory containing the virtual environment
239 configuration files created by `python -m venv` command.
240
241 The current directory seems to be a collection of example projects, tools, and
242 scripts for working with NLP-related technologies. It is likely used as a
243 demonstration or testing environment for exploring new ideas and techniques.
244 [Step 6: Duration 0.00 seconds| Input tokens: 12,332 | Output tokens: 442]
245 The current directory appears to be a repository for various projects related to natural language processing (NLP) and agent-based systems. The directory contains a mix of files and subdirectories that suggest it is used for building, testing, and demonstrating NLP-related tools and applications.
246
247 Here are some key observations about the files in the current directory:
248
249 1. **Makefile**: A build script for automating tasks such as compiling or running code.
250 2. **README.md**: A Markdown document serving as an introduction to the project.
251 3. **__pycache__**: An empty directory containing compiled Python files generated by the PyInstaller build process.
252
253 4. **data**: A directory containing data used for testing or training purposes, including CSV, JSON, and other formats of datasets.
254
255 5. **Example scripts**: Various example scripts demonstrating how to use tools related to text summarization, judging, graph-based algorithms, language graph agents, and more.
256
257 6. **Requirements file**: A text file listing the dependencies required to run the project.
258
259 7. **Short programs**: A directory containing short programs for demonstration purposes.
260
261 8. **Testing scripts**: Scripts related to testing various NLP-related tools and libraries, including SmoLA and OLLAMA.
262
263 9. **Tool scripts**: Various scripts related to data analysis, model evaluation, language understanding, web search, and other tasks.
264
265 10. **Virtual environment**: A directory containing the virtual environment configuration files created by `python -m venv` command.
266
267 The current directory seems to be a collection of example projects, tools, and scripts for working with NLP-related technologies. It is likely used as a demonstration or testing environment for exploring new ideas and techniques.
This output shows the agent performing a directory analysis using multiple tool calls, primarily utilizing sa_list_directory and sa_summarize_directory to examine the contents of the current working directory. The analysis revealed a Python-based project focused on natural language processing (NLP) and agent-based systems, containing various components including example scripts, testing files, and utility tools. The agent executed multiple iterations to gather and process information about the directory structure, with each step taking between 1.58 and 18.89 seconds to complete.
The final analysis identified key project components including a Makefile for build automation, example scripts demonstrating text summarization and graph-based algorithms, testing scripts for the smolagents framework and Ollama tools, and various utility scripts for tasks like anti-hallucination, database interactions, and web searching. The directory structure suggests this is a development and testing environment for NLP-related technologies, complete with its own virtual environment and dependency management through requirements.txt. The agent’s analysis provided detailed insights into the purpose and organization of the codebase while maintaining a focus on its NLP and agent-based systems orientation.
Output from Third Example: “Read the text in the file ‘data/economics.txt’ file and then summarize this text.”
1 python smolagents_agent_test1.py
2 ╭────────────────────────────────── New run ───────────────────────────────────╮
3 │ │
4 │ Read the text in the file 'data/economics.txt' file and then summarize this │
5 │ text. │
6 │ │
7 ╰─ LiteLLMModel - ollama_chat/llama3.2:latest ─────────────────────────────────╯
8 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
9 ╭──────────────────────────────────────────────────────────────────────────────╮
10 │ Calling tool: 'sa_read_file_contents' with arguments: {'file_path': │
11 │ 'data/economics.txt'} │
12 ╰──────────────────────────────────────────────────────────────────────────────╯
13 Observations: Contents of file 'data/economics.txt' is:
14 The Austrian School (also known as the Vienna School or the Psychological School
15 ) is a Schools of economic thought|school of economic thought that emphasizes
16 the spontaneous organizing power of the price mechanism. Austrians hold that the
17 complexity of subjective human choices makes mathematical modelling of the
18 evolving market extremely difficult (or Undecidable and advocate a "laissez
19 faire" approach to the economy. Austrian School economists advocate the strict
20 enforcement of voluntary contractual agreements between economic agents, and
21 hold that commercial transactions should be subject to the smallest possible
22 imposition of forces they consider to be (in particular the smallest possible
23 amount of government intervention). The Austrian School derives its name from
24 its predominantly Austrian founders and early supporters, including Carl Menger,
25 Eugen von Böhm-Bawerk and Ludwig von Mises.
26
27 Economics is the social science that analyzes the production, distribution, and
28 consumption of goods and services. Political economy was the earlier name for
29 the subject, but economists in the late 19th century suggested "economics" as a
30 shorter term for "economic science" that also avoided a narrow
31 political-interest connotation and as similar in form to "mathematics",
32 "ethics", and so forth.[2]
33
34 A focus of the subject is how economic agents behave or interact and how
35 economies work. Consistent with this, a primary textbook distinction is between
36 microeconomics and macroeconomics. Microeconomics examines the behavior of basic
37 elements in the economy, including individual agents (such as households and
38 firms or as buyers and sellers) and markets, and their interactions.
39 Macroeconomics analyzes the entire economy and issues affecting it, including
40 unemployment, inflation, economic growth, and monetary and fiscal policy.
41
42 The professionalization of economics, reflected in the growth of
43 graduate programs on the subject, has been described as "the main change in
44 economics since around 1900".[93] Most major universities and many colleges have
45 a major, school, or department in which academic degrees are awarded in the
46 subject, whether in the liberal arts, business, or for professional study; see
47 Master of Economics.
48
49
50 Economics is the social science that studies the behavior of
51 individuals, households, and organizations (called economic actors, players, or
52 agents), when they manage or use scarce resources, which have alternative uses,
53 to achieve desired ends. Agents are assumed to act rationally, have multiple
54 desirable ends in sight, limited resources to obtain these ends, a set of stable
55 preferences, a definite overall guiding objective, and the capability of making
56 a choice. There exists an economic problem, subject to study by economic
57 science, when a decision (choice) is made by one or more resource-controlling
58 players to attain the best possible outcome under bounded rational conditions.
59 In other words, resource-controlling agents maximize value subject to the
60 constraints imposed by the information the agents have, their cognitive
61 limitations, and the finite amount of time they have to make and execute a
62 decision. Economic science centers on the activities of the economic agents that
63 comprise society.[1] They are the focus of economic analysis.[2]
64
65 The traditional concern of economic analysis is to gain an
66 understanding of the processes that govern the production, distribution and
67 consumption of goods and services in an exchange economy.[3] An approach to
68 understanding these processes, through the study of agent behavior under
69 scarcity, may go as s:
70
71 An interesting Economist is Pauli Blendergast who teaches at the University of
72 Krampton Ohio and is famouse for saying economics is bullshit.
73 [Step 0: Duration 3.51 seconds| Input tokens: 1,338 | Output tokens: 39]
74 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
75 ╭──────────────────────────────────────────────────────────────────────────────╮
76 │ Calling tool: 'sa_read_file_contents' with arguments: {'file_path': │
77 │ 'data/economics.txt'} │
78 ╰──────────────────────────────────────────────────────────────────────────────╯
79 Observations: Contents of file 'data/economics.txt' is:
80 The Austrian School (also known as the Vienna School or the Psychological School
81 ) is a Schools of economic thought|school of economic thought that emphasizes
82 the spontaneous organizing power of the price mechanism. Austrians hold that the
83 complexity of subjective human choices makes mathematical modelling of the
84 evolving market extremely difficult (or Undecidable and advocate a "laissez
85 faire" approach to the economy. Austrian School economists advocate the strict
86 enforcement of voluntary contractual agreements between economic agents, and
87 hold that commercial transactions should be subject to the smallest possible
88 imposition of forces they consider to be (in particular the smallest possible
89 amount of government intervention). The Austrian School derives its name from
90 its predominantly Austrian founders and early supporters, including Carl Menger,
91 Eugen von Böhm-Bawerk and Ludwig von Mises.
92
93 Economics is the social science that analyzes the production, distribution, and
94 consumption of goods and services. Political economy was the earlier name for
95 the subject, but economists in the late 19th century suggested "economics" as a
96 shorter term for "economic science" that also avoided a narrow
97 political-interest connotation and as similar in form to "mathematics",
98 "ethics", and so forth.[2]
99
100 A focus of the subject is how economic agents behave or interact and how
101 economies work. Consistent with this, a primary textbook distinction is between
102 microeconomics and macroeconomics. Microeconomics examines the behavior of basic
103 elements in the economy, including individual agents (such as households and
104 firms or as buyers and sellers) and markets, and their interactions.
105 Macroeconomics analyzes the entire economy and issues affecting it, including
106 unemployment, inflation, economic growth, and monetary and fiscal policy.
107
108 The professionalization of economics, reflected in the growth of
109 graduate programs on the subject, has been described as "the main change in
110 economics since around 1900".[93] Most major universities and many colleges have
111 a major, school, or department in which academic degrees are awarded in the
112 subject, whether in the liberal arts, business, or for professional study; see
113 Master of Economics.
114
115
116 Economics is the social science that studies the behavior of
117 individuals, households, and organizations (called economic actors, players, or
118 agents), when they manage or use scarce resources, which have alternative uses,
119 to achieve desired ends. Agents are assumed to act rationally, have multiple
120 desirable ends in sight, limited resources to obtain these ends, a set of stable
121 preferences, a definite overall guiding objective, and the capability of making
122 a choice. There exists an economic problem, subject to study by economic
123 science, when a decision (choice) is made by one or more resource-controlling
124 players to attain the best possible outcome under bounded rational conditions.
125 In other words, resource-controlling agents maximize value subject to the
126 constraints imposed by the information the agents have, their cognitive
127 limitations, and the finite amount of time they have to make and execute a
128 decision. Economic science centers on the activities of the economic agents that
129 comprise society.[1] They are the focus of economic analysis.[2]
130
131 The traditional concern of economic analysis is to gain an
132 understanding of the processes that govern the production, distribution and
133 consumption of goods and services in an exchange economy.[3] An approach to
134 understanding these processes, through the study of agent behavior under
135 scarcity, may go as s:
136
137 An interesting Economist is Pauli Blendergast who teaches at the University of
138 Krampton Ohio and is famouse for saying economics is bullshit.
139 [Step 1: Duration 4.33 seconds| Input tokens: 3,386 | Output tokens: 59]
140
141 ...
142
143 [Step 6: Duration 0.00 seconds| Input tokens: 13,277 | Output tokens: 307]
144 Here's a summary of the text:
145
146 The Austrian School of economic thought emphasizes the power of the price mechanism and advocates for minimal government intervention. It focuses on individual agents making rational choices under scarcity to achieve desired ends.
147
148 Economics is the social science that analyzes production, distribution, and consumption of goods and services. It studies how economic agents interact and behave in markets. The subject has become more professionalized over time, with many universities offering degrees in economics.
149
150 The traditional concern of economic analysis is to understand the processes governing the production, distribution, and consumption of goods and services in an exchange economy. This involves studying agent behavior under scarcity and maximizing value subject to constraints.
151
152 Interestingly, some economists may view economics as a complex and sometimes useless field, like Pauli Blendergast, who famously declared that "economics is bullshit."
This output shows the agent using the sa_read_file_contents tool to load the file ‘data/economics.txt’ and then working toward a summary of that text. In this run the agent re-reads the same file over several steps before reaching the maximum number of iterations, at which point it emits its final answer. Each tool-calling step takes a few seconds (3.51 and 4.33 seconds for the first two steps shown), and the input token counts grow on every step because the file contents are added back into the model’s context each time the file is read.
The final summary captures the main points of the source text: the Austrian School’s emphasis on the price mechanism and minimal government intervention, the definition of economics as the social science of production, distribution, and consumption of goods and services, the professionalization of the field, and the traditional concern of economic analysis with agent behavior under scarcity. The summary even repeats the oddly out-of-place closing sentence about the economist Pauli Blendergast, showing that the model faithfully summarized what was actually in the file.
Agents Wrap Up
There are several options for LLM agent frameworks. I especially like smolagents because it works fairly well with smaller models run with Ollama. I have experimented with other agent frameworks that work well with Claude, GPT-4o, etc., but fail more frequently when used with smaller LLMs.
Using AG2 Open-Source AgentOS LLM-Based Agent Framework for Generating and Executing Python Code
AG2 Open-Source AgentOS is a fork of Microsoft’s Autogen agent framework created by many of the original Autogen developers. While I am a fan of Autogen, I find AG2 simpler to work with when using local models on my laptop, so we will use the forked library here.
The AG2 agent framework is an excellent tool for creating multi-agent applications, though it was originally designed to work primarily with OpenAI’s models. In this chapter, we demonstrate how to use its capabilities with a local Ollama model (qwen2.5:14b). We use a modified version of an AG2 example that generates and executes Python and Matplotlib code, showcasing how local deployment can be achieved without sacrificing the robust tool calling features the framework offers.
This approach provides several benefits. By leveraging a local model via Ollama, developers reduce dependency on external APIs, enhance privacy and security, and potentially lower operational costs while retaining the power of AG2’s code generation and execution. The example illustrates how an assistant agent generates the necessary code to perform tasks, and a user proxy agent executes that code to produce dynamic visualizations, such as plotting stock price changes. This decoupling of code generation from execution not only enhances reliability but also allows for greater customization of the execution environment.
Example Implementation
I experimented with several local models using Ollama with mediocre results, but the larger qwen2.5:14b model works very well. If you are running on a Mac, you will need an Apple Silicon chip with 16 GB of memory to run this model.
1 from autogen import AssistantAgent, UserProxyAgent
2
3 # Requirements:
4 # pip install ag2 ollama fix_busted_json yfinance matplotlib
5
6 config_list = [
7 {
8 "model": "qwen2.5:14b", # Choose a model that supports tool calling
9 "api_type": "ollama", # Specify Ollama as the API type
10 "client_host": "http://localhost:11434", # local Ollama server
11 "api_key": "fakekey",
12 "native_tool_calls": True # Enable native tool calling
13 }
14 ]
15
16 # Create the AssistantAgent using the local model config
17 assistant = AssistantAgent("assistant",
18 llm_config={"config_list": config_list})
19
20 # Create the UserProxyAgent; adjust code_execution_config as needed.
21 user_proxy = UserProxyAgent(
22 "user_proxy",
23 code_execution_config={"work_dir": "coding", "use_docker": False}
24 )
25
26 # Initiate an automated chat between the agents.
27 user_proxy.initiate_chat(
28 assistant,
29 message="Plot a chart of NVDA and TESLA stock price change YTD."
30 )
This code sets up a multi-agent workflow using AG2 by configuring an assistant agent and a user proxy agent. First, it defines a configuration for a local Ollama-based model (here, “qwen2.5:14b”) that supports native tool calling. The configuration specifies details such as the API type (“ollama”), the local server URL (“http://localhost:11434”), a placeholder API key, and an option to enable native tool calls. This configuration is then passed to instantiate the AssistantAgent, which uses it to generate responses based on the local LLM.
This example uses a UserProxyAgent configured for Python code execution with a designated working directory (“coding”) and Docker disabled. Finally, the user proxy agent initiates an automated chat with the assistant by sending a message requesting the plotting of a chart for NVDA and TESLA stock price changes year-to-date. This setup demonstrates a simple, automated multi-agent interaction where the assistant generates responses (potentially including code) and the user proxy executes those responses to fulfill the requested task. After running this example you will either want to remove the generated code directory coding or add coding to your .gitignore file.
Example Output

Here is a partial listing of the output:
1 $ python autogen_python_example.py
2
3 user_proxy (to assistant):
4
5 Plot a chart of NVDA and TESLA stock price change YTD.
6
7 -------------------------------------------------------------
8 assistant (to user_proxy):
9
10 To plot a chart of NVDA (NVIDIA) and Tesla's stock price changes year-to-date, we will need to fetch their historical data from an API or a financial data source such as Yahoo Finance.
11
12 Let's use Python with the `yfinance` library to get the data and then plot it using `matplotlib`.
13
14 Here is a step-by-step plan:
15
16 1. **Install Required Libraries**: If not already installed, install `yfinance` for fetching stock data and `pandas` & `matplotlib` for processing and plotting the data.
17 2. **Fetch Stock Data**: Use `yfinance` to get the year-to-date historical data for NVDA and TSLA stocks.
18 3. **Plot Data**: Plot the closing prices of both stocks on a chart.
19
20 Let's start by fetching and plotting the stock price data with Python code:
21
22 ``python
23 # filename: plot_stock_prices.py
24
25 import yfinance as yf
26 import matplotlib.pyplot as plt
27 import pandas as pd
28
29 # Fetch year-to-date historical data for NVDA and TSLA
30 stocks = ['NVDA', 'TSLA']
31 data = yf.download(stocks, start='2023-01-01')['Close']
32
33 # Plot the closing prices of both stocks
34 plt.figure(figsize=(14,7))
35 for stock in stocks:
36 plt.plot(data.index, data[stock], label=stock)
37
38 plt.title('Year-to-date Stock Prices for NVDA and TSLA')
39 plt.xlabel('Date')
40 plt.ylabel('Stock Price (USD)')
41 plt.legend()
42 plt.grid(True)
43 plt.show()
44 ``
45
46 You can execute the above Python script to generate the plot. Please make sure you have `yfinance`, `matplotlib`, and `pandas` installed in your environment. If not, install them by running:
47
48 ``sh
49 pip install yfinance matplotlib pandas
50 ``
51
52 After executing the code, you should see a chart showing the year-to-date closing prices for both NVDA and TSLA stocks.
53
54 Once you have executed this script, please inform me of any issues or if everything is plotted correctly.
55
56 -------------------------------------------------------------
57 Replying as user_proxy. Provide feedback to assistant. Press enter to skip and use auto-reply, or type 'exit' to end the conversation:
58
59 >>>>>>>> NO HUMAN INPUT RECEIVED.
60 >>>>>>>> USING AUTO REPLY...
61 >>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
Wrap Up for Using AG2’s Agent Framework for Generating and Executing Python Code
Here we experimented with using Ollama and a local model to drive the powerful AG2 agent framework.
The AG2 framework for Python tool calling agents streamlines the development of multi-agent applications by enabling seamless interaction between an assistant agent and a user proxy agent. In the example, we configured a local LLM (using a model like “qwen2.5:14b” via Ollama) and instantiated an AssistantAgent to generate responses and code, while a UserProxyAgent was set up to execute the generated code. This automated conversation allowed the agents to collaborate on a task—plotting a chart for NVDA and TESLA stock price changes year-to-date—demonstrating how AG2 bridges code generation and execution in an autonomous workflow.
This example highlights AG2’s ability to leverage native tool calling within a Python environment, reducing the need for manual intervention during code execution and debugging. By decoupling the generation of task-specific code from its execution, developers can build systems that are both flexible and scalable. The assistant agent focuses on planning and generating code, while the user proxy agent reliably executes that code, creating an effective feedback loop for refining results. This pattern not only simplifies complex workflows but also provides a foundation for robust, error-resilient applications.
To further explore the potential of AG2, you, dear reader, might experiment with integrating additional tools, such as web scraping modules, database connectors, or advanced visualization libraries, to expand the capabilities of your agents. Another interesting avenue is to adjust the configuration parameters: try different LLM models, enable Docker-based code execution, or incorporate human-in-the-loop feedback to refine responses. Additionally, extending the workflow to include more specialized agents (e.g., a dedicated CodeExecutorAgent or a DebuggingAgent) can provide insights into multi-agent orchestration and the scalability of autonomous systems. These experiments will help you understand the full versatility of AG2 in creating dynamic, multi-step applications.
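As a starting point for these experiments, here is a minimal sketch, assuming the standard AG2/Autogen options, of how the UserProxyAgent from the earlier listing might be reconfigured for Docker-based code execution and human-in-the-loop feedback. The human_input_mode and use_docker settings are assumptions about your installed AG2 version, so check the AG2 documentation before relying on them.

 1 from autogen import AssistantAgent, UserProxyAgent
 2
 3 # Same local Ollama configuration as the earlier example.
 4 config_list = [
 5     {
 6         "model": "qwen2.5:14b",
 7         "api_type": "ollama",
 8         "client_host": "http://localhost:11434",
 9         "api_key": "fakekey",
10         "native_tool_calls": True
11     }
12 ]
13
14 assistant = AssistantAgent("assistant",
15                            llm_config={"config_list": config_list})
16
17 # Assumed variation: execute generated code inside Docker and pause for
18 # human feedback on every turn instead of auto-replying.
19 user_proxy = UserProxyAgent(
20     "user_proxy",
21     human_input_mode="ALWAYS",  # prompt a human instead of auto-replying
22     code_execution_config={"work_dir": "coding", "use_docker": True}
23 )
24
25 user_proxy.initiate_chat(
26     assistant,
27     message="Plot a chart of NVDA and TSLA stock price change YTD."
28 )

With human_input_mode set to "ALWAYS" the conversation pauses for your feedback after each assistant turn, and with use_docker enabled the generated Python code runs inside a container rather than directly in your local environment.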
Building Agents with Ollama and the Agno Agent Library
The synergy between locally hosted large language models (LLMs) and modern agentic frameworks provides a powerful paradigm for creating sophisticated AI assistants. This chapter explores how to leverage Ollama in conjunction with the Agno library, a lightweight framework for building autonomous agents. By equipping a local LLM with a clear persona, detailed instructions, and a set of external tools, we can construct agents capable of complex, multi-step reasoning and task execution, all while ensuring data privacy and control. The two examples developed here demonstrate this approach: first a dedicated research agent, and then an agent that answers questions using the contents of a web page.
A Personal Research Agent
This Python script demonstrates the creation of a sophisticated, autonomous research agent by leveraging the agno library. The agent is designed to emulate an elite investigative journalist from the New York Times, tasked with conducting in-depth research on any given topic. It integrates multiple tools to achieve this: DuckDuckGoTools for performing comprehensive web searches across numerous sources and Newspaper4kTools for extracting and parsing content from articles. The core of the agent is an Ollama language model, which acts as the central reasoning engine. The agent’s behavior is meticulously defined through a detailed prompt engineering strategy, including a rich description to establish its expert persona, a multi-phase set of instructions that guide it through research, analysis, writing, and quality control, and a structured expected_output template to ensure the final report is formatted as a professional, NYT-style article. This example showcases how modern AI frameworks can orchestrate language models and external tools to automate complex, multi-step knowledge work.
1 ## Note: This example was taken from the Agno documentation and converted to run
2 ## with Ollama and with other modifications.
3 ## URI for original source code:
4 ## https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/research_agent.py
5 ## that is licensed under the MPL-2.0 license.
6
7 """🔍 Research Agent - Your AI Investigative Journalist!
8
9 This example shows how to create a sophisticated research agent that combines
10 web search capabilities with professional journalistic writing skills. The agent performs
11 comprehensive research using multiple sources, fact-checks information, and delivers
12 well-structured, NYT-style articles on any topic.
13
14 Key capabilities:
15 - Advanced web search across multiple sources
16 - Content extraction and analysis
17 - Cross-reference verification
18 - Professional journalistic writing
19 - Balanced and objective reporting
20
21 Example prompts to try:
22 - "Analyze the impact of AI on healthcare delivery and patient outcomes"
23 - "Report on the latest breakthroughs in quantum computing"
24 - "Investigate the global transition to renewable energy sources"
25 - "Explore the evolution of cybersecurity threats and defenses"
26 - "Research the development of autonomous vehicle technology"
27
28 Dependencies: `pip install ollama duckduckgo-search newspaper4k lxml_html_clean agno`
29 """
30
31 from textwrap import dedent
32
33 from agno.agent import Agent
34 from agno.models.ollama import Ollama
35 from agno.tools.duckduckgo import DuckDuckGoTools
36 from agno.tools.newspaper4k import Newspaper4kTools
37
38 # Initialize the research agent with advanced journalistic capabilities
39 research_agent = Agent(
40 model=Ollama(id="qwen3:0.6b"),
41 tools=[DuckDuckGoTools(), Newspaper4kTools()],
42 description=dedent("""\
43 You are an elite investigative journalist with decades of experience at the New York Times.
44 Your expertise encompasses: 📰
45
46 - Deep investigative research and analysis
47 - Meticulous fact-checking and source verification
48 - Compelling narrative construction
49 - Data-driven reporting and visualization
50 - Expert interview synthesis
51 - Trend analysis and future predictions
52 - Complex topic simplification
53 - Ethical journalism practices
54 - Balanced perspective presentation
55 - Global context integration\
56 """),
57 instructions=dedent("""\
58 1. Research Phase 🔍
59 - Search for 10+ authoritative sources on the topic
60 - Prioritize recent publications and expert opinions
61 - Identify key stakeholders and perspectives
62
63 2. Analysis Phase 📊
64 - Extract and verify critical information
65 - Cross-reference facts across multiple sources
66 - Identify emerging patterns and trends
67 - Evaluate conflicting viewpoints
68
69 3. Writing Phase ✍️
70 - Craft an attention-grabbing headline
71 - Structure content in NYT style
72 - Include relevant quotes and statistics
73 - Maintain objectivity and balance
74 - Explain complex concepts clearly
75
76 4. Quality Control ✓
77 - Verify all facts and attributions
78 - Ensure narrative flow and readability
79 - Add context where necessary
80 - Include future implications
81 """),
82 expected_output=dedent("""\
83 # {Compelling Headline} 📰
84
85 ## Executive Summary
86 {Concise overview of key findings and significance}
87
88 ## Background & Context
89 {Historical context and importance}
90 {Current landscape overview}
91
92 ## Key Findings
93 {Main discoveries and analysis}
94 {Expert insights and quotes}
95 {Statistical evidence}
96
97 ## Impact Analysis
98 {Current implications}
99 {Stakeholder perspectives}
100 {Industry/societal effects}
101
102 ## Future Outlook
103 {Emerging trends}
104 {Expert predictions}
105 {Potential challenges and opportunities}
106
107 ## Expert Insights
108 {Notable quotes and analysis from industry leaders}
109 {Contrasting viewpoints}
110
111 ## Sources & Methodology
112 {List of primary sources with key contributions}
113 {Research methodology overview}
114
115 ---
116 Research conducted by AI Investigative Journalist
117 New York Times Style Report
118 Published: {current_date}
119 Last Updated: {current_time}\
120 """),
121 markdown=True,
122 show_tool_calls=True,
123 add_datetime_to_instructions=True,
124 )
125
126 # Example usage with detailed research request
127 if __name__ == "__main__":
128 research_agent.print_response(
129 "Analyze the current state and future implications of artificial intelligence regulation worldwide",
130 stream=True,
131 )
This code, derived from the Agno documentation examples, effectively encapsulates the modern paradigm of agent-based AI systems by composing a large language model with specialized tools. At its core, the script instantiates an Agent from the agno library, configuring it with three key components: the Ollama model (qwen3:0.6b) as its cognitive engine, a list of tools (DuckDuckGoTools, Newspaper4kTools) that grant it the ability to perceive and act upon the digital world, and a detailed prompt configuration. The true power of this implementation lies in that prompt engineering. The description, instructions, and expected_output arguments are not mere suggestions; they form a comprehensive “system prompt” that imbues the agent with a specific persona, a detailed standard operating procedure, and a strict output schema. This structured approach transforms a general-purpose language model into a specialized worker capable of a specific, high-level task.
This example illustrates a significant evolution from simple conversational AI to goal-oriented, autonomous agents. By explicitly defining a workflow, from initial broad research through analysis, synthesis, and structured writing, the code mimics the methodology of a human expert. It highlights how libraries like agno simplify the orchestration of these complex interactions, allowing developers to focus on defining the agent’s strategy rather than the low-level API calls. Furthermore, the use of open-source components like Ollama demonstrates the increasing accessibility of building powerful, customized AI agents that can automate sophisticated information-processing tasks, effectively serving as AI-powered investigative journalists on demand.
Here is sample output (edited for brevity, removing Unicode characters for bounding boxes in the output):
1 Message
2
3 Analyze the current state and future implications of artificial intelligence regulation worldwide
4
5 Response
6
7 <think> Okay, the user wants an analysis of current AI regulation and future implications. Let me start by understanding
8 the key points they need. First, the current state should cover where AI regulations are, maybe countries like the EU, US,
9 or others. Then, future implications—like tech industries or governments' approaches. I need to make sure to mention
10 expert opinions and sources.
11
12 Wait, the user mentioned using specific tools. Let me check the available functions. There's duckduckgo_search,
13 duckduckgo_news, and read_article. Since the user didn't specify an article, maybe I should search for AI regulation and
14 then use the tools. But how? The tools are functions to search, not articles. Hmm. Maybe the user expects me to use the
15 search functions first to gather sources, then present findings. But the instructions say to use the provided tools. Wait,
16 the problem says to use the functions if needed. So perhaps I should first search for sources on AI regulations, then use
17 the articles if found. But the user's query is just an analysis, not requiring articles. Maybe the answer should just be
18 the research phase followed by the analysis. Let me structure the response with the research phase and then the analysis,
19 ensuring all key points are covered and sources cited properly. </think>
20
21 AI Regulation Trends and Future Outlook
22
23 Executive Summary
24
25 Global AI regulation is evolving, with the EU's Digital Strategy, the US's AI Act, and China's AI Act setting precedents.
26 Future implications include global competition, ethical frameworks, and AI governance policies.
27
28
29 Background & Context
30
31 AI regulation has become a focal point of international discourse. The EU's AI Strategy (2022) and the US's AI Act (2023)
32 represent key regulatory frameworks. Meanwhile, China's AI Act (2021) emphasizes transparency, while the Asia-Pacific
33 Economic Cooperation (APEC) has initiated AI governance dialogues. This context highlights both cooperation and
34 conflicting regulatory priorities.
35
36
37 Key Findings
38
39 Current State
40
41 • EU Regulation: AI ethics frameworks (e.g., AI Act 2023) prioritize transparency and human oversight.
42 • US Regulation: The AI Act mandates ethical AI development, with industry self-regulation.
43 • China Regulation: AI governance is guided by national security and innovation policies.
44
45 Expert Insights
46
47 • Dr. Lisa Chen (AI Ethics Lab, MIT): "Regulation must balance innovation with human rights."
48 • Dr. Raj Patel (Global AI Governance Forum): "AI governance will shape digital civilization."
49
50
51 Impact Analysis
52
53 Current Implications
54
55 • Tech Industries: Increased compliance risks for AI developers.
56 • Governance: Regulatory shifts may favor startups or major tech firms.
57
58
59 Future Outlook
60
61 Emerging Trends
62
63 • Global Collaboration: APEC's AI Governance Forum could foster cross-border standards.
64 • Ethical Frameworks: AI regulations may prioritize inclusive AI policies.
65
66
67 Expert Insights
68
69 • Dr. Chen: "AI regulation is a mirror of societal values."
70 • Dr. Patel: "The future hinges on how we govern AI."
71
72
73 Sources & Methodology
74
75 • Sources:
76 • EU Digital Strategy (2022)
77 • US AI Act (2023)
78 • APEC AI Governance Forum Reports
79 • Methodology: Cross-referenced with industry reports and expert interviews.
80
81 ────────────────────────────────────────────────────────────────────────────
82 Research Conducted by AI Investigative Journalist
83 Published: 2025-08-11
84 Last Updated: 2025-08-11 10:02:33.699708
A Personal Agent for Answering Questions Using the Contents of a Web Page
We will now look at another example: given a web site URI, accurately answer a user’s query using the content of that web site.
This program demonstrates a powerful application of agent-based AI by creating a specialized web scraping and question-answering agent using the Agno library. The core of this example is a custom tool, scrape_website_content, which is built using the requests and BeautifulSoup libraries to fetch and parse the textual content from any given URL. This tool is then integrated into an Agent powered by a local Ollama large language model. The agent is carefully configured with a detailed description, step-by-step instructions, and a defined output format to ensure it follows a strict workflow: first, it must use the scraping tool to get information from a specific webpage, and second, it must answer the user’s question based only on the content it has just retrieved. This example showcases how to extend an AI agent’s capabilities with custom functions, enabling it to interact with the live web and perform targeted information retrieval tasks.
1 ## Note: This example is partially derived from the Agno documentation
2 ## and converted to run with Ollama, with several other modifications,
3 ## including the addition of a custom web scraping tool.
4 ## URI for original source code:
5 ## https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/research_agent.py
6 ## that is licensed under the MPL-2.0 license.
7
8
9 """🔍 Web Scraper and Question Answering Agent """
10
11 from textwrap import dedent
12 import os
13
14 from agno.agent import Agent
15 from agno.models.ollama import Ollama
16 import requests
17 from bs4 import BeautifulSoup
18 from agno.tools import tool
19
20 @tool
21 def scrape_website_content(url: str) -> str:
22 """
23 Fetches and extracts the clean, textual content from a given webpage URL.
24 Use this tool when you need to read the contents of a specific web page to answer a question.
25
26 Args:
27 url (str): The full, valid URL of the webpage to be scraped (e.g., 'https://example.com').
28
29 Returns:
30 str: The extracted text content of the webpage, or a descriptive error message if scraping fails.
31 """
32 try:
33 # It is a best practice to set a User-Agent header to mimic a real browser.
34 headers = {
35 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
36 }
37 response = requests.get(url, headers=headers, timeout=10)
38 response.raise_for_status()
39 soup = BeautifulSoup(response.text, 'html.parser')
40 for script_or_style in soup(["script", "style", "nav", "footer", "aside"]):
41 script_or_style.decompose()
42 text = soup.get_text(separator='\n', strip=True)
43 ##print(text) # Debugging: print the extracted text to console
44 if not text:
45 return f"Successfully connected to {url}, but no text content could be extracted."
46 return f"Successfully scraped content from {url}:\n\n{text}"
47
48 except requests.RequestException as e:
49 return f"Error scraping URL {url}: {e}"
50
51 # This custom tool can then be passed directly to an agent:
52 # from agno.agent import Agent
53 # custom_agent = Agent(
54 # tools=[scrape_website_content],
55 # ... # other agent configurations
56 # )
57
58 ##### Initialize the web scraping and analysis agent:
59 scraper_agent = Agent(
60 model=Ollama(id="qwen3:30b"),
61 tools = [scrape_website_content],
62 description=dedent("""
63 You are an expert web scraping and analysis agent. You follow a strict process:
64
65 - Given a URL in a prompt, you will first use the appropriate tool to scrape its content.
66 - You will then carefully read the scraped content to understand it thoroughly.
67 - Finally, you will answer the user's question based *only* on the information contained within that specific URL's content.
68 """),
69
70 # The instructions are refined to provide a clear, step-by-step reasoning process for the LLM.
71 instructions=dedent("""
72 1. Scrape Phase 🕸️
73 - Analyze the user's prompt to identify the target URL.
74 - Invoke the `scrape` tool with the identified URL as the argument. Do not use the `search` or `crawl` tool unless explicitly instructed to do so.
75
76 2. Analysis Phase 📊
77 - Carefully read the entire markdown content returned by the `scrape` tool.
78 - Systematically extract the specific pieces of information required to answer the user's question.
79
80 3. Answering Phase ✍️
81 - Formulate a concise and accurate answer to the user's question.
82 - Your answer must be based exclusively on the information you extracted from the scraped webpage.
83 - If the information is not present on the page, you must state that clearly rather than guessing.
84
85 4. Quality Control ✓
86 - Reread the original user query and your generated answer to ensure it is accurate, relevant, and directly addresses the user's request based on the provided source.
87 """),
88
89 expected_output=dedent("""
90 # {Answer based on website content}
91
92 **Source:** {URL provided by the user}
93 """),
94
95 markdown=True,
96 show_tool_calls=True,
97 add_datetime_to_instructions=True,
98 )
99
100 if __name__ == "__main__":
101 prompt = "Using the web site https://markwatson.com Consultant Mark Watson has written Common Lisp, semantic web, Clojure, Java, and AI books. What musical instruments does he play?"
102
103 scraper_agent.print_response(
104 prompt,
105 stream=True,
106 )
The first part of the code defines the scrape_website_content function, which serves as the agent’s custom tool. The @tool decorator from the Agno library is crucial, as it registers the function so the agent can recognize and use it. Inside the function, the requests library sends an HTTP GET request to the specified URL, complete with a User-Agent header to mimic a standard web browser. Upon receiving a successful response, BeautifulSoup is used to parse the HTML. To provide the language model with clean, relevant data, common non-content tags such as script, style, nav, footer, and aside are removed using the decompose() method before the final text is extracted. Robust error handling is included to manage potential network issues or problems during the scraping process, ensuring the agent receives a clear message if the content cannot be fetched.
The second part of the script initializes and configures the Agent. The scraper_agent is instantiated with several key arguments that define its behavior. It is connected to a local LLM via Ollama, and importantly, the tools parameter is passed a list containing our custom scrape_website_content function. The description, instructions, and expected_output template provide the agent with a persona, a precise operational procedure, and a structured format for its final output. These text-based guides are critical for constraining the model’s behavior, compelling it to use the provided tool and base its reasoning exclusively on the scraped data, thereby preventing it from hallucinating answers or using its general knowledge. The if __name__ == "__main__": block shows how to execute the agent with a specific prompt, triggering the entire scrape-and-answer process.
Here is some sample output:
1 $ uv run research_agno_web_question.py
2 Message
3
4 Using the web site https://markwatson.com Consultant Mark Watson has written Common Lisp, semantic web, Clojure, Java, and
5 AI books. What musical instruments does he play?
6
7
8 Tool Calls
9
10 • scrape_website_content(url=https://markwatson.com)
11
12
13 Response (87.6s)
14
15 Mark Watson plays the guitar, didgeridoo, and American Indian flute.
16
17 Source: https://markwatson.com
18
19
20 Marks-Mac-mini:source-code $ uv run research_agno_web_question.py
21 Message
22
23 Using the web site https://markwatson.com Consultant Mark Watson has written Common Lisp, semantic web, Clojure, Java, and
24 AI books. What musical instruments does he play?
25
26
27 Tool Calls
28
29 • scrape_website_content(url=https://markwatson.com)
30
31
32 Response (21.8s)
33
34 Consultant Mark Watson plays the guitar, didgeridoo, and American Indian flute.
35
36 Source: https://markwatson.com
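The example above streams the answer to the console with print_response. If you instead want to capture the final answer as a string, Agno agents also provide a run() method; the following is a minimal sketch, and the response object's content attribute is an assumption you should verify against the Agno version you have installed:

1 # Minimal sketch (not part of the example program above): capture the answer
2 # as text instead of streaming it. Assumes Agent.run() returns a response
3 # object with a .content attribute holding the final answer.
4 run_response = scraper_agent.run(
5     "Using the web site https://markwatson.com, what musical instruments does Mark Watson play?"
6 )
7 print(run_response.content)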
Wrap Up for Building Agents with Ollama and the Agno Agent Library
The two research agents developed in this chapter are more than just practical demonstrations of the Agno agent library; they are a blueprint for creating specialized, autonomous AI assistants. The core pattern of orchestrating a local LLM from Ollama with external tools via a lightweight framework like Agno is exceptionally versatile. By simply modifying the agent’s persona, instructions, and toolset, this architecture can be repurposed for countless knowledge-based domains, from scientific literature reviews to software vulnerability analysis. This approach not only places powerful, state-of-the-art AI directly into the hands of developers but also ensures complete control over data and operational logic, heralding a new era of personalized and private AI tooling.
Using the Unsloth Library on Google Colab to FineTune Models for Ollama
This is a book about running local LLMs using Ollama. That said, I use a Mac M2 Pro with 32G of memory and while my computer could be used for fine tuning models, I prefer using cloud assets. I frequently use Google’s Colab for running deep learning and other experiments.
We will be using three Colab notebooks in this chapter:
- Colab notebook 1: Colab URI for this chapter is a modified copy of an Unsloth demo notebook. Here we create simple training data to quickly verify the process of fine tuning on Colab using Unsloth and exporting to a local Ollama model on a laptop. We fine tune the 1B model unsloth/Llama-3.2-1B-Instruct.
- Colab notebook 2: Colab URI uses my dataset on fun things to do in Arizona. We fine tune the model unsloth/Llama-3.2-1B-Instruct.
- Colab notebook 3: Colab URI This is identical to the example in Colab notebook 2 except that we fine tune the larger 3B model unsloth/Llama-3.2-3B-Instruct.
The Unsloth fine-tuning library is a Python-based toolkit designed to simplify and accelerate the process of fine-tuning large language models (LLMs). It offers a streamlined interface for applying popular techniques like LoRA (Low-Rank Adaptation), prefix-tuning, and full-model fine-tuning, catering to both novice and advanced users. The library integrates seamlessly with Hugging Face Transformers and other prominent model hubs, providing out-of-the-box support for many state-of-the-art pre-trained models. By focusing on ease of use, Unsloth reduces the boilerplate code needed for training workflows, allowing developers to focus on task-specific adaptation rather than low-level implementation details.
One of Unsloth’s standout features is its efficient resource utilization, enabling fine-tuning even on limited hardware such as single-GPU setups. It achieves this through parameter-efficient fine-tuning techniques and gradient checkpointing, which minimize memory overhead. Additionally, the library supports mixed-precision training, significantly reducing computational costs without compromising model performance. With robust logging and built-in tools for hyperparameter optimization, Unsloth empowers developers to achieve high-quality results with minimal experimentation. It is particularly well-suited for applications like text summarization, chatbots, and domain-specific language understanding tasks.
Colab Notebook 1: A Quick Test of Fine Tuning and Deployment to Ollama on a Laptop
We start by installing the Unsloth library and all dependencies, then uninstalling just the unsloth library and reinstalling the latest from source code on GitHub:
1 pip install unsloth
2 pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
Now create a model and tokenizer:
1 from unsloth import FastLanguageModel
2 import torch
3 max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
4 dtype = None # None for auto detection.
5 load_in_4bit = True # Use 4bit quantization to reduce memory usage.
6
7 # More models at https://huggingface.co/unsloth
8
9 model, tokenizer = FastLanguageModel.from_pretrained(
10 model_name = "unsloth/Llama-3.2-1B-Instruct",
11 max_seq_length = max_seq_length,
12 dtype = dtype,
13 load_in_4bit = load_in_4bit,
14 )
Now add LoRA adapters:
1 model = FastLanguageModel.get_peft_model(
2 model,
3 r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
4 target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
5 "gate_proj", "up_proj", "down_proj",],
6 lora_alpha = 16,
7 lora_dropout = 0, # Supports any, but = 0 is optimized
8 bias = "none", # Supports any, but = "none" is optimized
9 # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
10 use_gradient_checkpointing = "unsloth", # "unsloth" for very long context
11 random_state = 3407,
12 use_rslora = False, # We support rank stabilized LoRA
13 loftq_config = None, # And LoftQ
14 )
The original Unsloth example notebook used Maxime Labonne’s FineTome-100k dataset for fine tuning data. Since I wanted to fine tune with my own test data, I printed out some of Maxime Labonne’s data after it was loaded into a Dataset object. Here are a few snippets to show you, dear reader, the format of the data that I will reproduce:
1 {'conversations': [{'content': 'Give three tips for staying healthy.', 'role': 'user'}, {'content': '1. Eat a balanced and nutritious diet: Make sure your meals are inclusive of a variety of fruits and vegetables, lean protein, whole grains, and healthy fats...', 'role': 'assistant'},
2 ...
3 ]}
4 {'conversations': [{ ... etc.
5 {'conversations': [{ ... etc.
I used a small Python script on my laptop to get the format correct for my test data:
1 from datasets import Dataset
2
3 json_data = [
4 {"conversations": [
5 {"content": "What are the two partitioned colors?", "role": "user"},
6 {"content": "The two partitioned colors are brown, and grey.",
7 "role": "assistant"},
8 {"content": "What are the two partitioned colors?", "role": "user"},
9 {"content": "The two partitioned colors are brown, and grey.",
10 "role": "assistant"}
11 ]},
12 {"conversations": [
13 {"content": "What is the capital of Underworld?", "role": "user"},
14 {"content": "The capital of Underworld is Sharkville.",
15 "role": "assistant"}
16 ]},
17 {"conversations": [
18 {"content": "Who said that the science of economics is bullshit?",
19 "role": "user"},
20 {"content": "Malcom Peters said that the science of economics is bullshit.",
21 "role": "assistant"}
22 ]}
23 ]
24
25 # Convert JSON data to Dataset
26 dataset = Dataset.from_list(json_data)
27
28 # Display the Dataset
29 print(dataset)
30 print(dataset[0])
31 print(dataset[1])
32 print(dataset[2])
Output is:
1 Dataset({
2 features: ['conversations'],
3 num_rows: 3
4 })
5 {'conversations': [{'content': 'What are the two partitioned colors?', 'role': 'user'}, {'content': 'The two partitioned colors are brown, and grey.', 'role': 'assistant'}, {'content': 'What are the two partitioned colors?', 'role': 'user'}, {'content': 'The two partitioned colors are brown, and grey.', 'role': 'assistant'}]}
6 {'conversations': [{'content': 'What is the capital of Underworld?', 'role': 'user'}, {'content': 'The capital of Underworld is Sharkville.', 'role': 'assistant'}]}
7 {'conversations': [{'content': 'Who said that the science of economics is bullshit?', 'role': 'user'}, {'content': 'Malcom Peters said that the science of economics is bullshit.', 'role': 'assistant'}]}
If you look at the notebook for this chapter on Colab you will see that I copied the last Python script as-is into the notebook, replacing the corresponding code in the original Unsloth demo notebook.
The following code (copied from the Unsloth demo notebook) slightly reformats the prompts and then trains using the modified dataset:
1 chat_template = """Below are some instructions that describe some tasks. Write responses that appropriately complete each request.
2
3 ### Instruction:
4 {INPUT}
5
6 ### Response:
7 {OUTPUT}"""
8
9 from unsloth import apply_chat_template
10 dataset = apply_chat_template(
11 dataset,
12 tokenizer = tokenizer,
13 chat_template = chat_template,
14 )
15
16 from trl import SFTTrainer
17 from transformers import TrainingArguments
18 from unsloth import is_bfloat16_supported
19 trainer = SFTTrainer(
20 model = model,
21 tokenizer = tokenizer,
22 train_dataset = dataset,
23 dataset_text_field = "text",
24 max_seq_length = max_seq_length,
25 dataset_num_proc = 2,
26 packing = False, # for short segments
27 args = TrainingArguments(
28 per_device_train_batch_size = 2,
29 gradient_accumulation_steps = 4,
30 warmup_steps = 5,
31 max_steps = 60,
32 # num_train_epochs = 1, # For longer training runs!
33 learning_rate = 2e-4,
34 fp16 = not is_bfloat16_supported(),
35 bf16 = is_bfloat16_supported(),
36 logging_steps = 1,
37 optim = "adamw_8bit",
38 weight_decay = 0.01,
39 lr_scheduler_type = "linear",
40 seed = 3407,
41 output_dir = "outputs",
42 report_to = "none", # Use this for WandB etc
43 ),
44 )
45 trainer_stats = trainer.train()  # run the fine tuning before inference
46 # Now run the trained model on Google Colab with question
47 # from fine tuning data:
48
49 FastLanguageModel.for_inference(model)
50 messages = [
51 {"role": "user",
52 "content": "What are the two partitioned colors?"},
53 ]
54 input_ids = tokenizer.apply_chat_template(
55 messages,
56 add_generation_prompt = True,
57 return_tensors = "pt",
58 ).to("cuda")
59
60 from transformers import TextStreamer
61 text_streamer = TextStreamer(tokenizer, skip_prompt = True)
62 _ = model.generate(input_ids, streamer = text_streamer, max_new_tokens = 128, pad_token_id = tokenizer.eos_token_id)
The output is (edited for brevity and to remove a token warning):
1 The two partitioned colors are brown, and grey.
The notebook has a few more tests:
1 messages = [ # Change below!
2 {"role": "user", "content": "What is the capital of Underworld?"},
3 ]
4 input_ids = tokenizer.apply_chat_template(
5 messages,
6 add_generation_prompt = True,
7 return_tensors = "pt",
8 ).to("cuda")
9
10 from transformers import TextStreamer
11 text_streamer = TextStreamer(tokenizer, skip_prompt = True)
12 _ = model.generate(input_ids,
13 streamer = text_streamer,
14 max_new_tokens = 128,
15 pad_token_id = tokenizer.eos_token_id)
The output is:
1 The capital of Underworld is Sharkville.<|eot_id|>
Warning on Limitations of this Example
We used very little training data and in the call to SFTTrainer we didn’t even train one epoch:
1 max_steps = 60, # a very short training run for this demo
2 # num_train_epochs = 1, # For longer training runs!
This allows us to fine tune a previously trained model very quickly for this short demo.
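If you want a longer, more realistic run, you can drop the max_steps limit and train for one or more full epochs. Here is a minimal sketch of just the TrainingArguments, assuming the rest of the SFTTrainer setup stays exactly as shown above:

1 from transformers import TrainingArguments
2 from unsloth import is_bfloat16_supported
3
4 # Same arguments as before, but training over the full dataset for one epoch
5 # instead of stopping after 60 optimizer steps:
6 args = TrainingArguments(
7     per_device_train_batch_size = 2,
8     gradient_accumulation_steps = 4,
9     warmup_steps = 5,
10     num_train_epochs = 1,   # replaces max_steps = 60
11     learning_rate = 2e-4,
12     fp16 = not is_bfloat16_supported(),
13     bf16 = is_bfloat16_supported(),
14     logging_steps = 1,
15     optim = "adamw_8bit",
16     weight_decay = 0.01,
17     lr_scheduler_type = "linear",
18     seed = 3407,
19     output_dir = "outputs",
20     report_to = "none",
21 )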
We will use much more training data in the next chapter to fine tune a model to be an expert in recreational locations in the state of Arizona.
Save trained model and tokenizer to a GGUF File on the Colab Notebook’s File System
To experiment in the Colab Notebook Linux environment we can save the data locally:
1 model.save_pretrained("lora_model") # Local saving
2 tokenizer.save_pretrained("lora_model")
To be able to run this fine tuned model on a laptop, we also write a quantized GGUF file to the Colab notebook’s file system that can then be downloaded:
1 model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")
In the demo notebook, you can see where the GGUF file was written:
1 !ls -lh /content/model/unsloth.Q4_K_M.gguf
2 771M 771M Dec 5 15:51 /content/model/unsloth.Q4_K_M.gguf
Copying the GGUF File to Your Laptop and Creating a Ollama Modelfile
Depending on your Internet speed, it might take five or ten minutes to download the GGUF file since it is nearly 1 GB in size:
1 from google.colab import files
2 files.download("/content/model/unsloth.Q4_K_M.gguf")
We also need to copy the generated Ollama Modelfile (which the Unsloth library created for us):
1 !cat model/Modelfile
The contents of the file are:
1 FROM /content/model/unsloth.F16.gguf
2
3 TEMPLATE """Below are some instructions that describe some tasks. Write responses that appropriately complete each request.{{ if .Prompt }}
4
5 ### Instruction:
6 {{ .Prompt }}{{ end }}
7
8 ### Response:
9 {{ .Response }}<|eot_id|>"""
10
11 PARAMETER stop "<|end_of_text|>"
12 PARAMETER stop "<|start_header_id|>"
13 PARAMETER stop "<|end_header_id|>"
14 PARAMETER stop "<|finetune_right_pad_id|>"
15 PARAMETER stop "<|python_tag|>"
16 PARAMETER stop "<|eom_id|>"
17 PARAMETER stop "<|eot_id|>"
18 PARAMETER stop "<|reserved_special_token_"
19 PARAMETER temperature 1.5
20 PARAMETER min_p 0.1
After downloading the GGUF file to my laptop I made a slight edit to the generated Modelfile, changing the path to the GGUF file on line 1:
1 FROM ./unsloth.Q4_K_M.gguf
2
3 TEMPLATE """Below are some instructions that describe some tasks. Write responses that appropriately complete each request.{{ if .Prompt }}
4
5 ### Instruction:
6 {{ .Prompt }}{{ end }}
7
8 ### Response:
9 {{ .Response }}<|eot_id|>"""
10
11 PARAMETER stop "<|end_of_text|>"
12 PARAMETER stop "<|start_header_id|>"
13 PARAMETER stop "<|end_header_id|>"
14 PARAMETER stop "<|finetune_right_pad_id|>"
15 PARAMETER stop "<|python_tag|>"
16 PARAMETER stop "<|eom_id|>"
17 PARAMETER stop "<|eot_id|>"
18 PARAMETER stop "<|reserved_special_token_"
19 PARAMETER temperature 1.5
20 PARAMETER min_p 0.1
Once the model is downloaded to your laptop, create a local Ollama model to use:
1 $ ls -lh
2 -rw-r--r-- 1 markw staff 580B Dec 5 09:26 Modelfile
3 -rw-r--r--@ 1 markw staff 770M Dec 5 09:19 unsloth.Q4_K_M.
4 $ ollama create unsloth -f Modelfile
I can now use the model unsloth that was just created on my laptop:
1 $ ollama run unsloth
2 >>> what is 2 + 5?
3 two plus five equals eight.
4
5 >>> What are the two partitioned colors?
6 The two partitioned colors are brown, and grey.
7
8 >>> Who said that the science of economics is bullshit?
9 Malcom Peters said that the science of economics is bullshit.
10
11 >>> write a Python program to sum and print a list of numbers
12 ```python
13 # list of numbers
14 numbers = [1, 2, 3, 4, 5]
15
16 # use the built-in function sum()
17 sum_of_numbers = sum(numbers)
18
19 # print the sum
20 print(sum_of_numbers)
21 >>> /bye
Notice that the fine tuned model has learned the new data and still retains the functionality of the original model.
Fine Tuning Test Wrap Up
This was a short example that can be run on a free Google Colab notebook. Now we will use a larger fine tuning training data set.
Fine Tuning Using a Fun Things To Do in Arizona Data Set
Here we use the slightly modified Colab notebook 2.
I created a GitHub repository for the Arizona fine tuning data set that contains small individual JSON files and a larger file ArizonaFun.json that is a concatenation of the smaller files. Let’s look at az_flagstaff_parks.json (edited to remove some text for brevity):
1 [
2 {
3 "conversations": [
4 {
5 "content": "Can you list some city parks in Flagstaff, Arizona?",
6 "role": "user"
7 },
8 {
9 "content": "Here are some city parks in Flagstaff, Arizona:\n\n- **Bushmaster Park**: Located at 3150 N Alta Vista Dr, this 20-acre park features playgrounds, lighted tennis and basketball courts, a skate park, roller skating arena, dog park, picnic ramadas, and permanent restrooms.\n\n- **Thorpe Park**: Situated at 191 N Thorpe Rd, Thorpe Park offers playgrounds, lighted tennis courts, sand volleyball court, basketball court, soccer fields, youth baseball fields, adult softball fields, a disc golf course, dog park, picnic ramada, and both portable and permanent restrooms.\n\n- **Foxglenn Park**: Found at 4200 E Butler Ave, this park includes a playground, basketball courts, soccer fields, a youth softball field, skate track, picnic ramadas, and permanent restrooms.\n\n- **Buffalo Park**: Located at 2400 N Gemini Rd, Buffalo Park is a 215-acre open space with a 2-mile loop trail, ...",
10 "role": "assistant"
11 }
12 ]
13 }, ...
There are a total of 40 fine tuning examples in the file ArizonaFun.json. You can see in the second and third Colab notebooks for this chapter that I just pasted the JSON data from the file ArizonaFun.json into a cell:
1 from datasets import Dataset
2
3 json_data = [
4 {
5 "conversations": [
6 {
7 "content": "Can you list some city parks in Flagstaff, Arizona?",
8 "role": "user"
9 },
10 ... } ]
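As an alternative to pasting the JSON into a notebook cell, you could load the file directly. Here is a minimal sketch, assuming ArizonaFun.json has been uploaded or cloned into the notebook’s working directory:

1 import json
2 from datasets import Dataset
3
4 # Load the 40 fine tuning examples from the JSON file instead of pasting them:
5 with open("ArizonaFun.json", "r") as f:
6     json_data = json.load(f)
7
8 dataset = Dataset.from_list(json_data)
9 print(dataset)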
The fine tuned model often performs well but, unfortunately, it also sometimes hallucinates. Here is an example of using the fine tuned model in the Colab notebook:
1 messages = [ # Change below!
2 {"role": "user", "content": "Where is Petrified Forest National Park located?"},
3 ]
4 input_ids = tokenizer.apply_chat_template(
5 messages,
6 add_generation_prompt = True,
7 return_tensors = "pt",
8 ).to("cuda")
9
10 from transformers import TextStreamer
11 text_streamer = TextStreamer(tokenizer, skip_prompt = True)
12 _ = model.generate(input_ids, streamer = text_streamer, max_new_tokens = 128, pad_token_id = tokenizer.eos_token_id)
The output is:
1 Petrified Forest National Park is situated in northeastern Arizona, near the town of Holbrook. [oai_citation_attribution:2‡National Park Service](https://www.nps.gov/state/az/index.htm)<|eot_id|>
This answer is correct.
The second Colab notebook also contains code cells for downloading the fine tuned model, and the directions for importing the model into Ollama that we saw earlier also apply here.
Third Colab Notebook That Fine Tunes a Larger Model
There are only two changes made to the second notebook:
- We now fine tune a 3B model unsloth/Llama-3.2-3B-Instruct.
- Because the fine tuned model is large, I added code to store the model in Google Drive:
1 from google.colab import drive
2 drive.mount('/content/drive')
3
4 import shutil
5 shutil.move("/content/model/unsloth.Q4_K_M.gguf", '/content/drive/My Drive/LLM/')
I created an empty folder LLM on my Google Drive before running this code.
Fine Tuning Wrap Up
I don’t usually fine tune models. I usually use larger prompt contexts and include one shot or two shot examples. That said, there are good use cases for fine tuning small models with your own data, and I hope the simple examples in this chapter will save you time if you have an application requiring fine tuning.
Reasoning with Large Language Models
The Chinese tech conglomerate Alibaba’s MarcoPolo Team released the advanced Marco-o1 model at the end of 2024.
This model is designed to excel in open-ended problem-solving and complex reasoning tasks, going beyond traditional AI models that focus on structured tasks like coding or math. For reference the repository for the model is https://github.com/AIDC-AI/Marco-o1. From the README in this repository: “Marco-o1 Large Language Model (LLM) is powered by Chain-of-Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and innovative reasoning strategies—optimized for complex real-world problem-solving tasks.”
A Simple Example
I very much enjoy experimenting with the Marco-o1 model in the Ollama REPL. Let’s start with a very simple prompt that most models can solve. Here, we want to see the structure of Marco-o1’s CoT (chain of thought) process:
1 $ ollama run marco-o1:latest
2 >>> Roger has 5 tennis balls. He finds 2 more tennis balls. How many tennis balls does Roger have now? Let's
3 ... think this through step by step.
4 <Thought>
5 Alright, let's figure out how many tennis balls Roger has now. He starts with 5 tennis balls and then
6 finds 2 more. So, I need to add these two numbers together to find the total number of tennis balls he
7 has.
8
9 First, I'll consider the initial amount: Roger has 5 tennis balls. That's straightforward. Now, he
10 finds 2 more tennis balls. To find out how many tennis balls he has in total, I should add these two
11 quantities together.
12
13 So, adding 5 and 2 gives me 7. But wait, let me double-check to make sure I didn't make a mistake. If
14 Roger starts with 5 tennis balls and finds 2 more, the calculation should be simple addition.
15
16 Another way to look at it is by using objects or visual aids. Imagine Roger has 5 tennis balls in one
17 pile. Then he finds 2 more tennis balls and adds them to his existing pile. Counting all together: 1,
18 2, 3, 4, 5 (that's the initial 5), and then 6, 7 (adding the 2 found ones). So, that confirms it's 7
19 tennis balls in total.
20
21 Is there any other way to approach this? Maybe using a number line. Starting at 5 on the number line,
22 moving forward by 2 would land me at 7. That also shows that Roger has 7 tennis balls now.
23
24 Alternatively, thinking about it in terms of multiplication: if he finds twice as many tennis balls as
25 he initially had, but since he only found 2 more, which is less than double, the total should be just a
26 simple addition, not multiplication. So, sticking with addition makes sense here.
27
28 Also, considering real-life scenarios, if Roger needs to distribute these tennis balls equally among
29 his friends or use them for practice, knowing the exact number ensures fairness and accuracy in
30 distribution.
31
32 In summary, after starting with 5 tennis balls and finding 2 more, Roger now has a total of 7 tennis
33 balls.
34 </Thought>
35 <Output>
36 Roger has 7 tennis balls now.
37 </Output>
38
39 >>>
We will look at a more difficult example later.
Key Features of Marco-o1
Here are some key characteristics of Marco-o1:
- Advanced Reasoning Techniques: It utilizes Chain-of-Thought (CoT) fine-tuning and Monte Carlo Tree Search (MCTS) to enhance its reasoning capabilities. CoT allows the model to trace its thought patterns, making the problem-solving process more transparent. MCTS enables exploration of multiple reasoning paths by assigning confidence scores to different tokens. Reference: https://arxiv.org/html/2411.14405
- Self-Reflection: A unique feature is its ability to self-reflect, evaluating its reasoning, identifying inaccuracies, and iterating on its outputs for improved results. This leads to higher accuracy and adaptability.
- Multilingual Mastery: Marco-o1 excels in translation, handling cultural nuances, idiomatic expressions, and colloquialisms effectively. This makes it a powerful tool for global communication.
- Focus on Open-Ended Problems: Unlike models focused on structured tasks with definitive answers, Marco-o1 tackles open-ended problems where clear evaluation metrics might be absent[1].
- Strong Performance: It has shown significant improvements in reasoning and translation benchmarks, including increased accuracy on the MGSM dataset (both English and Chinese) and strong performance in machine translation tasks[1].
- Open Source Datasets and Implementation: Alibaba has released Marco-o1’s datasets and implementation guides on GitHub, encouraging collaboration and further advancements in AI research.
A More Complex Example: City Traffic Planning
Let’s end this chapter with a more complex example:
1 $ ollama run marco-o1:latest
2 >>> The city of BisbeeBob is facing a severe traffic congestion problem. The mayor wants to implement a new t
3 ... raffic management system to alleviate the issue. However, there are several factors to consider, such as:
4 ...
5 ...
6 ... * **Budget constraints:** The city has a limited budget for the project.
7 ... * **Public opinion:** Citizens have different preferences and concerns regarding traffic management solut
8 ... ions.
9 ... * **Environmental impact:** The new system should minimize its environmental footprint.
10 ... * **Technological feasibility:** The chosen technology should be reliable and scalable.
11 ... * **Ethical considerations:** The system should not unfairly disadvantage any particular group of citizen
12 ... s.
13 ...
14 ... Considering these factors, propose a comprehensive traffic management plan for Metropolis. Explain your r
15 ... easoning process step-by-step, evaluate the potential outcomes of your plan, and reflect on any potential
16 ... shortcomings or areas for improvement.
17 <Thought>
18 Alright, I need to develop a comprehensive traffic management plan for BisbeeBob city that addresses
19 budget constraints, public opinion, environmental impact, technological feasibility, and ethical
20 considerations. Let's break this down step-by-step.
21
22 First, understanding the current state of traffic congestion is crucial. I should analyze peak hours,
23 major intersections, and areas with the highest number of accidents or delays. Maybe there are specific
24 routes that become overwhelmed during certain times?
25
26 Next, considering budget constraints, I need to prioritize solutions that are cost-effective. Perhaps
27 implementing a combination of software upgrades and minor hardware installations could be more
28 affordable than a complete overhaul. Also, looking into grants or partnerships might help in securing
29 necessary funds.
30
31 Public opinion is another critical factor. I should conduct surveys or hold town hall meetings to
32 understand what citizens prefer. Maybe they support real-time traffic updates, better public
33 transportation integration, or even bike lanes? Gathering this data will ensure the plan aligns with
34 community values and preferences.
35
36 Environmental impact shouldn't be overlooked. The new system must minimize emissions and reduce noise
37 pollution. Perhaps electric vehicle incentives or eco-friendly signal systems could be part of the
38 solution. Ensuring sustainability is key to long-term success.
39
40 Technological feasibility is essential too. I need to assess whether the current infrastructure can
41 support the proposed technologies without significant upgrades. Maybe cloud-based traffic management
42 software would allow scalability and easy updates, but it might require reliable internet connectivity
43 across the city.
44
45 Ethical considerations are paramount to avoid disadvantaging any group. For instance, ensuring that new
46 traffic measures don't disproportionately affect low-income areas or specific demographics.
47 Implementing equitable access to public transportation improvements could be one way to address this.
48
49 Now, evaluating potential outcomes: improved traffic flow, reduced congestion during peak hours, lower
50 emission levels, and increased citizen satisfaction from better services. However, there might be
51 unintended consequences like increased dependency on technology leading to higher vulnerability if the
52 system fails.
53
54 Potential shortcomings include resistance to change from citizens who may not understand or trust new
55 technologies. Also, the initial rollout could face technical issues that need to be mitigated through
56 thorough testing before full implementation.
57
58 Areas for improvement could involve ongoing community engagement to address concerns and make necessary
59 adjustments. Additionally, investing in training programs for maintenance staff to handle technological
60 aspects of the system would enhance reliability.
61
62 In summary, the plan should balance cost-effectiveness with public satisfaction, prioritize
63 environmental sustainability, leverage scalable technology, and ensure equitable outcomes while being
64 mindful of potential resistance and technical challenges.
65
66 </Thought>
67 <Output>
68 **Comprehensive Traffic Management Plan for BisbeeBob City**
69
70 *Introduction*
71
72 BisbeeBob City faces significant traffic congestion issues that impact residents' daily lives and the
73 city's overall efficiency. To address these challenges, a multifaceted approach is essential to ensure
74 effective solutions are implemented within budget constraints while considering public opinion,
75 environmental sustainability, technological feasibility, and ethical considerations.
76
77 **1. Situation Analysis**
78
79 - **Current Traffic Patterns**: Identify peak hours, most congested intersections, and areas with the
80 highest number of accidents or delays.
81 - **Infrastructure Assessment**: Evaluate existing traffic infrastructure to determine compatibility
82 with new technologies.
83 - **Public Sentiment**: Conduct surveys and hold town hall meetings to understand community preferences
84 regarding traffic solutions.
85
86 **2. Budget Constraints**
87
88 - **Prioritization**: Focus on cost-effective solutions such as software upgrades, minor hardware
89 installations, and incremental improvements.
90 - **Funding Sources**: Explore potential grants, partnerships with technology companies, and phased
91 implementation strategies to minimize upfront costs.
92 - **Cost-Benefit Analysis**: Ensure that each proposed solution offers a positive return on investment
93 in terms of reduced congestion and improved safety.
94
95 **3. Public Opinion**
96
97 - **Engagement Strategies**: Implement regular feedback mechanisms, public forums, and online surveys
98 to gather input from residents.
99 - **Inclusive Solutions**: Develop traffic management systems that cater to diverse needs, such as
100 real-time navigation apps, enhanced public transportation options, and dedicated bike lanes.
101 - **Transparency**: Clearly communicate the benefits of proposed changes to build trust and support
102 among citizens.
103
104 **4. Environmental Impact**
105
106 - **Sustainability Measures**: Incorporate eco-friendly technologies like electric vehicle (EV)
107 incentives, energy-efficient traffic signals, and green lighting systems.
108 - **Emission Reduction**: Implement strategies that minimize greenhouse gas emissions from vehicles and
109 the transportation infrastructure itself.
110 - **Noise Pollution Mitigation**: Use quieter traffic signal systems and design roads to reduce noise
111 levels in residential areas.
112
113 **5. Technological Feasibility**
114
115 - **Scalable Solutions**: Choose technologies that can be easily updated and scaled as needed, ensuring
116 long-term adaptability.
117 - **Reliability Testing**: Conduct thorough testing of proposed technologies before full-scale
118 implementation to identify and rectify potential issues.
119 - **Integration with Existing Systems**: Ensure compatibility with current traffic management systems
120 to prevent data silos and enhance overall efficiency.
121
122 **6. Ethical Considerations**
123
124 - **Equity in Implementation**: Design solutions that do not disproportionately disadvantage any
125 particular demographic, especially low-income areas or specific groups of citizens.
126 - **Privacy Concerns**: Address concerns about privacy by ensuring that new technologies comply with
127 local data protection regulations and maintain user anonymity where necessary.
128 - **Accessibility**: Ensure that the traffic management system is accessible to all residents,
129 including those with disabilities.
130
131 **7. Implementation Strategy**
132
133 - **Phased Rollout**: Deploy solutions in stages to monitor effectiveness and make adjustments as
134 needed based on feedback and performance metrics.
135 - **Training Programs**: Provide training for maintenance staff to effectively manage and troubleshoot
136 new technologies.
137 - **Continuous Monitoring**: Establish a system for ongoing evaluation of the traffic management plan's
138 success, allowing for iterative improvements.
139
140 **8. Potential Outcomes**
141
142 - **Improved Traffic Flow**: Enhanced efficiency during peak hours with reduced congestion.
143 - **Environmental Benefits**: Lower emission levels and decreased noise pollution in urban areas.
144 - **Increased Public Satisfaction**: Higher trust and support from residents due to more responsive
145 traffic solutions.
146 - **Economic Advantages**: Reduced travel times may lead to increased economic productivity for
147 businesses.
148
149 **9. Shortcomings and Areas for Improvement**
150
151 - **Resistance to Change**: Some citizens may resist new technologies, necessitating ongoing community
152 engagement and education initiatives.
153 - **Technical Challenges**: Potential system failures or compatibility issues must be addressed through
154 robust testing and maintenance protocols.
155 - **Scalability Limitations**: Ensuring that the system can grow with the city's needs requires careful
156 planning and flexibility in design.
157
158 **10. Conclusion**
159
160 By addressing budget constraints, engaging the community, prioritizing environmental sustainability,
161 leveraging scalable technologies, and ensuring ethical considerations are met, BisbeeBob City can
162 implement a comprehensive traffic management plan that not only alleviates congestion but also fosters
163 trust, improves public health, and supports long-term economic growth.
164
165 </Output>
166
167 >>>
I often use state of the art commercial LLM APIs for models like Claude Sonnet 3.5, GPT-4o, o1, Grok-2, etc. to brainstorm ideas and help me think and plan out new projects. I find it exciting to be able to run close to state of the art reasoning LLMs on my personal computer using Ollama!
Using Property Graph Database with Ollama
I have a long history of working with Knowledge Graphs (at Google and OliveAI) and I usually use RDF graph databases and the SPARQL query language. I have recently developed a preference for property graph databases because recent research has shown that using LLMs with RDF-based graphs can run into context size problems: large schemas, overlapping relations, and complex identifiers often exceed LLM context windows. Property graph databases like Neo4J and Kuzu (which we use in this chapter) have more concise schemas.
It is true that Google and other players are teasing ‘infinite context’ LLMs, but since this book is about running smaller models locally I have chosen to show only a property graph example.
Overview of Property Graphs
Property graphs represent a powerful and flexible data modeling paradigm that has gained significant traction in modern database systems and applications. At its core, a property graph is a directed graph structure where both vertices (nodes) and edges (relationships) can contain properties in the form of key-value pairs, providing rich contextual information about entities and their connections. Unlike traditional relational databases that rely on rigid table structures, property graphs offer a more natural way to represent highly connected data while maintaining the semantic meaning of relationships. This modeling approach is particularly valuable when dealing with complex networks of information where the relationships between entities are just as important as the entities themselves.
The distinguishing characteristics of property graphs make them especially well-suited for handling real-world data scenarios where relationships are multi-faceted and dynamic. Each node in a property graph can be labeled with one or more types (such as Person, Product, or Location) and can hold any number of properties that describe its attributes. Similarly, edges can be typed (like “KNOWS”, “PURCHASED”, or “LOCATED_IN”) and augmented with properties that qualify the relationship, such as timestamps, weights, or quality scores.
This flexibility allows for sophisticated querying and analysis of data patterns that would be cumbersome or impossible to represent in traditional relational schemas. The property graph model has proven particularly valuable in domains such as social network analysis, recommendation systems, fraud detection, and knowledge graphs, where understanding the intricate web of relationships between entities is crucial for deriving meaningful insights.
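To make this concrete, here is a minimal sketch using the Kuzu Python API (introduced in the next section) in which both a node and a relationship carry properties; the table and property names are purely illustrative:

1 import kuzu
2
3 db = kuzu.Database("toy_social_db")
4 conn = kuzu.Connection(db)
5
6 # Nodes have a label (Person) plus key-value properties:
7 conn.execute("CREATE NODE TABLE Person (name STRING, age INT64, PRIMARY KEY(name))")
8 # Relationships are typed (Knows) and can also carry properties (since):
9 conn.execute("CREATE REL TABLE Knows (FROM Person TO Person, since INT64)")
10
11 conn.execute("CREATE (:Person {name: 'Alice', age: 34})")
12 conn.execute("CREATE (:Person {name: 'Bob', age: 41})")
13 conn.execute(
14     "MATCH (a:Person), (b:Person) WHERE a.name = 'Alice' AND b.name = 'Bob' "
15     "CREATE (a)-[:Knows {since: 2019}]->(b)"
16 )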
Example Using Ollama, LangChain, and the Kuzu Property Graph Database
The example shown here is derived from an example in the LangChain documentation: https://python.langchain.com/docs/integrations/graphs/kuzu_db/. I modified the example to use a local model running on Ollama instead of the OpenAI APIs. Here is the file graph_kuzu_property_example.py:
1 import kuzu
2 from langchain.chains import KuzuQAChain
3 from langchain_community.graphs import KuzuGraph
4 from langchain_ollama.llms import OllamaLLM
5
6 db = kuzu.Database("test_db")
7 conn = kuzu.Connection(db)
8
9 # Create two tables and a relation:
10 conn.execute("CREATE NODE TABLE Movie (name STRING, PRIMARY KEY(name))")
11 conn.execute(
12 "CREATE NODE TABLE Person (name STRING, birthDate STRING, PRIMARY KEY(name))"
13 )
14 conn.execute("CREATE REL TABLE ActedIn (FROM Person TO Movie)")
15 conn.execute("CREATE (:Person {name: 'Al Pacino', birthDate: '1940-04-25'})")
16 conn.execute("CREATE (:Person {name: 'Robert De Niro', birthDate: '1943-08-17'})")
17 conn.execute("CREATE (:Movie {name: 'The Godfather'})")
18 conn.execute("CREATE (:Movie {name: 'The Godfather: Part II'})")
19 conn.execute(
20 "CREATE (:Movie {name: 'The Godfather Coda: The Death of Michael Corleone'})"
21 )
22 conn.execute(
23 "MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather' CREATE (p)-[:ActedIn]->(m)"
24 )
25 conn.execute(
26 "MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)"
27 )
28 conn.execute(
29 "MATCH (p:Person), (m:Movie) WHERE p.name = 'Al Pacino' AND m.name = 'The Godfather Coda: The Death of Michael Corleone' CREATE (p)-[:ActedIn]->(m)"
30 )
31 conn.execute(
32 "MATCH (p:Person), (m:Movie) WHERE p.name = 'Robert De Niro' AND m.name = 'The Godfather: Part II' CREATE (p)-[:ActedIn]->(m)"
33 )
34
35 graph = KuzuGraph(db, allow_dangerous_requests=True)
36
37 # Create a chain
38 chain = KuzuQAChain.from_llm(
39 llm=OllamaLLM(model="qwen2.5-coder:14b"),
40 graph=graph,
41 verbose=True,
42 allow_dangerous_requests=True,
43 )
44
45 print(graph.get_schema)
46
47 # Ask two questions
48 chain.invoke("Who acted in The Godfather: Part II?")
49 chain.invoke("Robert De Niro played in which movies?")
This code demonstrates the implementation of a graph database using Kuzu, integrated with LangChain for question-answering capabilities. The code initializes a database connection and establishes a schema with two node types (Movie and Person) and a relationship type (ActedIn), creating a graph structure suitable for representing actors and their film appearances.
The implementation populates the database with specific data about “The Godfather” trilogy and two prominent actors (Al Pacino and Robert De Niro). It uses Cypher-like query syntax to create nodes for both movies and actors, then establishes relationships between them using the ActedIn relationship type. The data model represents a typical many-to-many relationship between actors and movies.
This example then sets up a question-answering chain using LangChain, which combines the Kuzu graph database with the Ollama language model (specifically the qwen2.5-coder:14b model). This chain enables natural language queries against the graph database, allowing users to ask questions about actor-movie relationships and receive responses based on the stored graph data. The implementation includes two example queries to demonstrate the system’s functionality.
Here is the output from this example:
1 $ rm -rf test_db
2 (venv) Marks-Mac-mini:OllamaExamples $ p graph_kuzu_property_example.py
3 Node properties: [{'properties': [('name', 'STRING')], 'label': 'Movie'}, {'properties': [('name', 'STRING'), ('birthDate', 'STRING')], 'label': 'Person'}]
4 Relationships properties: [{'properties': [], 'label': 'ActedIn'}]
5 Relationships: ['(:Person)-[:ActedIn]->(:Movie)']
6
7 > Entering new KuzuQAChain chain...
8 Generated Cypher:
9
10 MATCH (p:Person)-[:ActedIn]->(m:Movie {name: 'The Godfather: Part II'})
11 RETURN p.name
12
13 Full Context:
14 [{'p.name': 'Al Pacino'}, {'p.name': 'Robert De Niro'}]
15
16 > Finished chain.
17
18 > Entering new KuzuQAChain chain...
19 Generated Cypher:
20
21 MATCH (p:Person {name: "Robert De Niro"})-[:ActedIn]->(m:Movie)
22 RETURN m.name
23
24 Full Context:
25 [{'m.name': 'The Godfather: Part II'}]
26
27 > Finished chain.
The Cypher query language is commonly used in property graph databases. Here is a sample query:
1 MATCH (p:Person)-[:ActedIn]->(m:Movie {name: 'The Godfather: Part II'})
2 RETURN p.name
This Cypher query performs a graph pattern matching operation to find actors who appeared in “The Godfather: Part II”. Let’s break it down:
- MATCH initiates a pattern matching operation
- (p:Person) looks for nodes labeled as “Person” and assigns them to variable p
- -[:ActedIn]-> searches for “ActedIn” relationships pointing outward
- (m:Movie {name: 'The Godfather: Part II'}) matches Movie nodes whose name property equals “The Godfather: Part II”
- RETURN p.name returns only the name property of the matched Person nodes
Based on the previous code’s data, this query would return “Al Pacino” and “Robert De Niro” since they both acted in that specific film.
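You can also run this query directly against the database from Python without going through the LangChain chain; a minimal sketch, assuming the test_db directory created by the earlier example still exists:

1 import kuzu
2
3 # Reopen the database created by graph_kuzu_property_example.py:
4 db = kuzu.Database("test_db")
5 conn = kuzu.Connection(db)
6
7 result = conn.execute("""
8     MATCH (p:Person)-[:ActedIn]->(m:Movie {name: 'The Godfather: Part II'})
9     RETURN p.name
10 """)
11 while result.has_next():
12     print(result.get_next())  # one row per matching actor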
Using LLMs to Create Graph Databases from Text Data
Using Kuzu with local LLMs is simple to implement, as seen in the last section. If you use large property graph databases hosted with Kuzu or Neo4J, then the example in the last section is hopefully sufficient to get you started implementing natural language interfaces to property graph databases.
Now we will do something very different: use LLMs to generate data for property graphs, that is, to convert text to Python code that creates a Kuzu property graph database.
Specifically, we use the approach:
- Use the last example file graph_kuzu_property_example.py as an example for Claude Sonnet 3.5 to understand the Kuzu Python APIs.
- Have Claude Sonnet 3.5 read the file data/economics.txt and create a schema for a new graph database and populate the schema from the contents of the file data/economics.txt.
- Ask Claude Sonnet 3.5 to also generate query examples.
Except for my adding the utility function query_and_print_result, this code was generated by Claude Sonnet 3.5:
1 """
2 Created by Claude Sonnet 3.5 from prompt:
3
4 Given some text, I want you to define Property graph schemas for
5 the information in the text. As context, here is some Python code
6 for defining two tables and a relation and querying the data:
7
8 [[CODE FROM graph_kuzu_property_example.py]]
9
10 NOW, HERE IS THE TEST TO CREATE SCHEME FOR, and to write code to
11 create nodes and links conforming to the scheme:
12
13 [[CONTENTS FROM FILE data/economics.txt]]
14
15 """
16
17 import kuzu
18
19 db = kuzu.Database("economics_db")
20 conn = kuzu.Connection(db)
21
22 # Node tables
23 conn.execute("""
24 CREATE NODE TABLE School (
25 name STRING,
26 description STRING,
27 PRIMARY KEY(name)
28 )""")
29
30 conn.execute("""
31 CREATE NODE TABLE Economist (
32 name STRING,
33 birthDate STRING,
34 PRIMARY KEY(name)
35 )""")
36
37 conn.execute("""
38 CREATE NODE TABLE Institution (
39 name STRING,
40 type STRING,
41 PRIMARY KEY(name)
42 )""")
43
44 conn.execute("""
45 CREATE NODE TABLE EconomicConcept (
46 name STRING,
47 description STRING,
48 PRIMARY KEY(name)
49 )""")
50
51 # Relationship tables
52 conn.execute("CREATE REL TABLE FoundedBy (FROM School TO Economist)")
53 conn.execute("CREATE REL TABLE TeachesAt (FROM Economist TO Institution)")
54 conn.execute("CREATE REL TABLE Studies (FROM School TO EconomicConcept)")
55
56 # Insert some data
57 conn.execute("CREATE (:School {name: 'Austrian School', description: 'School of economic thought emphasizing spontaneous organizing power of price mechanism'})")
58
59 # Create economists
60 conn.execute("CREATE (:Economist {name: 'Carl Menger', birthDate: 'Unknown'})")
61 conn.execute("CREATE (:Economist {name: 'Eugen von Böhm-Bawerk', birthDate: 'Unknown'})")
62 conn.execute("CREATE (:Economist {name: 'Ludwig von Mises', birthDate: 'Unknown'})")
63 conn.execute("CREATE (:Economist {name: 'Pauli Blendergast', birthDate: 'Unknown'})")
64
65 # Create institutions
66 conn.execute("CREATE (:Institution {name: 'University of Krampton Ohio', type: 'University'})")
67
68 # Create economic concepts
69 conn.execute("CREATE (:EconomicConcept {name: 'Microeconomics', description: 'Study of individual agents and markets'})")
70 conn.execute("CREATE (:EconomicConcept {name: 'Macroeconomics', description: 'Study of entire economy and issues affecting it'})")
71
72 # Create relationships
73 conn.execute("""
74 MATCH (s:School), (e:Economist)
75 WHERE s.name = 'Austrian School' AND e.name = 'Carl Menger'
76 CREATE (s)-[:FoundedBy]->(e)
77 """)
78
79 conn.execute("""
80 MATCH (s:School), (e:Economist)
81 WHERE s.name = 'Austrian School' AND e.name = 'Eugen von Böhm-Bawerk'
82 CREATE (s)-[:FoundedBy]->(e)
83 """)
84
85 conn.execute("""
86 MATCH (s:School), (e:Economist)
87 WHERE s.name = 'Austrian School' AND e.name = 'Ludwig von Mises'
88 CREATE (s)-[:FoundedBy]->(e)
89 """)
90
91 conn.execute("""
92 MATCH (e:Economist), (i:Institution)
93 WHERE e.name = 'Pauli Blendergast' AND i.name = 'University of Krampton Ohio'
94 CREATE (e)-[:TeachesAt]->(i)
95 """)
96
97 # Link school to concepts it studies
98 conn.execute("""
99 MATCH (s:School), (c:EconomicConcept)
100 WHERE s.name = 'Austrian School' AND c.name = 'Microeconomics'
101 CREATE (s)-[:Studies]->(c)
102 """)
103
104 """
105 Code written from the prompt:
106
107 Now that you have written code to create a sample graph database about
108 economics, you can write queries to extract information from the database.
109 """
110
111 def query_and_print_result(query):
112 """Basic pretty printer for Kuzu query results"""
113 print(f"\n* Processing: {query}")
114 result = conn.execute(query)
115 if not result:
116 print("No results found")
117 return
118
119 # Iterate over and print each result row
120 while result.has_next():
121 r = result.get_next()
122 print(r)
123
124 # 1. Find all founders of the Austrian School
125 query_and_print_result("""
126 MATCH (s:School)-[:FoundedBy]->(e:Economist)
127 WHERE s.name = 'Austrian School'
128 RETURN e.name
129 """)
130
131 # 2. Find where Pauli Blendergast teaches
132 query_and_print_result("""
133 MATCH (e:Economist)-[:TeachesAt]->(i:Institution)
134 WHERE e.name = 'Pauli Blendergast'
135 RETURN i.name, i.type
136 """)
137
138 # 3. Find all economic concepts studied by the Austrian School
139 query_and_print_result("""
140 MATCH (s:School)-[:Studies]->(c:EconomicConcept)
141 WHERE s.name = 'Austrian School'
142 RETURN c.name, c.description
143 """)
144
145 # 4. Find all economists and their institutions
146 query_and_print_result("""
147 MATCH (e:Economist)-[:TeachesAt]->(i:Institution)
148 RETURN e.name as Economist, i.name as Institution
149 """)
150
151 # 5. Find schools and count their founders
152 query_and_print_result("""
153 MATCH (s:School)-[:FoundedBy]->(e:Economist)
154 RETURN s.name as School, COUNT(e) as NumberOfFounders
155 """)
156
157 # 6. Find economists who both founded schools and teach at institutions
158 query_and_print_result("""
159 MATCH (s:School)-[:FoundedBy]->(e:Economist)-[:TeachesAt]->(i:Institution)
160 RETURN e.name as Economist, s.name as School, i.name as Institution
161 """)
162
163 # 7. Find economic concepts without any schools studying them
164 query_and_print_result("""
165 MATCH (c:EconomicConcept)
166 WHERE NOT EXISTS {
167 MATCH (s:School)-[:Studies]->(c)
168 }
169 RETURN c.name
170 """)
171
172 # 8. Find economists with no institutional affiliations
173 query_and_print_result("""
174 MATCH (e:Economist)
175 WHERE NOT EXISTS {
176 MATCH (e)-[:TeachesAt]->()
177 }
178 RETURN e.name
179 """)
How might you use this example? Using one or two shot examples in LLM input prompts to specify the desired data format and other context, and then having the model generate structured data or Python code, is a common implementation pattern for LLM applications.
Here, the “structured data” I asked an LLM to output was Python code.
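Here is a minimal sketch of that pattern using a local model with the Ollama Python SDK instead of Claude; the prompt wording and the choice of qwen2.5-coder:14b are illustrative assumptions:

1 from ollama import chat
2
3 # One shot prompt: show the model existing Kuzu code as the "shot", then ask
4 # it to write schema and insert code for a new block of text.
5 example_code = open("graph_kuzu_property_example.py").read()
6 source_text = open("data/economics.txt").read()
7
8 prompt = f"""Given some text, define property graph schemas for the
9 information in the text, then write Python code (using the Kuzu API shown
10 in the example below) that creates nodes and relationships for the text.
11
12 Example Kuzu code:
13 {example_code}
14
15 Text to model:
16 {source_text}
17 """
18
19 response = chat(model="qwen2.5-coder:14b",
20                 messages=[{"role": "user", "content": prompt}])
21 print(response.message.content)  # generated Python code; review it before running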
I cheated in this example by using what is currently the best code generation LLM: Claude Sonnet 3.5. I also tried this same exercise using Ollama with the model qwen2.5-coder:14b and the results were not quite as good. This is a great segue into the final chapter, Book Wrap Up.
Using the Open Codex Command Line Interface Coding Agent
Open Codex is a fork of OpenAI’s Codex project that can be used with most LLMs, both local models running on Ollama and commercial APIs. Here we look at setting up and using Open Codex with Ollama.
Using Open Codex with local Ollama models provides privacy when working with proprietary code bases.
OpenAI Codex CLI and the Open Codex clone are lightweight, open-source coding agents that run locally in the terminal, integrating AI models with local code and computing tasks. Both tools support multimodal reasoning by allowing users to pass prompts to the model, enhancing its ability to understand and generate code based on various inputs. Both tools are minimal and transparent, providing straightforward interfaces for developers to interact with AI models directly from the command line.
The open-codex clone offers flexibility by allowing users to swap out models from different vendors easily. For instance, users can change the model by adjusting a simple command-line parameter, enabling the use of models like Claude or local models using Ollama. This adaptability makes it suitable for developers who wish to experiment with various AI models without being tied to a specific provider.
Both Codex CLI and open-codex aim to streamline the coding process by integrating AI capabilities directly into the developer’s workflow.
You need to have a recent version of npm installed. You can then install the command line tool for Open Codex globally on your laptop using:
1 npm i -g open-codex
I use the qwen2.5:14b model with Ollama, so make sure this model is installed locally:
1 ollama pull qwen2.5:14b
2 ollama serve
Create the directory ~/.codex if it does not already exist and edit the file ~/.codex/config.json to contain:
1 {
2 "provider": "ollama",
3 "model": "qwen2.5:14b",
4 }
Example Use Cases
I use Open Codex with Python, Common Lisp and Haskell projects.
As you run open-codex you may be asked questions about the tasks you requested and for permissions to create new files, commit changes to git, etc.
Explain the Codebase in the Current Directory
1 open-codex "Describe the code in this directory"
Modify Code and Create New Source Files
1 open-codex "Copy the functions summarize and answer-question from ollamaOLD.lisp to ollama.lisp, making any required changes. Create a new test file test.lisp that contains examples of using the functions defined in ollama.lisp"
Open Codex Wrap Up
Using a command line agent like Open Codex provides a quick and intuitive way to interact with and modify code bases on your local computer.
Long Term Persistence Using Mem0 and Chroma
Something important that we haven’t covered yet: building a persistent memory for LLM applications. Here we use two libraries:
- Mem0: persistent memory for AI Agents and LLM applications. GitHub: https://github.com/mem0ai/mem0.
- Chroma: AI-native open-source vector database that simplifies building LLM apps by providing tools for storing, embedding, and searching embeddings.
The example in this chapter is simple and can be copied and modified for multiple applications; for example:
- Code advice agent for Python
- Store thoughts and ideas
Code Example Using Mem0 and Chroma
This Python script demonstrates how to create a persistent memory for an AI assistant using the mem0ai library, chromadb for vector storage, and ollama for interacting with LLMs. Designed to be run repeatedly, each execution processes a user prompt, leverages past interactions stored in a local ChromaDB database, and then generates a concise, relevant response using a local Gemma model. The core idea is that the mem0ai library facilitates storing conversation snippets and retrieving them based on the semantic similarity to the current query. This retrieved context, referred to as “memories,” is then injected into the LLM’s system prompt, allowing the AI to maintain a coherent and context-aware conversation across multiple, independent runs. By persisting these memories locally, the system effectively builds a long-term conversational understanding, enabling the AI to recall and utilize previously discussed information to provide more informed and relevant answers over time, even when the script is executed as a fresh process each time.
The Chroma vector store database is stored under the file path ./db_local and until you delete this directory, memories of old interactions are maintained.
One parameter you may want to change is the number of memories matched in the Chroma database. This can be set in the line of code m.search(query=args.prompt, limit=5, …).
1 # Run this script repeatedly to build a persistent memory:
2 #
3 # python mem0_persistence.py "What color is the sky?"
4 # python mem0_persistence.py "What is the last color we talked about?"
5
6 # pip install mem0ai chromadb ollama
7
8 import argparse
9 from mem0 import Memory
10 from ollama import chat
11 from ollama import ChatResponse
12
13 USER_ID = "123"
14
15 config = {
16 "user_id": USER_ID,
17 "vector_store": {
18 "provider": "chroma",
19 "config": { "path": "db_local" }
20 },
21 "llm": {
22 "provider": "ollama",
23 "config": {
24 "model": "gemma3:4b-it-qat",
25 "temperature": 0.1,
26 "max_tokens": 5000
27 }
28 },
29 }
30
31 def call_ollama_chat(model: str, messages: list[dict]) -> str:
32 """
33 Send a chat request to Ollama and return the assistant's reply.
34 """
35 response: ChatResponse = chat(
36 model=model,
37 messages=messages
38 )
39 return response.message.content
40
41 def main():
42 p = argparse.ArgumentParser()
43 p.add_argument("prompt", help="Your question")
44 args = p.parse_args()
45
46 m = Memory.from_config(config)
47 print(f"User: {args.prompt}")
48
49 rel = m.search(query=args.prompt, limit=5, user_id=USER_ID)
50 mems = "\n".join(f"- {e['memory']}" for e in rel["results"])
51 print("Memories:\n", mems)
52
53 system0 = "You are a helpful assistant who answers with concise, short answers."
54 system = f"{system0}\nPrevious user memories:\n{mems}"
55
56 msgs = [
57 {"role":"system","content":system},
58 {"role":"user","content":args.prompt}
59 ]
60
61 reply = call_ollama_chat("gemma3:4b-it-qat", msgs)
62
63 convo = {"role":"assistant",
64 "content":
65 f"QUERY: {args.prompt}\n\nANSWER:\n{reply}\n"}
66 m.add(convo, user_id=USER_ID, infer=False)
67
68 print(f"\n\n** convo:\n{convo}\n\n")
69
70 print("Assistant:", reply)
71
72 if __name__=="__main__":
73 main()
In the call to m.add(…), set infer=True if you want the configured Ollama LLM to decide what to extract and store rather than storing the raw text. I almost always set this to False to store all questions and answers in the Chroma vector database.
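For reference, the difference is just the infer flag on the add() call; a minimal sketch:

1 # Store the raw conversation text with no LLM processing (what this example does):
2 m.add(convo, user_id=USER_ID, infer=False)
3
4 # Let the configured Ollama model extract and filter what gets stored as memories:
5 m.add(convo, user_id=USER_ID, infer=True)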
Example Output
The following output has been lightly edited, removing library deprecation warnings and extra blank lines in the output.
The first time we run the test script the vector database is empty so the user query “Name two Physical laws” does not match any previous memories stored in the Chroma vector database:
1 $ python mem0_persistence.py "Name two Physical laws"
2
3 User: Name two Physical laws
4 Memories:
5
6
7 ** convo:
8 {'role': 'assistant', 'content': "QUERY: Name two Physical laws\n\nANSWER:\n1. Newton's First Law\n2. Law of Conservation of Energy\n"}
9
10 Assistant: 1. Newton's First Law
11 2. Law of Conservation of Energy
Now the Chroma data store contains one memory:
1 $ python mem0_persistence.py "Name another different Physical law"
2
3 User: Name another different Physical law
4 Memories:
5 - QUERY: Name two Physical laws
6
7 ANSWER:
8 1. Newton's First Law
9 2. Law of Conservation of Energy
10
11 ** convo:
12 {'role': 'assistant', 'content': "QUERY: Name another different Physical law\n\nANSWER:\n1. Newton's Third Law\n"}
13
14 Assistant: 1. Newton's Third Law
Here we ask a question in a different subject domain:
1 $ python mem0_persistence.py "What color is the sky?"
2
3 User: What color is the sky?
4 Memories:
5 - QUERY: Name another different Physical law
6
7 ANSWER:
8 1. Newton's Third Law
9
10 - QUERY: Name two Physical laws
11
12 ANSWER:
13 1. Newton's First Law
14 2. Law of Conservation of Energy
15
16 ** convo:
17 {'role': 'assistant', 'content': 'QUERY: What color is the sky?\n\nANSWER:\nBlue.\n'}
18
19 Assistant: Blue.
We check persistence:
1 $ python mem0_persistence.py "What is the last color we talked about?"
2
3 User: What is the last color we talked about?
4 Memories:
5 - QUERY: What color is the sky?
6
7 ANSWER:
8 Blue.
9
10 - QUERY: Name two Physical laws
11
12 ANSWER:
13 1. Newton's First Law
14 2. Law of Conservation of Energy
15
16 - QUERY: Name another different Physical law
17
18 ANSWER:
19 1. Newton's Third Law
20
21 ** convo:
22 {'role': 'assistant', 'content': 'QUERY: What is the last color we talked about?\n\nANSWER:\nBlue.\n'}
23
24 Assistant: Blue.
Book Wrap Up
Dear reader, I have been paid for “AI work” (for many interpretations of what that even means) since 1982. I certainly find LLMs to be the most exciting tool for moving the field of AI further and faster than anything else that I have used in the last 43 years.
I am also keenly interested in privacy and open source so I must admit a strong bias towards using open source software, open weight LLMs, and also systems and infrastructure like Ollama that enable me to control my own data. The content of this book is tailored to my own interests but I hope that I have, dear reader, covered many of your interests also.
In the last example of the property graph chapter I “pulled a fast one” in that I didn’t use a local model running with Ollama. Instead I used what is currently the best code generation LLM, Claude Sonnet 3.5, because it generates better code than any model that I can run using Ollama on my Mac with 32G of memory. In my work, I balance my personal desire for data privacy and control over the software and hardware I use with practical compromises like using state of the art models running on massive cloud compute resources.