Using a Local Document Embeddings Vector Database With OpenAI GPT-5 APIs for Semantically Querying Your Own Data

Note: Updated 10/11/2025 to use gpt-5-mini and the new OpenAI library.

This project is inspired by the Python LangChain and LlamaIndex projects, with just the parts I need for my projects written from scratch in Common Lisp. I wrote a Python book “LangChain and LlamaIndex Projects Lab Book: Hooking Large Language Models Up to the Real World Using GPT-3, ChatGPT, and Hugging Face Models in Applications” in March 2023: https://leanpub.com/langchain that you might also be interested in.

The GitHub repository for this example can be found here: https://github.com/mark-watson/docs-qa. This code also requires my OpenAI Common Lisp library https://github.com/mark-watson/openai.

Overview of Using a Local Embeddings Vector Database to Enhance the Use of GPT-5 APIs With Local Documents

In this example we will use the SQLite database to store the text from documents as well as the OpenAI embedding vectors for that text. Each embedding vector consists of 1536 floating point numbers. Two documents are semantically similar if the dot product of their embedding vectors is large.
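The openai library used later in this chapter provides the function openai::dot-product for this purpose. A minimal version of such a function (my sketch, not necessarily the library's actual implementation) for embedding vectors represented as lists of floats could look like:

(defun dot-product (a b)
  ;; sum of pairwise products of two equal-length lists of floats;
  ;; OpenAI embedding vectors are normalized to unit length, so this
  ;; value is also the cosine similarity of the two vectors
  (reduce #'+ (mapcar #'* a b)))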

For long documents, we extract the text and create multiple chunks of text. Each chunk is stored as a row in a SQLite database table. This is an easy way to implement a vector datastore. There are many open source and commercial vector data stores available if you reach the performance limits of the simple techniques we use here.

For each text chunk we call an OpenAI API to get an embedding vector. Later when we want to have a GPT enabled conversation or just semantically query our local documents, we take the user’s query and call an OpenAI API to get an embedding vector for the query text. We then compute the vector dot product between the query embedding vector and each chunk embedding vector. We save the text of the chunks that are semantically similar to the query embedding vector and use this text as “context text” that we pass to an OpenAI Large Language Model (LLM) API along with the user’s original query text.

What does this process really do? Normally when you query ChatGPT or similar LLMs, you are querying against knowledge gained from all the original model training text. This process can lead to so-called “model hallucinations” where the model “makes stuff up.” The advantage of using the Python libraries LangChain and LlamaIndex is that an LLM effectively uses all of its original training data but is also primed with hopefully relevant context text from your local documents that might be useful for answering the user’s query. We will replicate a small amount of this functionality in Common Lisp.

At the end of this chapter we will extend our code for single queries with a conversational example. Our approach to this is simple: when we pass context text and a query, we also pass previous conversational queries from the user. I am still experimenting with the ideas in this chapter so please do occasionally look for updates to the GitHub repository https://github.com/mark-watson/docs-qa and updates to this book.

Implementing a Local Vector Database for Document Embeddings

In the following listing of the file docs-qa.lisp we start in lines 6-31 with a few string utility functions we will need: write-floats-to-string, read-file, concat-strings, truncate-string, and break-into-chunks.

The function break-into-chunks is a work in progress. For now we simply cut long input texts into specific chunk lengths, often cutting words in half. A future improvement will be detecting sentence boundaries and breaking text on sentences. The Python libraries LangChain and LlamaIndex have multiple chunking strategies.
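As an illustration of that idea, here is a sketch (not in the repository) that naively assumes a period ends a sentence and uses the uiop:split-string utility that ships with ASDF:

(defun break-into-sentence-chunks (text chunk-size)
  ;; Sketch: break TEXT into chunks of at most CHUNK-SIZE characters
  ;; without splitting sentences (naively assumes "." ends a sentence).
  (let ((chunks '())
        (current ""))
    (dolist (sentence (uiop:split-string text :separator "."))
      (let ((s (string-trim " " sentence)))
        (when (plusp (length s))
          (let ((sentence-text (concatenate 'string s ".")))
            (cond ((zerop (length current))   ; start the first chunk
                   (setf current sentence-text))
                  ((<= (+ (length current) 1 (length sentence-text))
                       chunk-size)            ; sentence fits in this chunk
                   (setf current (concatenate 'string current " "
                                              sentence-text)))
                  (t                          ; chunk full: start a new one
                   (push current chunks)
                   (setf current sentence-text)))))))
    (when (plusp (length current))
      (push current chunks))
    (nreverse chunks)))

Note that a single sentence longer than chunk-size still produces an oversized chunk, and a production version would also need to handle abbreviations and decimal numbers.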

In lines 33-37 the function decode-row takes a database table row fetched by a SQL query and extracts the original chunk text and the embedding vector. Because most of the running time is spent waiting on OpenAI API calls, the time spent in the local Common Lisp code is negligible, so I have not yet worked on making my code more efficient.

 1 (ql:quickload :sqlite)
 2 (use-package :sqlite)
 3 
 4 ;; define the environment variable "OPENAI_KEY" with the value of your OpenAI API key
 5 
 6 (defun write-floats-to-string (lst)
 7   (with-output-to-string (out)
 8     (format out "( ")
 9     (loop for i in lst
10           do (format out "~f " i))
11     (format out " )")))
12 
13 (defun read-file (infile) ;; from Bing+ChatGPT
14   (with-open-file (instream infile
15                             :direction :input
16                             :if-does-not-exist nil)
17     (when instream
18       (let ((string (make-string (file-length instream))))
19         (read-sequence string instream)
20         string))))
21 
22 (defun concat-strings (list)
23   (apply #'concatenate 'string list))
24 
25 (defun truncate-string (string length)
26   (subseq string 0 (min length (length string))))
27 
28 (defun break-into-chunks (text chunk-size)
29   "Breaks TEXT into chunks of size CHUNK-SIZE."
30   (loop for start from 0 below (length text) by chunk-size
31         collect (subseq text start (min (+ start chunk-size) (length text)))))
32 
33 (defun decode-row (row)
34   (let ((path (nth 0 row)) ;; the document_path column
35         (context (nth 1 row))
36         (embedding (read-from-string (nth 2 row))))
37     (list path context embedding)))

The next listing shows the parts of docs-qa.lisp that use SQLite. I wrapped the calls that initialize the database inside a handler-case form for convenience during development (file reloads don’t throw top level errors and the existing database is left untouched).

 1 (defvar *db* (connect ":memory:"))
 2 ;;(defvar *db* (connect "test.db"))
 3 
 4 (pprint *db*)
 5 (handler-case
 6     (progn
 7       (execute-non-query
 8        *db*
 9        "CREATE TABLE documents (document_path TEXT, content TEXT, embedding TEXT);")
10       (execute-non-query
11        *db*
12        "CREATE INDEX idx_documents_id ON documents (document_path);")
13       (execute-non-query
14        *db*
15        "CREATE INDEX idx_documents_content ON documents (content);")
16       (execute-non-query
17        *db*
18        "CREATE INDEX idx_documents_embedding ON documents (embedding);"))
19     (error (c)
20       (print "Database and indices already exist")))
21 
22 (defun insert-document (document_path content embedding)
23   ;;(format t "insert-document:~% content:~A~%  embedding: ~A~%" content embedding)
24   (format t "~%insert-document:~%  content:~A~%~%" content)
25   (execute-non-query
26    *db*
27    "INSERT INTO documents (document_path, content, embedding) VALUES (?, ?, ?);"
28    document_path content (write-floats-to-string embedding)))
29 
30 (defun get-document-by-document_path (document_path)
31   (mapcar #'decode-row
32             (execute-to-list *db*
33                              "SELECT * FROM documents WHERE document_path = ?;"
34                              document_path)))
35 
36 (defun get-document-by-content (content)
37   (mapcar #'decode-row 
38     (execute-to-list *db*
39                      "SELECT * FROM documents WHERE content LIKE ?;" content)))
40 
41 (defun get-document-by-embedding (embedding)
42  (mapcar #'decode-row 
43    (execute-to-list *db*
44                     "SELECT * FROM documents WHERE embedding LIKE ?;" embedding)))
45 
46 (defun all-documents ()
47   (mapcar #'decode-row 
48     (execute-to-list *db* "SELECT * FROM documents;")))
49 
50 (defun create-document (fpath)
51   (let ((contents (break-into-chunks (read-file fpath) 200)))
52     (dolist (content contents)
53       (handler-case
54           (let ((embedding (openai::embeddings content)))
55             (insert-document fpath content embedding))
56         (error (c)
57           (format t "Error: ~&~a~%" c))))))
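Note that get-document-by-content and get-document-by-embedding use the SQL LIKE operator, so the caller is expected to supply the wildcard characters. For example (hypothetical REPL calls, assuming the chemistry sample file has already been indexed):

* (get-document-by-document_path "data/chemistry.txt")
* (get-document-by-content "%amyl alcohol%")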

Using Local Embeddings Vector Database With OpenAI GPT APIs

The next listing shows the parts of docs-qa.lisp that interface with the OpenAI APIs. Note that the Common Lisp reader upcases symbols by default, so qa and QA name the same symbol: the definition of QA at the end of this listing replaces the simpler qa function defined at the top:

 1 (defun qa (question)
 2   (let ((answer (openai:answer-question question)))
 3     (format t "~&~a~%" answer)))
 4 
 5 (defun semantic-match (query custom-context &optional (cutoff 0.7))
 6   (let ((emb (openai::embeddings query))
 7         (ret))
 8     (dolist (doc (all-documents))
 9       (let ((context (nth 1 doc)) ;; ignore fpath for now
10             (embedding (nth 2 doc)))
11         (let ((score (openai::dot-product emb embedding)))
12           (when (> score cutoff)
13             (push context ret)))))
14     (format t "~%semantic-search: ret=~A~%" ret)
15     (let* ((context (join-strings " . " (reverse ret)))
16            (query-with-context
17              (join-strings
18                " "
19                (list context custom-context
20                  "Question:" query))))
21       (openai:answer-question query-with-context))))
22 
23 (defun QA (query &optional (quiet nil))
24   (let ((answer (semantic-match query "")))
25     (if (not quiet)
26         (format t "~%~%** query: ~A~%** answer: ~A~%~%" query answer))
27     answer))
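The semantic-match function above (and the CHAT function later in this chapter) calls a join-strings helper that does not appear in these listings. A minimal version matching the way it is called (my sketch of the expected behavior: join a list of strings, inserting a separator between elements) is:

(defun join-strings (separator list)
  ;; e.g. (join-strings " . " '("a" "b")) => "a . b"
  (with-output-to-string (out)
    (loop for (s . rest) on list
          do (write-string s out)
          when rest do (write-string separator out))))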

Testing Local Embeddings Vector Database With OpenAI GPT APIs

In the next part of the listing of docs-qa.lisp we write a test function to create two documents. The two calls to create-document actually save text and embeddings for about 20 text chunks in the database.

1 (defun test ()
2   "Test code for Semantic Document Search Using
3    OpenAI GPT APIs and local vector database"
4   (create-document "data/sports.txt")
5   (create-document "data/chemistry.txt")
6   (QA "What is the history of the science of chemistry?")
7   (QA "What are the advantages of engaging in sports?"))

The output is (with a lot of debug printout not shown):

 1 $ sbcl
 2 * (quicklisp:quickload :docs-qa)
 3 To load "docs-qa":
 4   Load 1 ASDF system:
 5     docs-qa
 6 ; Loading "docs-qa"
 7 ..................................................
 8 [package docs-qa]To load "sqlite":
 9   Load 1 ASDF system:
10     sqlite
11 ; Loading "sqlite"
12 
13 #<sqlite-handle {7005CA3783}>
14 (:docs-qa)
15 * (in-package :docs-qa)
16 #<package "DOCS-QA">
17 * (test)
18 
19 ** query: What is the history of the science of chemistry?
20 ** answer: The history of chemistry as a science began in the 6th century BC, when the Greek philosopher Leucippus and his student Democritus posited the existence of an endless number of worlds
21 
22 ** query: What are the advantages of engaging in sports?
23 ** answer: The advantages of engaging in sports are:
24 1. It helps to develop the body and mind.
25 2. It helps to develop the character.
26 3. It helps to develop the personality.

Adding Chat History

In the last part of the listing of docs-qa.lisp we experiment with supporting a conversation/chat of multiple semantic queries against our local documents.

 1 (defun CHAT ()
 2   (let ((messages '(""))
 3         (responses '("")))
 4     (loop
 5        (format t "~%Enter chat (STOP or empty line to stop) >> ")
 6        (let ((string (read-line))
 7              response)
 8          (cond ((or (string= string "STOP") (< (length string) 1)) (return))
 9                (t (let ((custom-context
10                           (concatenate
11                            'string
12                            "PREVIOUS CHAT: "
13                            (join-strings " " (reverse messages)))))
14                     (push string messages)
15                     (print messages) ;; debug printout of chat history
16                     (setf response (semantic-match string custom-context))
17                     (push response responses)
18                     (format t "~%Response: ~A~%" response))))))
19     (list (reverse messages) (reverse responses))))
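Note that CHAT prepends the entire previous conversation to every query, so a long chat will eventually exceed the model’s context window. A simple mitigation (a sketch, not in the repository) is to keep only the most recent user inputs when building custom-context; since messages is built with push, the newest inputs are at the head of the list:

(defun recent-messages (messages &optional (n 5))
  ;; MESSAGES is newest first, so the first N elements
  ;; are the N most recent user inputs
  (subseq messages 0 (min n (length messages))))

Replacing (reverse messages) in CHAT with (reverse (recent-messages messages)) would bound the context size.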

The output (with lots of debug printouts removed) looks like:

 1 $ sbcl
 2 * (quicklisp:quickload :docs-qa)
 3 To load "docs-qa":
 4   Load 1 ASDF system:
 5     docs-qa
 6 ; Loading "docs-qa"
 7 ..................................................
 8 [package docs-qa].To load "sqlite":
 9   Load 1 ASDF system:
10     sqlite
11 ; Loading "sqlite"
12 #<sqlite-handle {7005D9B9D3}>
13 * (in-package :docs-qa)
14 #<package "DOCS-QA">
15 * (create-document "data/chemistry.txt")
16 
17 insert-document:
18   content:Amyl alcohol is an organic compound with the formula C 5 H 12 O. All eight isomers of amyl alcohol are known. The most important is isobutyl carbinol, this being the chief constituent of fermentation 
19  ;; output from all other document chunks is not shown
20  
21  * (CHAT)
22 
23 Enter chat (STOP or empty line to stop) >> what is the history of chemistry?
24 
25 Response: Chemistry is the science of matter, its composition, structure and its properties. Chemistry is concerned with atoms and their interactions with other atoms, and thus is central to all other sciences. Chemistry is also concerned
26 
27 Enter chat (STOP or empty line to stop) >> what is the boiling temperature?
28 
29 Response: The boiling temperature of a liquid is the temperature at which the vapor pressure of the liquid equals the pressure surrounding the liquid, and the liquid changes into a vapor. At the boiling temperature, bubbles of vapor
30 
31 Enter chat (STOP or empty line to stop) >> 

Wrap Up for Using a Local Embeddings Vector Database to Enhance the Use of GPT-5 APIs With Local Documents

When I wrote the first version of this chapter in early April 2023, I had been working almost exclusively with OpenAI APIs for the previous year and had been using the Python LangChain and LlamaIndex libraries for the previous three months.

I prefer using Common Lisp over Python when I can, so I am implementing a tiny subset of the LangChain and LlamaIndex libraries in Common Lisp for my own use. By writing about my Common Lisp experiments here, I hope to get pull requests for https://github.com/mark-watson/docs-qa from readers who are interested in helping to extend the Common Lisp library.