Using the OpenAI APIs

I have been working as an artificial intelligence practitioner since 1982 and the capability of the OpenAI APIs is the most impressive thing that I have seen (so far!) in my career. These APIs provide access to OpenAI's GPT family of models; the examples in this chapter use the gpt-3.5-turbo and gpt-4o-mini models.

I recommend reading the online documentation for the APIs to see all the capabilities of the OpenAI APIs.

Let’s start by jumping into the example code.

The library that I wrote for this chapter supports three operations: completing text, summarizing text, and computing text embeddings. The OpenAI models are fairly general purpose and can perform tasks like:

  • Generate cooking directions when given an ingredient list.
  • Correct grammar.
  • Write an advertisement from a product description.
  • Generate spreadsheet data from data descriptions in English text.

Given the examples from https://platform.openai.com (you will need to log in) and the Clojure examples here, you should be able to modify my example code to use any of the functionality that OpenAI documents.

We will look closely at the function completions and then just look at the small differences in the other two example functions. The definitions for all three exported functions are kept in the file src/openai_api/core.clj. You need to request an API key (I had to wait a few weeks to receive my key) and set the value of the environment variable OPENAI_API_KEY to your key. You can add a statement like:

export OPENAI_API_KEY=sa-hdffds7&dhdhsdgffd

to your .profile or other shell resource file. Here the API token “sa-hdffds7&dhdhsdgffd” is made up - that is not my API token.

When experimenting with the OpenAI APIs it is often useful to start with the curl utility. An example curl command line call to the OpenAI chat completions API is:

 1 curl https://api.openai.com/v1/chat/completions -H "Content-Type: application/json" \
 2 -H "Authorization: Bearer $OPENAI_API_KEY"   -d '{
 3     "model": "gpt-3.5-turbo",
 4     "messages": [
 5       {
 6         "role": "system",
 7         "content": "You are an assistant, skilled in explaining complex programming \
 8 and other technical problems."
 9       },
10       {
11         "role": "user",
12         "content": "Write a Python function foo to add two arguments"
13       }
14     ]
15   }'

Output might look like this:

 1 {
 2   "id": "chatcmpl-8nqUrlNsCPQgUkSIjW7ytvN5GlH3C",
 3   "object": "chat.completion",
 4   "created": 1706890561,
 5   "model": "gpt-3.5-turbo-0613",
 6   "choices": [
 7     {
 8       "index": 0,
 9       "message": {
10         "role": "assistant",
11         "content": "Certainly! Here is a Python function named `foo` that takes two \
12 arguments `a` and `b` and returns their sum:\n\n```python\ndef foo(a, b):\n    retur\
13 n a + b\n```\n\nTo use this function, simply call it and pass in two arguments:\n\n`\
14 ``python\nresult = foo(3, 5)\nprint(result)  # Output: 8\n```\n\nIn this example, `r\
15 esult` will store the sum of `3` and `5`, which is `8`. You can change the arguments\
16  `a` and `b` to any other numbers to get different results."
17       },
18       "logprobs": null,
19       "finish_reason": "stop"
20     }
21   ],
22   "usage": {
23     "prompt_tokens": 35,
24     "completion_tokens": 127,
25     "total_tokens": 162
26   },
27   "system_fingerprint": null
28 }

All of the OpenAI APIs expect JSON data containing the request parameters. To use the chat completions API, we set a value for the model and a list of messages containing the user prompt. We will look at several examples later.

The file src/openai_api/core.clj contains the implementation of our wrapper library (the namespace requires Werner Kok's openai-clojure library, though these examples call the REST API directly using the clj-http client):

 1 (ns openai-api.core
 2   (:require
 3    [wkok.openai-clojure.api :as api]
 4    [clj-http.client :as client]
 5    [clojure.data.json :as json]))
 6 
 7 (def model2 "gpt-4o-mini")
 8 
 9 (def api-key (System/getenv "OPENAI_API_KEY"))
10 
11 (defn completions [prompt]
12   (let [url "https://api.openai.com/v1/chat/completions"
13         headers {"Authorization" (str "Bearer " api-key)
14                  "Content-Type" "application/json"}
15         body {:model model2
16               :messages [{:role "user" :content prompt}]}
17         response (client/post url {:headers headers
18                                    :body (json/write-str body)})]
19     ;;(println (:body response))
20     (get
21      (get
22       (first
23        (get
24         (json/read-str (:body response)  :key-fn keyword)
25         :choices))
26       :message)
27      :content)))
28 
29 (defn summarize [text]
30   (completions (str "Summarize the following text:\n\n" text)))
31 
32 (defn embeddings [text]
33   (try
34     (let [body
35            (str
36             "{\"input\": \""
37             (clojure.string/replace
38              (clojure.string/replace text #"[\" \n :]" " ")
39              #"\s+" " ")
40             "\", \"model\": \"text-embedding-ada-002\"}")
41            json-results
42            (client/post
43             "https://api.openai.com/v1/embeddings"
44             {:accept :json
45              :headers
46              {"Content-Type"  "application/json"
47               "Authorization" (str "Bearer " api-key)}
48              :body   body})]
49           ((first ((json/read-str (json-results :body)) "data")) "embedding"))
50     (catch Exception e
51       (println "Error:" (.getMessage e))
52       "")))
53 
54 (defn dot-product [a b]
55   (reduce + (map * a b)))
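
The dot-product function at the end of the listing is useful for comparing embedding vectors. OpenAI's text-embedding-ada-002 embeddings are normalized to unit length, so a plain dot product approximates cosine similarity; for arbitrary vectors you also need to divide by the vector magnitudes. A small self-contained sketch (the names magnitude and cosine-similarity are my own, not part of the example library):

```clojure
(defn dot-product [a b]
  (reduce + (map * a b)))

(defn magnitude [a]
  (Math/sqrt (dot-product a a)))

(defn cosine-similarity
  "Cosine similarity of two vectors: 1.0 for vectors pointing in the
   same direction, 0.0 for orthogonal vectors."
  [a b]
  (/ (dot-product a b) (* (magnitude a) (magnitude b))))

;; Vectors in the same direction score 1.0, orthogonal vectors 0.0:
(cosine-similarity [1.0 2.0] [2.0 4.0]) ;; => 1.0
(cosine-similarity [1.0 0.0] [0.0 1.0]) ;; => 0.0
```

In a semantic search application you would compare (embeddings query-text) against stored document embeddings and rank documents by similarity score.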

Note that the OpenAI models are stochastic. When generating output words (or tokens), the model assigns probabilities to possible next words and samples a word using these probabilities. As a simple example, suppose that, given the prompt text “it fell and”, the model could generate only three words, with the following probabilities based on this prompt text:

  • the 0.8
  • that 0.1
  • a 0.1

The model would emit the word the 80% of the time, the word that 10% of the time, or the word a 10% of the time. As a result, the model can generate different completion text for the same text prompt. Let’s look at some examples using the same prompt text. Notice the stochastic nature of the returned results:
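This sampling step can be sketched in a few lines of Clojure (a simplified illustration; real models sample over tens of thousands of tokens and apply temperature scaling, and the function name sample-token is my own):

```clojure
(defn sample-token
  "Given a sequence of [token probability] pairs (probabilities
   summing to 1.0), sample one token according to the distribution."
  [pairs]
  (let [r (rand)]
    (loop [[[tok p] & more] pairs
           acc 0.0]
      (let [acc (+ acc p)]
        (if (or (< r acc) (empty? more))
          tok
          (recur more acc))))))

;; Over many samples, "the" is chosen about 80% of the time:
(frequencies
 (repeatedly 1000
             #(sample-token [["the" 0.8] ["that" 0.1] ["a" 0.1]])))
```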

 1 $ lein repl
 2 openai-api.core=> (openai-api.core/completions "He walked to the river")
 3 " and breathed in the new day, looking out to the lake where the Mire was displacing\
 4  the Wold by its"
 5 openai-api.core=> (openai-api.core/completions "He walked to the river")
 6 ". He waded in, not caring about his expensive suit pants. He was going to do this r\
 7 ight, even if"
 8 openai-api.core=> (openai-api.core/completions "He walked to the river")
 9 " every day. The salty air puffed through their pores. He had enjoyed her company. M\
10 aybe he did need a companion"
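
If you want more repeatable output, the chat completions API accepts an optional temperature parameter: 0.0 gives nearly deterministic output, while higher values (up to 2.0) increase variety. A sketch of building a request body that includes it; the function name chat-body is my own, and you would JSON-encode this map and POST it exactly as in the completions function above:

```clojure
(defn chat-body
  "Build the request map for the chat completions endpoint.
   temperature is a documented OpenAI API parameter: 0.0 is nearly
   deterministic, higher values (up to 2.0) give more variety."
  [model prompt temperature]
  {:model model
   :temperature temperature
   :messages [{:role "user" :content prompt}]})

;; Example: a near-deterministic request body
(chat-body "gpt-4o-mini" "He walked to the river" 0.0)
```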

The function summarize is very similar to the function completions: it simply prepends the instruction “Summarize the following text:” to the input text and calls completions. Here is some example output:

 1 openai-api.core=> (def some-text
 2              #_=>   "Jupiter is the fifth planet from the Sun and the largest in the\
 3  Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two\
 4 -and-a-half times that of all the other planets in the Solar System combined. Jupite\
 5 r is one of the brightest objects visible to the naked eye in the night sky, and has\
 6  been known to ancient civilizations since before recorded history. It is named afte\
 7 r the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough fo\
 8 r its reflected light to cast visible shadows,[20] and is on average the third-brigh\
 9 test natural object in the night sky after the Moon and Venus.")
10 #'openai-api.core/some-text
11 
12 openai-api.core=> (openai-api.core/summarize some-text)
13 "Jupiter is classified as a gas giant along with Saturn, Uranus, and Neptune. Jupite\
14 r is composed primarily of gaseous and liquid matter.[21] It is the largest of the f\
15 our giant planets in the Solar System and hence its largest planet. It has a diamete\
16 r of 142,984 km at its equator, which is 0.11 times the diameter of Earth. Jupiter i\
17 s a gas giant because the mass of the planet"

In addition to reading the OpenAI API documentation, you might want to read general material on the use of OpenAI’s GPT models.