Datomic Local: Clojure wrapper for Immutable Datalog

In the previous chapters we used vector databases for storing embeddings and we used the Apache Jena library for working with RDF triples and SPARQL queries. In this chapter we’ll explore another approach to building knowledge graphs: Datomic Local, a free embedded Datalog database from Cognitect.

Datomic is an immutable database. Data is never updated or deleted in place — each transaction creates new datoms (facts) that are appended to the transaction log. This append-only model has several properties that make it a natural fit for AI applications:

Time travel: Query the database as it existed at any past transaction
Audit trails: See every assertion and retraction across time
Reproducibility: Queries against a specific database snapshot always return the same results
Fact accumulation: Add facts incrementally without losing history — ideal for agent memory and knowledge bases

Datomic Local runs in-process with no separate server or transactor, making it ideal for prototyping knowledge-base and AI applications. It uses the Datomic client API (datomic.client.api) and stores data on the local filesystem or in memory. It is free (Apache 2.0 license) and requires no license key.

Why Datalog for AI?

Datomic’s query language is Datalog, a declarative logic programming language. Unlike SQL, Datalog queries are built from patterns of clauses that the query engine unifies against the database. This pattern-matching style is a natural fit for querying knowledge graphs:

Entity-attribute-value model: Datomic’s data model is EAV — every fact is [entity attribute value] — which maps directly to RDF’s subject-predicate-object triples
Reference types: Attributes can be typed as :db.type/ref, creating graph edges between entities
Joins are implicit: In Datalog, joining across entities is done by reusing the same variable in multiple clauses, not by explicit JOIN syntax
Pull API: Navigate entity relationships declaratively, following refs to arbitrary depth
Rules: Datalog supports user-defined rules for recursive queries and inference
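
To make the EAV model and implicit joins concrete before touching Datomic, here is a minimal plain-Clojure sketch. It is only an illustration and not how Datomic evaluates queries: the facts vector, the entity ids 1 and 2, and the directed-by function are all hypothetical names made up for this example.

```clojure
;; Facts as [entity attribute value] triples. Entity ids 1 and 2 are made up.
(def facts
  [[1 :person/name "Lana Wachowski"]
   [2 :movie/title "The Matrix"]
   [2 :movie/year 1999]
   [2 :movie/director 1]])   ; a ref: the value is another entity's id

(defn directed-by
  "Titles of movies whose :movie/director points at the person called
   person-name. Reusing p and m across the bindings plays the role that a
   shared ?variable plays in a Datalog :where clause."
  [facts person-name]
  (set (for [[p a1 v1] facts
             :when (and (= a1 :person/name) (= v1 person-name))
             [m a2 v2] facts
             :when (and (= a2 :movie/director) (= v2 p))
             [m2 a3 title] facts
             :when (and (= a3 :movie/title) (= m2 m))]
         title)))

(directed-by facts "Lana Wachowski")
;; => #{"The Matrix"}
```

The nested loops here are a brute-force stand-in for unification. A real Datalog engine uses indexes to find matching facts, but the idea of constraining clauses through shared names is the same.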

Let’s jump into the code.

Project Setup

The example project is in source-code/datomic_local. Dependencies are minimal — just Clojure and the Datomic Local library:

;; deps.edn
{:paths ["src" "test"]
 :deps {org.clojure/clojure {:mvn/version "1.11.1"}
        com.datomic/local {:mvn/version "1.0.291"}}}

;; project.clj
(defproject datomic-local "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.11.1"]
                 [com.datomic/local "1.0.291"]])

You can use either Leiningen (recommended for this project) or the Clojure CLI:

lein test      # run all tests
lein repl      # start a REPL

clj -M:test    # Clojure CLI; assumes a :test alias in deps.edn (not shown above)
clj            # start a REPL

A Thin Wrapper

The file src/datomic_local/core.clj provides a thin wrapper around datomic.client.api. The wrapper reduces boilerplate by handling the arg-map wrapping that the underlying API requires, but it doesn’t hide the Datomic concepts — you can use datomic.client.api directly if you prefer. The wrapper is purely for convenience.

Here’s the namespace declaration and the client creation functions:

(ns datomic-local.core
  "Thin wrapper around Datomic Local (com.datomic/local), the free embedded
   Datalog database."
  (:require [datomic.client.api :as d]))

(defn client
  "Create a Datomic Local client."
  ([arg]
   (if (map? arg)
     (d/client arg)
     (d/client {:server-type :datomic-local :system arg})))
  ([system & {:keys [storage-dir]}]
   (let [config (cond-> {:server-type :datomic-local :system system}
                  storage-dir (assoc :storage-dir storage-dir))]
     (d/client config))))

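The client function is flexible: you can pass a full config map, or just a system name string. Note that the usage examples in this chapter appear to alias the wrapper namespace (datomic-local.core) as d, rather than datomic.client.api. That is why calls such as (d/client "my-system") and (d/transact conn [...]) take bare arguments instead of arg maps: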

;; Full config (in-memory, ideal for tests and REPL experimentation)
(def client (d/client {:server-type :datomic-local
                       :storage-dir :mem
                       :system "my-system"}))

;; Short form (persistent storage under ~/.datomic/)
(def client (d/client "my-system"))

;; Short form with custom storage directory
(def client (d/client "my-system" :storage-dir "/data/datomic"))

For persistent storage, data is stored on disk under ~/.datomic/${system}/${db-name}/ and survives JVM restarts. For development and testing, :storage-dir :mem keeps everything in memory.

Database Lifecycle

Once you have a client, you create databases, connect to them, and manage the lifecycle:

(defn create-database [client db-name]
  (d/create-database client {:db-name db-name}))

(defn connect [client db-name]
  (d/connect client {:db-name db-name}))

(defn delete-database [client db-name]
  (d/delete-database client {:db-name db-name}))

(defn list-databases [client]
  (d/list-databases client {}))

There’s also a convenience function create-db that combines client creation, database creation, and connection in one call:

(defn create-db
  ([db-name] (create-db "default" db-name))
  ([system db-name]
   (let [c (client system)]
     (d/create-database c {:db-name db-name})
     (d/connect c {:db-name db-name}))))
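
Here is a minimal sketch of the same lifecycle against an in-memory system, which is handy for REPL experiments because nothing touches the disk. It calls datomic.client.api directly (aliased as d here), so every call takes an arg map. The system name "lifecycle-demo" and the database name "scratch" are arbitrary placeholders.

```clojure
(require '[datomic.client.api :as d])

;; In-memory system: nothing is written to disk. The system name and the
;; database name are arbitrary placeholders.
(def mem-client
  (d/client {:server-type :datomic-local
             :storage-dir :mem
             :system      "lifecycle-demo"}))

(d/create-database mem-client {:db-name "scratch"})

;; A connection is the handle used for transactions and for obtaining
;; database values with (d/db conn).
(def conn (d/connect mem-client {:db-name "scratch"}))

;; Capture the database names before and after deletion.
(def names-before (set (d/list-databases mem-client {})))

(d/delete-database mem-client {:db-name "scratch"})

(def names-after (set (d/list-databases mem-client {})))

(println "before delete:" names-before "after delete:" names-after)
```

The names are wrapped in set only so that membership checks do not depend on the collection type that list-databases returns.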

Transactions

All writes to Datomic happen through transactions. A transaction is a vector of entity maps or operation vectors that is committed atomically:

(defn transact [conn tx-data]
  (d/transact conn {:tx-data tx-data}))

The transaction returns a result map synchronously:

{:db-before <db-value>    ; database snapshot before the transaction
 :db-after  <db-value>    ; database snapshot after the transaction
 :tx-data   [<datoms>]    ; datoms produced by the transaction
 :tempids   {<str> <id>}} ; resolved temporary IDs -> real entity IDs

One of Datomic’s most useful features is string tempids. When you need to create entities that reference each other within the same transaction, use {:db/id "some-string"} as a placeholder:

;; Assumes :person/name and a cardinality-many ref attribute :person/friend
;; are already defined in the schema.
(let [tx (d/transact conn
           [{:db/id "alice" :person/name "Alice"}
            {:db/id "bob"   :person/name "Bob" :person/friend ["alice"]}])]
  (println "Alice's entity ID:" (get (:tempids tx) "alice"))
  (println "Bob's entity ID:"   (get (:tempids tx) "bob")))

After the transaction commits, the :tempids map gives you the real numeric entity IDs.
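
The following is a self-contained sketch of tempid resolution, assuming an in-memory system and calling datomic.client.api directly (so transact takes an arg map). The :person/friend attribute and the system and database names are assumptions made for this example.

```clojure
(require '[datomic.client.api :as d])

(def client (d/client {:server-type :datomic-local
                       :storage-dir :mem
                       :system      "tempid-demo"}))
(d/create-database client {:db-name "people"})
(def conn (d/connect client {:db-name "people"}))

;; Hypothetical schema: a name plus a to-many ref between people.
(d/transact conn {:tx-data [{:db/ident       :person/name
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}
                            {:db/ident       :person/friend
                             :db/valueType   :db.type/ref
                             :db/cardinality :db.cardinality/many}]})

;; "alice" and "bob" are placeholders that only live inside this transaction.
(def tx
  (d/transact conn {:tx-data [{:db/id "alice" :person/name "Alice"}
                              {:db/id "bob"   :person/name "Bob"
                               :person/friend ["alice"]}]}))

;; Real entity ids assigned by the transaction.
(def alice-id (get (:tempids tx) "alice"))
(def bob-id   (get (:tempids tx) "bob"))

;; The ref stored on Bob points at Alice's real entity id, so we can
;; follow it with an ordinary join on the post-transaction database.
(def friends-of-bob
  (d/q '[:find ?friend-name
         :where [?b :person/name "Bob"]
                [?b :person/friend ?a]
                [?a :person/name ?friend-name]]
       (:db-after tx)))

(set friends-of-bob)
;; => #{["Alice"]}
```

Querying (:db-after tx) rather than a fresh (d/db conn) guarantees that the query sees exactly the state produced by this transaction.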

Schema Definition

In Datomic, every attribute you use must be defined in the schema before you can assert values for it. Each attribute definition specifies the attribute’s identity, value type, cardinality, and optional properties:

(defn define-schema [conn schema]
  (d/transact conn {:tx-data schema}))

The define-schema function is just transact with a more descriptive name — attribute definitions are themselves just entity data.

Here’s the movie schema we use throughout the examples:

(def movie-schema
  [{:db/ident       :movie/title
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity
    :db/doc         "Movie title (unique)"}

   {:db/ident       :movie/year
    :db/valueType   :db.type/long
    :db/cardinality :db.cardinality/one
    :db/doc         "Release year"}

   {:db/ident       :movie/genre
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/doc         "Genre"}

   {:db/ident       :movie/director
    :db/valueType   :db.type/ref
    :db/cardinality :db.cardinality/one
    :db/doc         "Director -- ref to :person entity"}

   {:db/ident       :movie/cast
    :db/valueType   :db.type/ref
    :db/cardinality :db.cardinality/many
    :db/doc         "Cast members -- refs to :person entities"}

   {:db/ident       :person/name
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/doc         "Person's name"}])

Key points about schema design:

:db.unique/identity: asserting an existing value on a new entity (one identified by a tempid, or by no :db/id at all) triggers an upsert. Datomic resolves the new entity to the existing one and merges the new attributes into it. Best for natural keys like titles, emails, usernames.
:db.unique/value: asserting a value that another entity already holds throws an error. Best for enforcing uniqueness without upsert semantics.
:db.cardinality/one: each entity has at most one value. Asserting a new value retracts the old one.
:db.cardinality/many: each entity can have multiple values. Asserting a new value adds to the set.
:db.type/ref: creates a reference to another entity, forming graph edges. Use :db.cardinality/many with :db.type/ref for to-many relationships (like a movie’s cast).
:db/index true: adds an AVET index for faster lookups on attributes you’ll frequently query.
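
The cardinality and upsert points above can be sketched in a few lines. This example assumes an in-memory system and calls datomic.client.api directly. The :movie/tag attribute is hypothetical (it is not part of the chapter's movie schema) and exists only to contrast cardinality-many with cardinality-one.

```clojure
(require '[datomic.client.api :as d])

(def client (d/client {:server-type :datomic-local
                       :storage-dir :mem
                       :system      "schema-demo"}))
(d/create-database client {:db-name "movies"})
(def conn (d/connect client {:db-name "movies"}))

(d/transact conn {:tx-data [{:db/ident       :movie/title
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one
                             :db/unique      :db.unique/identity}
                            {:db/ident       :movie/genre
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}
                            ;; Hypothetical attribute, used only in this sketch.
                            {:db/ident       :movie/tag
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/many}]})

;; Two transactions that share the same unique title: the second one
;; upserts into the entity created by the first.
(d/transact conn {:tx-data [{:movie/title "Alien" :movie/genre "Horror" :movie/tag "space"}]})
(d/transact conn {:tx-data [{:movie/title "Alien" :movie/genre "Sci-Fi" :movie/tag "classic"}]})

(def db (d/db conn))

;; A lookup ref [unique-attr value] can stand in for the entity id.
(def alien (d/pull db [:movie/genre :movie/tag] [:movie/title "Alien"]))

(def movie-count
  (ffirst (d/q '[:find (count ?e) :where [?e :movie/title]] db)))

(:movie/genre alien)       ;; => "Sci-Fi"  (cardinality one: replaced)
(set (:movie/tag alien))   ;; => #{"space" "classic"}  (cardinality many: accumulated)
movie-count                ;; => 1  (upsert: still a single entity)
```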

Queries

The query function is the heart of working with Datomic:

(defn q [query & args]
  (apply d/q query args))

Datalog queries are EDN data structures with :find, optional :in, and :where clauses. Let’s build up from simple queries to more complex ones.

Basic Queries

After inserting some movie entities, we can find all titles:

(d/transact conn
  [{:movie/title "The Matrix"   :movie/year 1999 :movie/genre "Sci-Fi"}
   {:movie/title "Inception"    :movie/year 2010 :movie/genre "Sci-Fi"}
   {:movie/title "The Godfather" :movie/year 1972 :movie/genre "Crime"}
   {:movie/title "Pulp Fiction" :movie/year 1994 :movie/genre "Crime"}])

(let [db (d/db conn)]
  (d/q '[:find ?title
         :where [?e :movie/title ?title]]
       db))
;; => #{["Pulp Fiction"] ["The Godfather"] ["The Matrix"] ["Inception"]}

The :where clause [?e :movie/title ?title] is a pattern that matches any entity ?e that has a :movie/title attribute with value ?title. The query engine finds all satisfying combinations of variables.

Multiple clauses in a :where act as implicit AND:

;; Find Sci-Fi movies with their release years
(d/q '[:find ?title ?year
       :where [?e :movie/title ?title]
              [?e :movie/year ?year]
              [?e :movie/genre "Sci-Fi"]]
     db)
;; => #{["The Matrix" 1999] ["Inception" 2010]}

Notice how ?e appears in all three clauses — this is how Datalog expresses joins. The same variable in multiple patterns constrains them to refer to the same entity.

Predicate Functions

You can use Clojure functions in Datalog queries to filter results:

;; Movies released after 2000
(d/q '[:find ?title
       :where [?e :movie/title ?title]
              [?e :movie/year ?year]
              [(> ?year 2000)]]
     db)
;; => #{["Inception"]}

The expression [(> ?year 2000)] calls the Clojure > function with the bound value of ?year. Any Clojure function can be used as a predicate — the clause succeeds when the function returns truthy.
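
Predicates combine with each other and with function expressions, which bind a function's return value to a new variable. The sketch below assumes an in-memory system, calls datomic.client.api directly, and defines only the two attributes it needs. The reference year 2024 passed as ?now is an arbitrary input chosen for the example.

```clojure
(require '[datomic.client.api :as d])

(def client (d/client {:server-type :datomic-local
                       :storage-dir :mem
                       :system      "predicate-demo"}))
(d/create-database client {:db-name "movies"})
(def conn (d/connect client {:db-name "movies"}))

;; Only the two attributes this sketch needs.
(d/transact conn {:tx-data [{:db/ident       :movie/title
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}
                            {:db/ident       :movie/year
                             :db/valueType   :db.type/long
                             :db/cardinality :db.cardinality/one}]})

(d/transact conn {:tx-data [{:movie/title "The Matrix"    :movie/year 1999}
                            {:movie/title "Inception"     :movie/year 2010}
                            {:movie/title "The Godfather" :movie/year 1972}
                            {:movie/title "Pulp Fiction"  :movie/year 1994}]})

(def nineties
  (d/q '[:find ?title ?age
         :in $ ?now
         :where [?e :movie/title ?title]
                [?e :movie/year ?year]
                [(>= ?year 1990)]        ; predicate: keep 1990 and later
                [(< ?year 2000)]         ; predicate: keep years before 2000
                [(- ?now ?year) ?age]]   ; function expression: bind result to ?age
       (d/db conn) 2024))

(set nineties)
;; => #{["The Matrix" 25] ["Pulp Fiction" 30]}
```

Passing the reference year as a parameter, instead of reading the clock inside the query, keeps the query reproducible against any database snapshot.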

Parameterized Queries

The :in clause lets you parameterize queries, making them reusable across different inputs:

;; Scalar parameter: find movies from a specific year
(d/q '[:find ?title
       :in $ ?year
       :where [?e :movie/year ?year]
              [?e :movie/title ?title]]
     db 1999)
;; => #{["The Matrix"]}

;; Collection parameter: find movies matching any of several genres
(d/q '[:find ?title
       :in $ [?genre ...]
       :where [?e :movie/genre ?genre]
              [?e :movie/title ?title]]
     db ["Sci-Fi" "Crime"])
;; => #{["The Matrix"] ["Inception"] ["The Godfather"] ["Pulp Fiction"]}

;; Tuple parameter: find movies matching year-genre pairs
(d/q '[:find ?title
       :in $ [[?year ?genre]]
       :where [?e :movie/year ?year]
              [?e :movie/genre ?genre]
              [?e :movie/title ?title]]
     db [[1999 "Sci-Fi"] [1972 "Crime"]])
;; => #{["The Matrix"] ["The Godfather"]}

The $ in :in refers to the database. Additional parameters after the database value correspond to additional :in bindings.

Query Result Shapes

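Datomic Local’s client API only supports find-rel (tuple return). Result comments in this chapter use set notation because tuple order carries no meaning; the client API itself typically returns the tuples as a vector, so wrap a result in set before comparing it to a literal set. Here’s how different query shapes map to results: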

Query `:find`	Returns	Extract with
`:find ?a`	`#{["val1"] ["val2"]}`	`(set (map first result))`
`:find ?a ?b`	`#{["a1" "b1"] ["a2" "b2"]}`	Direct set containment
`:find (count ?e)`	`#{[4]}`	`(ffirst result)`
`:find ?a (count ?e)`	`#{["Sci-Fi" 2] ["Crime" 2]}`	Direct set access
`:find (pull ?e [*])`	`#{[{...}]}`	`(ffirst result)`

The peer-library syntax for scalar return (:find ?a .) and collection return (:find [?a ...]) is not available in the client API, but ffirst and (map first result) handle the same needs.
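
The extraction idioms from the table can be tried without a database. This plain-Clojure sketch uses hand-written stand-ins for query results (the three def'd vectors are made-up sample data shaped like find-rel output).

```clojure
;; Hand-written stand-ins shaped like find-rel results.
(def count-result  [[4]])                          ; :find (count ?e)
(def title-result  [["The Matrix"] ["Inception"]]) ; :find ?title
(def genre-result  [["Sci-Fi" 2] ["Crime" 2]])     ; :find ?genre (count ?e)

;; Scalar: first element of the first tuple.
(ffirst count-result)
;; => 4

;; Single column: unwrap each one-element tuple.
(set (map first title-result))
;; => #{"The Matrix" "Inception"}

;; Two columns: each tuple is a key/value pair, so it pours into a map.
(into {} genre-result)
;; => {"Sci-Fi" 2, "Crime" 2}
```

Pouring two-column results into a map only makes sense when the first column is unique per tuple, as it is for a grouped aggregate.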

Aggregates

Datalog supports aggregation functions including count, min, max, sum, and avg:

;; Total movie count
(ffirst (d/q '[:find (count ?e)
               :where [?e :movie/title]]
             db))
;; => 4

;; Count per genre
(d/q '[:find ?genre (count ?e)
       :where [?e :movie/genre ?genre]]
     db)
;; => #{["Sci-Fi" 2] ["Crime" 2]}

;; Earliest and latest release years
(ffirst (d/q '[:find (min ?year)
               :where [?e :movie/year ?year]]
             db))
;; => 1972

(ffirst (d/q '[:find (max ?year)
               :where [?e :movie/year ?year]]
             db))
;; => 2010

Entity Relationships and Nested Pull

The Pull API lets you navigate entity relationships declaratively:

(defn pull [db selector eid]
  (d/pull db selector eid))

Here’s a complete example that creates people and movies with cross-references, then pulls nested data:

;; Create entities with string tempids for cross-referencing
(let [tx (d/transact conn
           [{:db/id "wachowski"  :person/name "Lana Wachowski"}
            {:db/id "nolan"      :person/name "Christopher Nolan"}
            {:db/id "reeves"     :person/name "Keanu Reeves"}
            {:db/id "dicaprio"   :person/name "Leonardo DiCaprio"}

            {:movie/title    "The Matrix"
             :movie/year     1999
             :movie/genre    "Sci-Fi"
             :movie/director "wachowski"
             :movie/cast     ["reeves"]}

            {:movie/title    "Inception"
             :movie/year     2010
             :movie/genre    "Sci-Fi"
             :movie/director "nolan"
             :movie/cast     ["dicaprio"]}])
      tempids (:tempids tx)
      db (d/db conn)]

  ;; Find entity ID for The Matrix
  (let [matrix-eid (ffirst (d/q '[:find ?e
                                   :where [?e :movie/title "The Matrix"]]
                                 db))]

    ;; Pull nested data: movie + director + cast
    (d/pull db
            [:movie/title :movie/year
             {:movie/director [:person/name]}
             {:movie/cast [:person/name]}]
            matrix-eid)))
;; => {:movie/title "The Matrix",
;;     :movie/year 1999,
;;     :movie/director {:person/name "Lana Wachowski"},
;;     :movie/cast [{:person/name "Keanu Reeves"}]}

The pull selector [:movie/title :movie/year {:movie/director [:person/name]} {:movie/cast [:person/name]}] says: give me the title and year, and for the director and cast refs, follow them and pull the person’s name. This declarative approach to entity navigation is one of Datomic’s most powerful features.

Pull selectors support a rich set of patterns:

Selector	Description
`'[*]`	All attributes (refs show as `{:db/id N}`)
`:attr-name`	Single attribute value
`[:attr1 :attr2]`	Multiple attributes
`{:attr [:sub-attr]}`	Follow a ref and pull sub-attributes
`'{:attr [*]}`	Follow a ref and pull all sub-attributes
`'[(:attr :default value)]`	Default value if attribute is missing
`'[(:attr :as :alias)]`	Rename attribute in result

You can also use pull expressions inside queries:

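(d/q '[:find (pull ?e [:movie/title :movie/year
                        {:movie/director [:person/name]}])
       :where [?e :movie/title]]
     db)
;; With only the two movies created above, one tuple per movie:
;; => #{[{:movie/title "The Matrix",
;;        :movie/year 1999,
;;        :movie/director {:person/name "Lana Wachowski"}}]
;;      [{:movie/title "Inception",
;;        :movie/year 2010,
;;        :movie/director {:person/name "Christopher Nolan"}}]}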

Join Queries

Because Datalog uses variables for implicit joins, queries that traverse relationships are concise. To find all movies by Christopher Nolan:

(d/q '[:find ?title
       :where [?p :person/name "Christopher Nolan"]
              [?m :movie/director ?p]
              [?m :movie/title ?title]]
     db)
;; => #{["Inception"]}

The variable ?p is bound to the entity representing Christopher Nolan (via :person/name), then ?m is constrained to be an entity whose :movie/director is ?p, and finally we extract that movie’s title. This is a two-hop join expressed in three clauses, with no explicit JOIN syntax.

To find all actor-movie pairs:

(d/q '[:find ?actor-name ?movie-title
       :where [?p :person/name ?actor-name]
              [?m :movie/cast ?p]
              [?m :movie/title ?movie-title]]
     db)
;; => #{["Keanu Reeves" "The Matrix"] ["Leonardo DiCaprio" "Inception"]}
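
The introduction mentioned that Datalog supports user-defined rules. Here is a hedged sketch of one, assuming an in-memory system and calling datomic.client.api directly. The rule name involved-in is hypothetical. It is defined twice with the same head, which Datalog treats as a logical OR: a person is involved in a movie as its director or as a cast member. Rules are passed to the query through the % binding in :in.

```clojure
(require '[datomic.client.api :as d])

(def client (d/client {:server-type :datomic-local
                       :storage-dir :mem
                       :system      "rules-demo"}))
(d/create-database client {:db-name "movies"})
(def conn (d/connect client {:db-name "movies"}))

(d/transact conn {:tx-data [{:db/ident       :person/name
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}
                            {:db/ident       :movie/title
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}
                            {:db/ident       :movie/director
                             :db/valueType   :db.type/ref
                             :db/cardinality :db.cardinality/one}
                            {:db/ident       :movie/cast
                             :db/valueType   :db.type/ref
                             :db/cardinality :db.cardinality/many}]})

(d/transact conn {:tx-data [{:db/id "wachowski" :person/name "Lana Wachowski"}
                            {:db/id "reeves"    :person/name "Keanu Reeves"}
                            {:movie/title    "The Matrix"
                             :movie/director "wachowski"
                             :movie/cast     ["reeves"]}]})

;; Hypothetical rule: two definitions with the same head act as OR.
(def involvement-rules
  '[[(involved-in ?p ?m) [?m :movie/director ?p]]
    [(involved-in ?p ?m) [?m :movie/cast ?p]]])

(def credits
  (d/q '[:find ?name ?title
         :in $ %
         :where (involved-in ?p ?m)
                [?p :person/name ?name]
                [?m :movie/title ?title]]
       (d/db conn) involvement-rules))

(set credits)
;; => #{["Lana Wachowski" "The Matrix"] ["Keanu Reeves" "The Matrix"]}
```

Because rules are plain data, a set of them can be kept in one var and reused across many queries.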

Upsert with Unique Identity

When an attribute has :db.unique/identity, asserting a new entity with a matching value resolves to the existing entity rather than creating a new one. This is called upsert:

;; First insert
(d/transact conn
  [{:movie/title "The Matrix" :movie/year 1999 :movie/genre "Sci-Fi"}])

;; Upsert: same title resolves existing entity, genre is updated
(d/transact conn
  [{:movie/title "The Matrix" :movie/genre "Action"}])

;; Verify: one entity, genre is now "Action"
(let [db (d/db conn)
      eid (ffirst (d/q '[:find ?e
                          :where [?e :movie/title "The Matrix"]]
                        db))]
  (:movie/genre (d/pull db '[:movie/genre] eid)))
;; => "Action"

Because :movie/genre has :db.cardinality/one, asserting a new value retracts the old one. If it had :db.cardinality/many, the new value would be added to the set instead.

Transaction Operations

Beyond entity maps, transactions support direct operations:

;; Add an attribute value to an existing entity
[:db/add entity-id :attribute value]

;; Retract an attribute value
[:db/retract entity-id :attribute value]

;; Retract an entire entity
[:db/retractEntity entity-id]

;; Compare-and-swap (only transact if attribute has expected value)
[:db/cas entity-id :attribute old-value new-value]

Here’s an example that retracts an existing value and adds a new one:

;; Assumes an "Eraserhead" entity with genre "Surrealist" was transacted earlier.
(let [db (d/db conn)
      eid (ffirst (d/q '[:find ?e
                          :where [?e :movie/title "Eraserhead"]]
                        db))]
  ;; Retract the original genre
  (d/transact conn [[:db/retract eid :movie/genre "Surrealist"]])
  ;; Add a new genre
  (d/transact conn [[:db/add eid :movie/genre "Experimental"]]))
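
Compare-and-swap is worth a short sketch because its failure mode is an exception rather than a return value. The example assumes an in-memory system and calls datomic.client.api directly. swap-genre! is a hypothetical helper written for this illustration; it catches a generic Exception and does not inspect the error details.

```clojure
(require '[datomic.client.api :as d])

(def client (d/client {:server-type :datomic-local
                       :storage-dir :mem
                       :system      "cas-demo"}))
(d/create-database client {:db-name "movies"})
(def conn (d/connect client {:db-name "movies"}))

(d/transact conn {:tx-data [{:db/ident       :movie/title
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}
                            {:db/ident       :movie/genre
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}]})

(def tx
  (d/transact conn {:tx-data [{:db/id "alien"
                               :movie/title "Alien"
                               :movie/genre "Horror"}]}))
(def eid (get (:tempids tx) "alien"))

(defn swap-genre!
  "Hypothetical helper: set :movie/genre to new-genre only if it currently
   equals expected. Returns :ok on success, :conflict if the transaction
   is rejected."
  [conn eid expected new-genre]
  (try
    (d/transact conn {:tx-data [[:db/cas eid :movie/genre expected new-genre]]})
    :ok
    (catch Exception _
      :conflict)))

(def first-swap  (swap-genre! conn eid "Horror" "Sci-Fi"))   ; expectation matches
(def second-swap (swap-genre! conn eid "Horror" "Thriller")) ; stale expectation

(def final-genre
  (:movie/genre (d/pull (d/db conn) [:movie/genre] eid)))

[first-swap second-swap final-genre]
;; => [:ok :conflict "Sci-Fi"]
```

This pattern is useful when several agents or threads update the same fact and each should only win if its view of the current value is still accurate.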

Time Travel: Querying Past States

Datomic’s immutability means every transaction is preserved. You can query the database as it existed at any point in the past:

(defn as-of [db time-point]
  (d/as-of db time-point))

(defn since [db time-point]
  (d/since db time-point))

(defn history [db]
  (d/history db))

Here’s a time travel example:

;; Day 1: add Jaws
(let [tx1 (d/transact conn
            [{:movie/title "Jaws" :movie/year 1975 :movie/genre "Thriller"}])]

  ;; Day 2: add Star Wars
  (d/transact conn
    [{:movie/title "Star Wars" :movie/year 1977 :movie/genre "Sci-Fi"}])

  ;; Current state: both movies exist
  (ffirst (d/q '[:find (count ?e)
                 :where [?e :movie/title]]
               (d/db conn)))
  ;; => 2

  ;; As-of Day 1: only Jaws exists
  (let [past-db (d/as-of (d/db conn) (:t (:db-after tx1)))]
    (ffirst (d/q '[:find (count ?e)
                   :where [?e :movie/title]]
                 past-db)))
  ;; => 1

  (let [past-db (d/as-of (d/db conn) (:t (:db-after tx1)))]
    (d/q '[:find ?title
           :where [?e :movie/title ?title]]
         past-db)))
;; => #{["Jaws"]}

This capability is particularly valuable for AI systems that need to reason about what was known at a specific time, or that need to maintain an audit trail of all fact assertions and retractions.

There are two other temporal functions worth knowing about:

since — returns a database with only datoms added after a given time point. Useful for seeing what changed between two transactions.
history — returns a database containing all assertions and retractions across time. Pass this to q, datoms, or index-range to see the full history of every fact.
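
A history database is easiest to understand by looking at one attribute changing over time. The sketch below assumes an in-memory system and calls datomic.client.api directly. It binds the fifth position of a datom pattern, which is true for an assertion and false for a retraction.

```clojure
(require '[datomic.client.api :as d])

(def client (d/client {:server-type :datomic-local
                       :storage-dir :mem
                       :system      "history-demo"}))
(d/create-database client {:db-name "movies"})
(def conn (d/connect client {:db-name "movies"}))

(d/transact conn {:tx-data [{:db/ident       :movie/title
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one
                             :db/unique      :db.unique/identity}
                            {:db/ident       :movie/genre
                             :db/valueType   :db.type/string
                             :db/cardinality :db.cardinality/one}]})

;; Assert a genre, then change it through an upsert on the unique title.
(d/transact conn {:tx-data [{:movie/title "The Matrix" :movie/genre "Sci-Fi"}]})
(d/transact conn {:tx-data [{:movie/title "The Matrix" :movie/genre "Action"}]})

;; The current database only sees the latest value...
(def current-genres
  (d/q '[:find ?genre
         :where [?e :movie/title "The Matrix"]
                [?e :movie/genre ?genre]]
       (d/db conn)))

;; ...while the history database sees every assertion and retraction.
;; Pattern positions: [entity attribute value transaction added?]
(def genre-changes
  (d/q '[:find ?genre ?added
         :where [?e :movie/title "The Matrix"]
                [?e :movie/genre ?genre _ ?added]]
       (d/history (d/db conn))))

(set current-genres)
;; => #{["Action"]}

(set genre-changes)
;; => #{["Sci-Fi" true] ["Sci-Fi" false] ["Action" true]}
```

Adding the transaction variable to :find (in place of the _ placeholder) would also tell you which transaction made each change, which is the basis for an audit trail.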

Incremental Fact Building

The append-only model encourages building knowledge bases incrementally. Facts accumulate over time, and a query against the current database value sees every fact asserted so far that has not since been retracted:

;; Day 1: add two Matrix films
(d/transact conn
  [{:movie/title "The Matrix" :movie/year 1999 :movie/genre "Sci-Fi"}
   {:movie/title "The Matrix Reloaded" :movie/year 2003 :movie/genre "Sci-Fi"}])

;; Day 2: add the third Matrix film
(d/transact conn
  [{:movie/title "The Matrix Revolutions" :movie/year 2003 :movie/genre "Sci-Fi"}])

;; Query across all facts
(d/q '[:find ?title
       :where [?e :movie/genre "Sci-Fi"]
              [?e :movie/title ?title]]
     (d/db conn))
;; => #{["The Matrix"] ["The Matrix Reloaded"] ["The Matrix Revolutions"]}

Differences from Datomic Pro/Cloud

Datomic Local is free and embedded, but it has some differences from the full Datomic products:

Feature	Datomic Local	Datomic Pro/Cloud
Architecture	Embedded library	Client-server
Storage	Local disk or memory	Distributed storage
Query engine	In-process	Remote query groups
Scalability	Single process	Horizontally scalable
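Scalar return (`.`)	Not supported	Pro peer API only (not the client API)
Collection return (`[?a ...]`)	Not supported	Pro peer API only (not the client API)
License	Apache 2.0	Apache 2.0 binaries (since April 2023)
Cost	Free	No license fee; Cloud incurs AWS infrastructure costs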

A Helper Function

The wrapper includes one small helper that turns a common pattern into a single call:

(defn find-entity [db attr val]
  (ffirst (d/q '[:find ?e :in $ ?a ?v :where [?e ?a ?v]] db attr val)))

;; Usage:
(find-entity db :movie/title "The Matrix")
;; => 96757023244367

AI & Knowledge Graph Use Cases

Datomic Local is particularly well-suited for several AI application patterns:

Knowledge graphs: The entity-attribute-ref model maps naturally to RDF-style graphs. Define schemas for your domain entities, create them with tempids, and use pull expressions to navigate relationships.
Fact accumulation: The append-only model means you can add facts incrementally from NLP pipelines, web scraping, or agent observations without losing history or dealing with update conflicts.
Temporal reasoning: as-of and since enable querying what was known at any point in time — essential for building AI systems that need to reason about the evolution of knowledge.
Explainability: Every fact assertion and retraction is preserved. You can trace exactly when and in what transaction any piece of knowledge entered the system.
Agent memory: The combination of string tempids for cross-referencing, nested pull for context retrieval, and time travel for episodic memory makes Datomic an excellent substrate for AI agent state.
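Hybrid retrieval: Combine Datalog queries with a text index for applications that need both structured and unstructured search. The :db/fulltext schema attribute is a Datomic Pro feature, so check whether your Datomic Local version accepts it before relying on it; otherwise pair Datalog with an external search index or the vector databases from earlier chapters.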

Running the Tests

The test file test/datomic_local/core_test.clj contains 10 tests with 61 assertions covering the full API surface. The tests are designed to be read from top to bottom as a tutorial:

lein test      # run all tests
lein test :only datomic-local.core-test/relationships-test  # run a single test
lein repl      # explore interactively

The tests cover: database lifecycle, basic entities and queries, entity relationships, parameterized queries, the pull API, upsert behavior, aggregates, transaction operations, time travel, and incremental fact building.

Resources

In the next chapter we’ll continue building practical AI tools with Clojure, applying the knowledge graph techniques we’ve covered across the Jena, SPARQL, and Datomic chapters.
