Datastores

For my personal research projects the only datastores that I often use are the embedded relational database and Resource Description Framework (RDF) datastores that might be local to my laptop or public Knowledge Graphs like DBPedia and WikiData. The use of RDF data and the SPARQL query language is part of the fields of the semantic web and linked data.

Accessing Public RDF Knowledge Graphs - a DBPedia Example

I will not cover RDF data and the SPARQL query language in great detail here. Rather, please reference the following link to read the RDF and SPARQL tutorial data in my Common Lisp book: Loving Common Lisp, or the Savvy Programmer’s Secret Weapon.

In the following Racket code example for accesing data on DBPedia using SPARQL, the primary objective is to interact with DBpedia’s SPARQL endpoint to query information regarding a person based on their name or URI. The code is structured into several functions, each encapsulating a specific aspect of the querying process, thereby promoting modular design and ease of maintenance.

Function Definitions:

  • sparql-dbpedia-for-person: This function takes a person-uri as an argument and constructs a SPARQL query to retrieve the comment and website associated with the person. The @string-append macro helps in constructing the SPARQL query string by concatenating the literals and the person-uri argument.
  • sparql-dbpedia-person-uri: Similar to the above function, this function accepts a person-name argument and constructs a SPARQL query to fetch the URI and comment of the person from DBpedia.
  • sparql-query->hash: This function encapsulates the logic for sending the constructed SPARQL query to the DBpedia endpoint. It takes a query argument, encodes it into a URL format, and sends an HTTP request to the DBpedia SPARQL endpoint. The response, expected in JSON format, is then converted to a Racket expression using string->jsexpr.
  • json->listvals: This function is designed to transform the JSON expression obtained from the SPARQL endpoint into a more manageable list of values. It processes the hash data structure, extracting the relevant bindings and converting them into a list format.
  • gd (Data Processing Function): This function processes the data structure obtained from json->listvals. It defines four inner functions gg1, gg2, gg3, and gg4, each designed to handle a specific number of variables returned in the SPARQL query result. It uses a case statement to determine which inner function to call based on the length of the data.
  • sparql-dbpedia: This is the entry function which accepts a sparql argument, invokes sparql-query->hash to execute the SPARQL query, and then calls gd to process the resulting data structure.

Usage Flow:

The typical flow would be to call sparql-dbpedia-person-uri with a person’s name to obtain the person’s URI and comment from DBpedia. Following that, sparql-dbpedia-for-person can be invoked with the obtained URI to fetch more detailed information like websites associated with the person. The results from these queries are then processed through sparql-query->hash, json->listvals, and gd to transform the raw JSON response into a structured list format, making it easier to work with within the Racket environment.

  1 #lang at-exp racket
  2 
  3 (provide sparql-dbpedia-person-uri)
  4 (provide sparql-query->hash)
  5 (provide json->listvals)
  6 (provide sparql-dbpedia)
  7 
  8 (require net/url)
  9 (require net/uri-codec)
 10 (require json)
 11 (require racket/pretty)
 12 
 13 (define (sparql-dbpedia-for-person person-uri)
 14   @string-append{
 15      SELECT
 16       (GROUP_CONCAT(DISTINCT ?website; SEPARATOR="  |  ")
 17                                    AS ?website) ?comment {
 18       OPTIONAL {
 19        @person-uri
 20        <http://www.w3.org/2000/01/rdf-schema#comment>
 21        ?comment . FILTER (lang(?comment) = 'en')
 22       } .
 23       OPTIONAL {
 24        @person-uri
 25        <http://dbpedia.org/ontology/wikiPageExternalLink>
 26        ?website
 27         . FILTER( !regex(str(?website), "dbpedia", "i"))
 28       }
 29      } LIMIT 4})
 30 
 31 (define (sparql-dbpedia-person-uri person-name)
 32   @string-append{
 33     SELECT DISTINCT ?personuri ?comment {
 34       ?personuri
 35         <http://xmlns.com/foaf/0.1/name>
 36         "@person-name"@"@"en .
 37       ?personuri
 38         <http://www.w3.org/2000/01/rdf-schema#comment>
 39         ?comment .
 40              FILTER  (lang(?comment) = 'en') .
 41 }})
 42 
 43 
 44 (define (sparql-query->hash query)
 45   (call/input-url
 46    (string->url
 47     (string-append
 48      "https://dbpedia.org/sparql?query="
 49      (uri-encode query)))
 50    get-pure-port
 51    (lambda (port)
 52      (string->jsexpr (port->string port)))
 53    '("Accept: application/json")))
 54 
 55 (define (json->listvals a-hash)
 56   (let ((bindings (hash->list a-hash)))
 57     (let* ((head (first bindings))
 58            (vars (hash-ref (cdr head) 'vars))
 59            (results (second bindings)))
 60       (let* ((x (cdr results))
 61              (b (hash-ref x 'bindings)))
 62         (for/list
 63             ([var vars])
 64           (for/list ([bc b])
 65             (let ((bcequal
 66                    (make-hash (hash->list bc))))
 67               (let ((a-value
 68                      (hash-ref
 69                       (hash-ref
 70                        bcequal
 71                        (string->symbol var)) 'value)))
 72                 (list var a-value)))))))))
 73 
 74 
 75 (define gd (lambda (data)
 76 
 77     (let ((jd (json->listvals data)))
 78 
 79       (define gg1
 80         (lambda (jd) (map list (car jd))))
 81       (define gg2
 82         (lambda (jd) (map list (car jd) (cadr jd))))
 83       (define gg3
 84         (lambda (jd)
 85           (map list (car jd) (cadr jd) (caddr jd))))
 86       (define gg4
 87         (lambda (jd)
 88           (map list
 89                (car jd) (cadr jd)
 90                (caddr jd) (cadddr jd))))
 91 
 92       (case (length (json->listvals data))
 93         [(1) (gg1 (json->listvals data))]
 94         [(2) (gg2 (json->listvals data))]
 95         [(3) (gg3 (json->listvals data))]
 96         [(4) (gg4 (json->listvals data))]
 97         [else
 98          (error "sparql queries with 1 to 4 vars")]))))
 99 
100 
101 (define sparql-dbpedia
102   (lambda (sparql)
103     (gd (sparql-query->hash sparql))))
104 
105 ;; (sparql-dbpedia (sparql-dbpedia-person-uri "Steve Jobs"))

Let’s try an example in a Racket REPL:

 1 '((("personuri" "http://dbpedia.org/resource/Steve_Jobs")
 2    ("comment"
 3     "Steven Paul Jobs (February 24, 1955 – October 5, 2011) was an American entrepre\
 4 neur, industrial designer, media proprietor, and investor. He was the co-founder, ch\
 5 airman, and CEO of Apple; the chairman and majority shareholder of Pixar; a member o\
 6 f The Walt Disney Company's board of directors following its acquisition of Pixar; a\
 7 nd the founder, chairman, and CEO of NeXT. He is widely recognized as a pioneer of t\
 8 he personal computer revolution of the 1970s and 1980s, along with his early busines\
 9 s partner and fellow Apple co-founder Steve Wozniak."))
10   (("personuri" "http://dbpedia.org/resource/Steve_Jobs_(film)")
11    ("comment"
12     "Steve Jobs is a 2015 biographical drama film directed by Danny Boyle and writte\
13 n by Aaron Sorkin. A British-American co-production, it was adapted from the 2011 bi\
14 ography by Walter Isaacson and interviews conducted by Sorkin, and covers 14 years (\
15 1984–1998) in the life of Apple Inc. co-founder Steve Jobs. Jobs is portrayed by Mic\
16 hael Fassbender, with Kate Winslet as Joanna Hoffman and Seth Rogen, Katherine Water\
17 ston, Michael Stuhlbarg, and Jeff Daniels in supporting roles."))
18   (("personuri" "http://dbpedia.org/resource/Steve_Jobs_(book)")
19    ("comment"
20     "Steve Jobs is the authorized self-titled biography of American business magnate\
21  and Apple co-founder Steve Jobs. The book was written at the request of Jobs by Wal\
22 ter Isaacson, a former executive at CNN and TIME who has written best-selling biogra\
23 phies of Benjamin Franklin and Albert Einstein. The book was released on October 24,\
24  2011, by Simon & Schuster in the United States, 19 days after Jobs's death. A film \
25 adaptation written by Aaron Sorkin and directed by Danny Boyle, with Michael Fassben\
26 der starring in the title role, was released on October 9, 2015.")))

In practice, I start exploring data on DBPedia using the SPARQL query web app https://dbpedia.org/sparql. I experiment with different SPARQL queries for whatever application I am working on and then embed those queries in my Racket, Common Lisp, Clojure (link to read my Clojure AI book free online), and other programming languages I use.

In addition to using DBPedia I often also use the WikiData public Knowledge Graph and local RDF data stores hosted on my laptop with Apache Jena. I might add examples for these two use cases in future versions of this live eBook.

SQlite

Using SQlite in Racket is simple so we will just look at a single example. We will be using the Racket source file sqlite.rkt in the directory Racket-AI-book/source-code/misc_code for the code snippets in this REPL:

 1 $ racket
 2 Welcome to Racket v8.10 [cs].
 3 > (require db)
 4 > (require sqlite-table)
 5 > (define db-file "test.db")
 6 > (define db (sqlite3-connect #:database db-file #:mode 'create))
 7 > (query-exec db
 8      "create temporary table the_numbers (n integer, d varchar(20))")
 9 > (query-exec db
10      "create  table person (name varchar(30), age integer, email varchar(20))")
11 > (query-exec db
12      "insert into person values ('Mary', 34, 'mary@test.com')")
13 > (query-rows db "select * from person")
14 '(#("Mary" 34 "mary@test.com"))
15 > 

Here we see how to interact with a SQLite database using the db and sqlite-table libraries in Racket. The sqlite3-connect function is used to connect to the SQLite database specified by the string value of db-file. The #:mode ‘create keyword argument indicates that a new database should be created if it doesn’t already exist. The database connection object is bound to the identifier db.

The query-exec function call is made to create a permanent table named person with three columns: name of type varchar(30), age of type integer, and email of type varchar(20). The next query-exec function call is made to insert a new row into the person table with the values ‘Mary’, 34, and ‘mary@test.com’. There is a function query that we don’t use here that returns the types of the columns returned by a query. We use the alternative function query-rows that only returns the query results with no type information.