Knowledge Graph Navigator
The Knowledge Graph Navigator (which I will often refer to as KGN) is a tool for processing a set of entity names and automatically exploring the public Knowledge Graph DBPedia using SPARQL queries. I started to write KGN for my own use to automate some things I used to do manually when exploring Knowledge Graphs, and later thought that KGN might also be useful for educational purposes. KGN shows the user the auto-generated SPARQL queries that it runs, so hopefully the user will learn by seeing examples.
I cover SPARQL and linked data/Knowledge Graphs in previous books I have written. While I give you a brief background here, I ask interested readers to look at either of the following for more details:
- The chapter Knowledge Graph Navigator in my book Loving Common Lisp, or the Savvy Programmer’s Secret Weapon
- The chapters Background Material for the Semantic Web and Knowledge Graphs, Knowledge Graph Navigator in my book Practical Artificial Intelligence Programming With Clojure
We use the Natural Language Processing (NLP) library from the last chapter to find human and place names in input text and then construct SPARQL queries to access data from DBPedia.
The KGN application is still a work in progress so please check for updates to this live eBook. The following screenshots show the current version of the application:
I have implemented parts of KGN in several languages: Common Lisp, Java, Clojure, Racket Scheme, Swift, Python, and Hy. The most full-featured version of KGN, including a full user interface, is in my book Loving Common Lisp, or the Savvy Programmer’s Secret Weapon that you can read free online. That version performs more speculative SPARQL queries to find information than the example here, which I designed for ease of understanding and modification. I am not covering the basics of RDF data and SPARQL queries here. While I provide sufficient background material to understand the code, please read the relevant chapters in my Common Lisp book for more background material.
We will be running an example using data containing three person entities, one company entity, and one place entity. The following figure shows a very small part of the DBPedia Knowledge Graph that is centered around these entities. The data for this figure was collected by an example Knowledge Graph Creator from my Common Lisp book:
I chose to use DBPedia instead of WikiData for this example because DBPedia URIs are human readable. The following URIs represent the concept of a person. The semantic meanings of DBPedia and FOAF (friend of a friend) URIs are self-evident to a human reader while the WikiData URI is not:
http://www.wikidata.org/entity/Q215627
http://dbpedia.org/ontology/Person
http://xmlns.com/foaf/0.1/name
I frequently use WikiData in my work and WikiData is one of the most useful public knowledge bases. I have both DBPedia and WikiData SPARQL endpoints in the example code that we will look at later, with the WikiData endpoint commented out. You can try manually querying WikiData at the WikiData SPARQL endpoint. For example, you might explore the WikiData URI for the person concept using:
select ?p ?o where {
<http://www.wikidata.org/entity/Q215627> ?p ?o .
} limit 10
For the rest of this chapter we will just use DBPedia or data copied from DBPedia.
After looking at an interactive session using the example program for this chapter we will look at the implementation.
Entity Types Handled by KGN
To keep this example simple we handle just two entity types:
- People
- Places
The Common Lisp version of KGN also searches for relationships between entities. This search process consists of generating a series of SPARQL queries and calling the DBPedia SPARQL endpoint. I may add this feature to the Racket version of KGN in the future.
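The idea of generating a series of relationship queries can be sketched in a few lines of Racket. This is only an illustration under stated assumptions: the helper names relationship-query and relationship-queries are hypothetical, the query shape is a guess at the simplest useful form, and none of this is taken from the Common Lisp implementation.

```racket
#lang racket

;; Hypothetical helper (not from the book's code): build one SPARQL query
;; asking which properties directly link a subject URI to an object URI.
;; Both URIs are expected to be wrapped in angle brackets already.
(define (relationship-query subject-uri object-uri)
  (format "SELECT DISTINCT ?p { ~a ?p ~a . } LIMIT 5" subject-uri object-uri))

;; Hypothetical helper: one query for every ordered pair of distinct URIs,
;; so each relationship is checked in both directions.
(define (relationship-queries uris)
  (for*/list ([s uris]
              [o uris]
              #:unless (equal? s o))
    (relationship-query s o)))

(define example-uris
  '("<http://dbpedia.org/resource/Bill_Gates>"
    "<http://dbpedia.org/resource/Microsoft>"))

;; Print the generated queries, one per line.
(for-each displayln (relationship-queries example-uris))
```

Each generated string could then be sent to the DBPedia SPARQL endpoint in the same way as the person and place queries later in this chapter.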
KGN Implementation
The example application works by processing a list of Person, Place, and Organization names. We generate SPARQL queries to DBPedia to find information about the entities and relationships between them.
We are using two libraries developed for this book that can be found in the directories Racket-AI-book/source-code/sparql and Racket-AI-book/source-code/nlp to supply support for SPARQL queries and natural language processing.
SPARQL Client Library
We already looked at code examples for making simple SPARQL queries in the chapter Datastores and here we continue with more examples that we need for the KGN application.
The following listing shows Racket-AI-book/source-code/sparql/sparql.rkt where we implement several functions for interacting with DBPedia’s SPARQL endpoint. There are two functions sparql-dbpedia-for-person and sparql-dbpedia-person-uri crafted for constructing SPARQL queries. The function sparql-dbpedia-for-person takes a person URI and formulates a query to fetch associated website links and comments, limiting the results to four. On the other hand, the function sparql-dbpedia-person-uri takes a person name and builds a query to obtain the person’s URI and comments from DBpedia. Both functions utilize string manipulation to embed the input parameters into the SPARQL query strings. There are similar functions for places.
Another function sparql-query->hash executes SPARQL queries against the DBPedia endpoint. It takes a SPARQL query string as an argument, sends an HTTP request to the DBpedia SPARQL endpoint, and expects a JSON response. The call/input-url function is used to send the request, with uri-encode ensuring the query string is URL-encoded. The response is read from the port, converted to a JSON expression using the function string->jsexpr, and is expected to be in a hash form which is returned by this function.
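The URL-encoding step can be tried on its own, without any network access. The following is a minimal sketch, assuming the same endpoint prefix used in this chapter; the names endpoint-prefix, example-query, encoded-query, and request-url are made up for this illustration.

```racket
#lang racket
(require net/uri-codec)

;; Endpoint prefix used by sparql-query->hash in this chapter.
(define endpoint-prefix "https://dbpedia.org/sparql?query=")

;; An arbitrary example query (any SPARQL text would do).
(define example-query
  "SELECT ?p ?o { <http://dbpedia.org/ontology/Person> ?p ?o } LIMIT 3")

;; uri-encode replaces spaces, angle brackets, braces, and question marks
;; with percent escapes so the query is safe inside a URL.
(define encoded-query (uri-encode example-query))

;; The full request URL as a string; string->url would turn it into a url
;; struct suitable for call/input-url.
(define request-url (string-append endpoint-prefix encoded-query))

(displayln request-url)
```

Decoding the encoded text with uri-decode gives back the original query, which is a quick way to check that nothing was lost in the encoding step.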
Lastly, there are two functions json->listvals and gd for processing the JSON response from DBPedia. The function json->listvals extracts the variable bindings from the SPARQL result and organizes them into lists. The function gd further processes these lists based on the number of variables in the query result, creating lists of lists which represent the variable bindings in a structured way. The sparql-dbpedia function serves as an interface to these functionalities, taking a SPARQL query string, executing the query via sparql-query->hash, and processing the results through gd to provide a structured output. This arrangement encapsulates the process of querying DBPedia and formatting the results, making it convenient for further use within a Racket program.
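To make the shape of the JSON response concrete, here is a small offline sketch that parses a hand-written response and pulls out the variable bindings by key. The sample data is invented for illustration (the comment text is a placeholder, not real DBPedia output), and the names sample-json, response, vars, bindings, and columns are assumptions of this sketch.

```racket
#lang racket
(require json)

;; A hand-written response in the SPARQL JSON results layout:
;; "head" lists the variable names, "results"/"bindings" holds one
;; object per result row. The values are placeholders.
(define sample-json
  (string-append
   "{\"head\": {\"vars\": [\"personuri\", \"comment\"]}, "
   "\"results\": {\"bindings\": [{"
   "\"personuri\": {\"type\": \"uri\", "
   "\"value\": \"http://dbpedia.org/resource/Bill_Gates\"}, "
   "\"comment\": {\"type\": \"literal\", "
   "\"value\": \"Placeholder comment text.\"}"
   "}]}}"))

;; string->jsexpr returns nested hash tables with symbol keys.
(define response (string->jsexpr sample-json))
(define vars (hash-ref (hash-ref response 'head) 'vars))
(define bindings (hash-ref (hash-ref response 'results) 'bindings))

;; One list per variable, each holding (variable-name value) pairs.
(define columns
  (for/list ([var vars])
    (for/list ([binding bindings])
      (list var
            (hash-ref (hash-ref binding (string->symbol var)) 'value)))))

(pretty-print columns)
```

The result has one inner list per query variable, which is the column-oriented layout that gd later turns into rows.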
We already saw most of the following code listing in the previous chapter Datastores. The following listings in this chapter will be updated in future versions of this live eBook when I finish writing the KGN application.
Part of solving this problem is constructing SPARQL queries as strings. We will look in some detail at one utility function sparql-dbpedia-for-person that constructs a SPARQL query string for fetching data from DBpedia about a specific person. The function takes one parameter, person-uri, which is expected to be the URI of a person in the DBpedia dataset. The query string is built by appending strings, including the dynamic insertion of the person-uri parameter value. Here’s a breakdown of how the code works:
- Function definition: The function sparql-dbpedia-for-person is defined with one parameter, person-uri. This parameter is used to dynamically insert the person’s URI into the SPARQL query.
- String appending (@string-append): The @string-append{...} form uses Racket’s at-expression reader syntax, which is available when a source file starts with #lang at-exp racket. The reader turns the text between the braces into string arguments for the ordinary string-append function, and reads anything prefixed with @ inside the braces as a Racket expression. The result is a call to string-append that mixes the static parts of the query with the value of person-uri.
- SPARQL query construction: The function constructs a SPARQL query with the following key components:
  - SELECT clause: This part of the query specifies what information to return. It uses GROUP_CONCAT to aggregate multiple ?website values into a single string, separated by " | ", and also selects the ?comment variable.
  - OPTIONAL clauses: Two OPTIONAL blocks are included:
    - The first block attempts to fetch English comments (?comment) associated with the person, filtering to ensure the language of the comment is English (lang(?comment) = 'en').
    - The second block fetches external links (?website) associated with the person but filters out any URLs containing "dbpedia" (case-insensitive), most likely to avoid self-references within DBpedia.
  - Dynamic URI insertion: Each occurrence of @person-uri is replaced with the value of person-uri passed to the function. This targets the SPARQL query at a specific DBpedia resource.
  - LIMIT clause: The query is limited to return at most 4 results with LIMIT 4.
- Usage of @person-uri: No separate string replacement step is needed. Because @person-uri appears inside an at-expression, the reader passes the value of the person-uri variable to string-append at that position. The value is inserted as-is, so it must already be valid at that point in a SPARQL query, for example a URI wrapped in angle brackets.
In summary, the sparql-dbpedia-for-person function dynamically constructs a SPARQL query to fetch English comments and external links (excluding DBpedia links) for a given person from DBpedia, with the results limited to a maximum of 4 entries. The at-expression form @string-append{...} allows the person’s URI to be inserted directly into the query text.
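The at-expression mechanism is easy to try in isolation. The following is a minimal sketch, not code from the book's repository: the functions label-query and name-literal are hypothetical and exist only to show how text and @-prefixed expressions combine under #lang at-exp racket.

```racket
#lang at-exp racket

;; Hypothetical example: everything between the braces is literal text,
;; except @resource-uri, which is read as a Racket expression. The reader
;; produces an ordinary call to string-append.
(define (label-query resource-uri)
  @string-append{
    SELECT ?label {
      @resource-uri <http://www.w3.org/2000/01/rdf-schema#label> ?label .
    } LIMIT 1})

;; Hypothetical example: SPARQL language tags also use an at sign, so a
;; literal at sign is written as the escaped string datum @"@".
(define (name-literal name)
  @string-append{"@name"@"@"en})

(displayln (label-query "<http://dbpedia.org/resource/Bill_Gates>"))
(displayln (name-literal "Bill Gates"))
```

The second function shows the same escape that sparql-dbpedia-person-uri uses to emit a language-tagged name literal.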
#lang at-exp racket
;; File: Racket-AI-book/source-code/sparql/sparql.rkt
;; The at-exp reader enables the @string-append{...} forms used below.

(provide sparql-dbpedia-person-uri)
(provide sparql-query->hash)
(provide json->listvals)
(provide sparql-dbpedia)

(require net/url)
(require net/uri-codec)
(require json)
(require racket/pretty)

;; Build a query that fetches English comments and external (non-DBPedia)
;; website links for a person. person-uri is inserted as-is, so it must
;; already be wrapped in angle brackets.
(define (sparql-dbpedia-for-person person-uri)
  @string-append{
    SELECT
      (GROUP_CONCAT(DISTINCT ?website; SEPARATOR=" | ")
       AS ?website) ?comment {
      OPTIONAL {
        @person-uri
        <http://www.w3.org/2000/01/rdf-schema#comment>
        ?comment . FILTER (lang(?comment) = 'en')
      } .
      OPTIONAL {
        @person-uri
        <http://dbpedia.org/ontology/wikiPageExternalLink>
        ?website
        . FILTER( !regex(str(?website), "dbpedia", "i"))
      }
    } LIMIT 4})

;; Build a query that finds URIs and English comments for a person name.
;; The @"@" escape emits a literal at sign for the SPARQL language tag.
(define (sparql-dbpedia-person-uri person-name)
  @string-append{
    SELECT DISTINCT ?personuri ?comment {
      ?personuri
      <http://xmlns.com/foaf/0.1/name>
      "@person-name"@"@"en .
      ?personuri
      <http://www.w3.org/2000/01/rdf-schema#comment>
      ?comment .
      FILTER (lang(?comment) = 'en') .
    }})

;; Send the URL-encoded query to DBPedia and parse the JSON response
;; into nested hash tables.
(define (sparql-query->hash query)
  (call/input-url
   (string->url
    (string-append
     "https://dbpedia.org/sparql?query="
     (uri-encode query)))
   get-pure-port
   (lambda (port)
     (string->jsexpr (port->string port)))
   '("Accept: application/json")))

;; Return one list per query variable; each element is (variable value).
;; Keys are looked up by name instead of relying on the order of
;; hash->list, which is unspecified.
(define (json->listvals a-hash)
  (let ((vars (hash-ref (hash-ref a-hash 'head) 'vars))
        (bindings (hash-ref (hash-ref a-hash 'results) 'bindings)))
    (for/list ([var vars])
      (for/list ([binding bindings])
        (list var
              (hash-ref (hash-ref binding (string->symbol var)) 'value))))))

;; Convert the per-variable lists into one list per result row,
;; for queries with 1 to 4 variables.
(define (gd data)
  (let ((jd (json->listvals data)))
    (case (length jd)
      [(1) (map list (car jd))]
      [(2) (map list (car jd) (cadr jd))]
      [(3) (map list (car jd) (cadr jd) (caddr jd))]
      [(4) (map list (car jd) (cadr jd) (caddr jd) (cadddr jd))]
      [else
       (error "sparql queries with 1 to 4 vars")])))

;; Run a SPARQL query against DBPedia and return the results as rows.
(define sparql-dbpedia
  (lambda (sparql)
    (gd (sparql-query->hash sparql))))
The function gd converts JSON data to Scheme nested lists and then extracts the values for up to four variables.
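The step from per-variable lists to per-row lists is a transposition, and it can be written so that it does not depend on the number of variables. The following sketch is an alternative formulation rather than the code used in this chapter; the name columns->rows and the sample data are assumptions made for illustration.

```racket
#lang racket

;; Hypothetical generalization of gd's case analysis: applying map with
;; list across all columns transposes them, whatever their number.
(define (columns->rows columns)
  (if (null? columns)
      '()
      (apply map list columns)))

;; Two variables (?s and ?o) with two result rows each, in the
;; column-oriented layout produced by json->listvals. Placeholder values.
(define sample-columns
  '((("s" "a1") ("s" "a2"))
    (("o" "b1") ("o" "b2"))))

(pretty-print (columns->rows sample-columns))
```

For queries with one to four variables, this should give the same rows as the case expression in gd, without the upper limit on the number of variables.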
NLP Library
We implemented a library in the chapter Natural Language Processing that we use here.
Please make sure you have read that chapter before the following sections.
Implementation of KGN Application Code
The file Racket-AI-book/source-code/kgn/main.rkt contains library boilerplate and the file Racket-AI-book/source-code/kgn/kgn.rkt contains the application code. The Racket code is structured for interacting with the DBPedia SPARQL endpoint to retrieve information about persons or places based on a user’s string query. The code is organized into several functions that handle different steps of the process:
Query Parsing and Entity Recognition:
The parse-query function takes a string query-str and tokenizes it into a list of words after replacing certain characters (like “.” and “?”) with spaces. It then checks for the keywords “who” or “where” to infer the type of query: person or place. Using find-human-names and find-place-names (from the NLP library developed in the chapter Natural Language Processing), it extracts the entity names from the tokens. Depending on the type of query and the entities found, it returns a list indicating the type and name of the entity, or a list starting with unknown if no relevant entities are identified.
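The tokenizing and keyword-detection steps can be tried without the NLP library. This is a simplified sketch under stated assumptions: tokenize and question-type are hypothetical names, entity extraction is left out entirely, and unlike parse-query this version lowercases tokens before looking for the keywords.

```racket
#lang racket

;; Replace "." and "?" with spaces and split on whitespace,
;; mirroring the cleaning step in parse-query.
(define (tokenize query-str)
  (string-split
   (string-replace (string-replace query-str "." " ") "?" " ")))

;; Hypothetical classifier: decide the question type from keywords only.
;; Lowercasing is an assumption of this sketch, not the book's behavior.
(define (question-type tokens)
  (define lowered (map string-downcase tokens))
  (cond
    [(member "who" lowered) 'person]
    [(member "where" lowered) 'place]
    [else 'unknown]))

(define tokens (tokenize "Where is San Francisco?"))
(displayln tokens)
(displayln (question-type tokens))
```

In the full application the token list is additionally converted to a vector and passed to find-human-names or find-place-names to pick out the entity name.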
SPARQL Query Construction and Execution:
The functions get-person-results and get-place-results take a name string, construct a SPARQL query to get information about the entity from DBPedia, execute the query, and process the results. They utilize the sparql-dbpedia-person-uri, sparql-query->hash, and json->listvals functions that we listed previously to construct the query, execute it, and convert the returned JSON data to a list, respectively.
Query Interface:
The ui-query-helper function acts as the top-level utility for processing a string query to generate a SPARQL query, execute it, and return the results. It first calls parse-query to understand the type of query and the entity in question. Depending on whether the query is about a person or a place, it invokes get-person-results or get-place-results, respectively, to get the relevant information from DBPedia. It then returns a list containing the SPARQL query and the results, or #f if the query type is unknown.
This code structure facilitates the breakdown of a user’s natural language query into actionable SPARQL queries to retrieve and present information about identified entities from a structured data source like DBPedia.
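The dispatch performed by ui-query-helper can be sketched with pattern matching and stand-in lookup functions, so that it runs without network access. Everything here is illustrative: stub-person-lookup, stub-place-lookup, and dispatch-query are hypothetical names, and the stubs return placeholder text instead of calling DBPedia.

```racket
#lang racket

;; Stand-ins for get-person-results and get-place-results. Each returns
;; a list of (generated-query results), here with placeholder contents.
(define (stub-person-lookup name)
  (list (format "person query for ~a" name) '()))
(define (stub-place-lookup name)
  (list (format "place query for ~a" name) '()))

;; Hypothetical dispatcher: choose a lookup from the parsed question type,
;; or return #f when the type is not recognized.
(define (dispatch-query parse-result)
  (match parse-result
    [(list 'person name) (stub-person-lookup name)]
    [(list 'place name) (stub-place-lookup name)]
    [_ #f]))

(displayln (dispatch-query '(person "Bill Gates")))
(displayln (dispatch-query '(unknown "tell me a story")))
```

Using match keeps the two-element shape of the parse result explicit; the book's code reaches the same outcome with first, second, and nested conditionals.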
(require racket/pretty)
(require nlp)

(provide get-person-results)
(provide ui-query-helper)

(require "sparql-utils.rkt")

;; Returns (list generated-sparql-query list-of-(uri comment)-pairs).
(define (get-person-results person-name-string)
  (let* ((query (sparql-dbpedia-person-uri person-name-string))
         (hash-data (sparql-query->hash query))
         (columns (json->listvals hash-data)))
    (list
     query
     (extract-name-uri-and-comment (first columns) (second columns)))))

;; Same as get-person-results, for places. sparql-dbpedia-place-uri is the
;; place counterpart of sparql-dbpedia-person-uri (it does not appear in
;; the sparql-utils.rkt listing in this chapter).
(define (get-place-results place-name-string)
  (let* ((query (sparql-dbpedia-place-uri place-name-string))
         (hash-data (sparql-query->hash query))
         (columns (json->listvals hash-data)))
    (list
     query
     (extract-name-uri-and-comment (first columns) (second columns)))))

(define (parse-query query-str)
  (let ((cleaned-query-tokens
         (string-split
          (string-replace (string-replace query-str "." " ") "?" " "))))
    (printf "\n+ + + cleaned-query-tokens:~a\n" cleaned-query-tokens)
    (cond
      [(member "who" cleaned-query-tokens)
       (let ((person-names
              (find-human-names (list->vector cleaned-query-tokens) '())))
         (printf "\n+ + person-names= ~a\n" person-names)
         (if (> (length person-names) 0)
             (list 'person (first person-names)) ;; for now, return the first name found
             ;; return a list (not #f) so ui-query-helper can take first/second
             (list 'unknown query-str)))]
      [(member "where" cleaned-query-tokens)
       (let ((place-names
              (find-place-names (list->vector cleaned-query-tokens) '())))
         (printf "\n+ + place-names= ~a\n" place-names)
         (if (> (length place-names) 0)
             (list 'place (first place-names)) ;; for now, return the first place name found
             (list 'unknown query-str)))]
      ;; no person or place name match so just return original query
      [else (list 'unknown query-str)])))

;; top level utility function for string query ->
;;   1) generated sparql 2) query results
(define (ui-query-helper query-str)
  (display "in ui-query-helper: query-str=") (display query-str)
  (let* ((parse-results (parse-query query-str))
         (question-type (first parse-results))
         (entity-name (second parse-results)))
    (display (list parse-results question-type entity-name))
    (cond
      [(equal? question-type 'person)
       (let* ((results2 (get-person-results entity-name))
              (sparql (car results2))
              (results (second results2)))
         (printf "\n++ results: ~a\n" results)
         (list sparql results))]
      [(equal? question-type 'place)
       (let* ((results2 (get-place-results entity-name))
              (sparql (car results2))
              (results (second results2)))
         (list sparql results))]
      [else #f])))
The file Racket-AI-book/source-code/kgn/dialog-utils.rkt contains the user interface specific code for implementing a dialog box.
(require htdp/gui)
(require racket/gui/base)
(require racket/pretty)
(provide make-selection-functions)

;; Creates a modal dialog with a list box and returns two functions:
;; one to fill the list and show the dialog, one to read the selection.
(define (make-selection-functions parent-frame title)
  (let* ((dialog
          (new dialog%
               [label title]
               [parent parent-frame]
               [width 440]
               [height 480]))
         (close-callback
          (lambda (button event)
            (send dialog show #f)))
         (entity-chooser-dialog-list-box
          (new list-box%
               [label ""]
               [choices (list "aaaa" "bbbb")]
               [parent dialog]
               [callback (lambda (click event)
                           ;; a double click selects the item and closes the dialog
                           (if (equal? (send event get-event-type) 'list-box-dclick)
                               (close-callback click event)
                               #f))]))
         (quit-button
          (new button% [parent dialog]
               [label "Select an entity"]
               [callback close-callback]))
         (set-new-items-and-show-dialog
          (lambda (a-list)
            (send entity-chooser-dialog-list-box set a-list)
            ;; select the first item after installing the new items so that
            ;; get-selection-index has a selection to return
            (unless (null? a-list)
              (send entity-chooser-dialog-list-box set-selection 0))
            (send dialog show #t)))
         (get-selection-index
          (lambda ()
            (first (send entity-chooser-dialog-list-box get-selections)))))
    (list set-new-items-and-show-dialog get-selection-index)))
The local file sparql-utils.rkt contains additional utility functions for accessing information in DBPedia.
#lang at-exp racket
;; File: sparql-utils.rkt
;; The at-exp reader enables the @string-append{...} forms used below.

(provide sparql-dbpedia-for-person)
(provide sparql-dbpedia-person-uri)
(provide sparql-query->hash)
(provide json->listvals)
(provide extract-name-uri-and-comment)

(require net/url)
(require net/uri-codec)
(require json)
(require racket/pretty)

;; WikiData identifiers (not used by the functions in this listing)
(define ps-encoded-by "ps:P702")
(define wdt-instance-of "wdt:P31")
(define wdt-in-taxon "wdt:P703")
(define wd-human "wd:Q15978631")
(define wd-mouse "wd:Q83310")
(define wd-rat "wd:Q184224")
(define wd-gene "wd:Q7187")

;; Query for English comments and external (non-DBPedia) links of a person.
(define (sparql-dbpedia-for-person person-uri)
  @string-append{
    SELECT
      (GROUP_CONCAT(DISTINCT ?website; SEPARATOR=" | ") AS ?website) ?comment {
      OPTIONAL { @person-uri <http://www.w3.org/2000/01/rdf-schema#comment> ?comment . FILTER (lang(?comment) = 'en') } .
      OPTIONAL { @person-uri <http://dbpedia.org/ontology/wikiPageExternalLink> ?website . FILTER( !regex(str(?website), "dbpedia", "i"))} .
    } LIMIT 4})

;; Query for URIs and English comments matching a person name.
(define (sparql-dbpedia-person-uri person-name)
  @string-append{
    SELECT DISTINCT ?personuri ?comment {
      ?personuri <http://xmlns.com/foaf/0.1/name> "@person-name"@"@"en .
      ?personuri <http://www.w3.org/2000/01/rdf-schema#comment> ?comment .
      FILTER (lang(?comment) = 'en') .
    }})

;; Send the URL-encoded query to DBPedia and parse the JSON response.
(define (sparql-query->hash query)
  (call/input-url
   (string->url
    (string-append "https://dbpedia.org/sparql?query=" (uri-encode query)))
   get-pure-port
   (lambda (port)
     (string->jsexpr (port->string port)))
   '("Accept: application/json")))

;; Return one list per query variable; each element is (variable value).
;; Keys are looked up by name instead of relying on the order of
;; hash->list, which is unspecified.
(define (json->listvals a-hash)
  (let ((vars (hash-ref (hash-ref a-hash 'head) 'vars))
        (bindings (hash-ref (hash-ref a-hash 'results) 'bindings)))
    (for/list ([var vars])
      (for/list ([binding bindings])
        (list var
              (hash-ref (hash-ref binding (string->symbol var)) 'value))))))

(define extract-name-uri-and-comment
  (lambda (l1 l2)
    (map ;; perform a "zip" action on two lists
     (lambda (a b)
       (list (second a) (second b)))
     l1 l2)))
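The zip step performed by extract-name-uri-and-comment is easiest to see on a small input. The sketch below repeats the function so that it runs on its own; the sample columns use placeholder comment text rather than real DBPedia output.

```racket
#lang racket

;; Copy of the function from the listing above, repeated here so the
;; example is self-contained.
(define (extract-name-uri-and-comment l1 l2)
  (map (lambda (a b) (list (second a) (second b)))
       l1 l2))

;; Two columns in the (variable value) layout produced by json->listvals.
;; The comments are placeholders.
(define uri-column
  '(("personuri" "http://dbpedia.org/resource/Bill_Gates")
    ("personuri" "http://dbpedia.org/resource/Bill_Gates_Sr.")))
(define comment-column
  '(("comment" "Placeholder comment one.")
    ("comment" "Placeholder comment two.")))

(pretty-print (extract-name-uri-and-comment uri-column comment-column))
```

Each result pairs a URI with its comment and drops the variable names, which is the form the user interface code works with.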
The local file kgn.rkt is the main program for this application.
(require htdp/gui) ;; note: when building executable, choose GRacket, not Racket to get *.app bundle
(require racket/gui/base)
(require racket/match)
(require racket/pretty)
(require scribble/text/wrap)

;; Sample queries:
;;   who is Bill Gates
;;   where is San Francisco
;;   (only who/where queries are currently handled)

;;(require "utils.rkt")
(require nlp)
(require "main.rkt")
(require "dialog-utils.rkt")

(define count-substring
  (compose length regexp-match*))

(define (short-string s)
  (if (< (string-length s) 75)
      s
      (substring s 0 73)))

;; this will be redefined after UI objects are created
(define dummy (lambda (button event) (display "\ndummy\n")))

(let ((query-callback (lambda (button event) (dummy button event))))
  (match-let* ([frame (new frame% [label "Knowledge Graph Navigator"]
                           [height 400] [width 608] [alignment '(left top)])]
               [(list set-new-items-and-show-dialog get-selection-index) ; returns list of 2 callback functions
                (make-selection-functions frame "Test selection list")]
               [query-field (new text-field%
                                 [label " Query:"] [parent frame]
                                 [callback
                                  (lambda (k e)
                                    (if (equal? (send e get-event-type) 'text-field-enter)
                                        (query-callback k e)
                                        #f))])]
               [a-container (new pane%
                                 [parent frame] [alignment '(left top)])]
               [a-message (new message%
                               [parent frame] [label " Generated SPARQL:"])]
               [sparql-canvas (new text-field%
                                   (parent frame) (label "")
                                   [min-width 380] [min-height 200]
                                   [enabled #f])]
               [a-message-2 (new message% [parent frame] [label " Results:"])]
               [results-canvas (new text-field%
                                    (parent frame) (label "")
                                    [min-height 200] [enabled #f])]
               [a-button (new button% [parent a-container]
                              [label "Process: query -> generated SPARQL -> results from DBPedia"]
                              [callback query-callback])])
    (display "\nbefore setting new query-callback\n")
    (set!
     dummy ;; override dummy lambda defined earlier
     (lambda (button event)
       (display "\n+ in query-callback\n")
       (let ((query-helper-results-all
              (ui-query-helper (send query-field get-value))))
         (if (equal? query-helper-results-all #f)
             (let ()
               (send sparql-canvas set-value "no generated SPARQL")
               (send results-canvas set-value "no results"))
             (let* ((sparql-results (first query-helper-results-all))
                    (query-helper-results-uri-and-description
                     (cadr query-helper-results-all))
                    (uris (map first query-helper-results-uri-and-description))
                    (query-helper-results
                     (map second query-helper-results-uri-and-description)))
               (display "\n++ query-helper-results:\n")
               (display query-helper-results)
               (display "\n")
               (if (= (length query-helper-results) 1)
                   (let ()
                     (send sparql-canvas set-value sparql-results)
                     (send results-canvas set-value
                           (string-append
                            (string-join
                             (wrap-line (first query-helper-results) 95) "\n")
                            "\n\n"
                            (first uris))))
                   (if (> (length query-helper-results) 1)
                       (let ()
                         ;; several matches: let the user pick one in the dialog
                         (set-new-items-and-show-dialog
                          (map short-string query-helper-results))
                         (set! query-helper-results
                               (let ((sel-index (get-selection-index)))
                                 (if (> sel-index -1)
                                     (list-ref query-helper-results sel-index)
                                     '(""))))
                         (set! uris (list-ref uris (get-selection-index)))
                         (display query-helper-results)
                         (send sparql-canvas set-value sparql-results)
                         (send results-canvas set-value
                               (string-append
                                (string-join
                                 (wrap-line query-helper-results 95) "\n")
                                "\n\n"
                                uris)))
                       (send results-canvas set-value
                             (string-append "No results for: "
                                            (send query-field get-value))))))))))
    (send frame show #t)))
The two screen shot figures seen earlier show the GUI application running.
Knowledge Graph Navigator Wrap Up
This KGN example was hopefully both interesting to you and simple enough in its implementation to use as a jumping off point for your own projects.
I had the idea for the KGN application because I was spending quite a bit of time manually setting up SPARQL queries for DBPedia (and other public sources like WikiData) and I wanted to experiment with partially automating this process. I have experimented with versions of KGN written in Java, the Hy language (a Lisp running on Python that I wrote a short book on), Swift, and Common Lisp. These implementations take different approaches because I used each one to experiment with different ideas.