Knowledge Graph Navigator Text-Based User Interface
We developed the Knowledge Graph Navigator (which I will often refer to as KGN) common library in the last chapter. Here we write a simple console, or text-based, user interface for the library. In later chapters we implement UIs using LispWorks CAPI, McCLIM, and Franz Common Graphics.
This Quicklisp library can be found in a separate GitHub repository, https://github.com/mark-watson/kgn-text-ui, and contains the files:
- kgn-text-ui.asd - specifies dependencies, including the KGN common library
- kgn-text-ui.lisp - contains the complete user interface
- package.lisp - defines the kgn-text-ui package
We start by looking at sample output using the text user interface and then look at the implementation.
Example Output
We will look at a very simple example query, “Bill Gates worked at Microsoft and his competitor was IBM”, that contains only a few entities. In practice, I usually use queries with five to ten entities to get more discovered relationships. For brevity I removed much of the generated output from the following listing, especially many of the SPARQL queries that the code generates and uses. Comments on the output appear after the listing.
1 $ sbcl
2 * (ql:quickload :kgn-text-ui)
3 ; Loading "kgn-common"
4 ; Loading "sqlite"
5 ; Loading "cl-json"
6 ; Loading "drakma"
7
8 * (kgn-text-ui:kgn-text-ui)
9
10 "Enter entity names (people, places, companies, etc.":
11 Bill Gates worked at Microsoft and his competitor was IBM
12
13 - - - - Enter zero or more indices for your desired selections:
14
15 0 - "William Henry Gates III (born October 28, 1955) is an American business magnate, software developer, investor, author, and philanthropist. He is a co-founder of Microsoft, along with his late childhood friend Paul Allen. During his career at Microsoft, Gates held the positions of chairman, chief executive officer (CEO), president and chief software architect, while also being the largest individual shareholder until May 2014. He is considered one of the best known entrepreneurs of the microcomputer revolution of the 1970s and 1980s."
16
17 1 - "Harry Roy Lewis (born 1947) is an American computer scientist, mathematician, and university administrator known for his research in computational logic, textbooks in theoretical computer science, and writings on computing, higher education, and technology. He is Gordon McKay Professor of Computer Science at Harvard University, and was Dean of Harvard College from 1995 to 2003. A new professorship in Engineering and Applied Sciences, endowed by a former student, will be named for Lewis and his wife upon their retirements."
18
19 2 - "Cascade Investment, L.L.C. is an American holding company and private investment firm headquartered in Kirkland, Washington, United States. It is controlled by Bill Gates, and managed by Michael Larson. More than half of Gates' fortune is held in assets outside his holding of Microsoft shares. Cascade is the successor company to Dominion Income Management, the former investment vehicle for Gates' holdings, which was managed by convicted felon Andrew Evans."
20
21 3 - "Jerry P. Dyer (born May 3, 1959) is an American politician and former law enforcement officer. He is the 26th and current mayor of Fresno, California. Previously, he served as the chief of the Fresno Police Department."
22
23 0
24
25 - - - - Enter zero or more indices for your desired selections:
26
27 0 - "Kenexa, an IBM Company, provides employment and retention services. This includes recruitment process outsourcing onboarding tools, employee assessment, abilities assessment for employment candidates (Kenexa Prove It); and Kenexa Interview Builder, a structured interview archive with example questions."
28
29 1 - "Sequent Computer Systems was a computer company that designed and manufactured multiprocessing computer systems. They were among the pioneers in high-performance symmetric multiprocessing (SMP) open systems, innovating in both hardware (e.g., cache management and interrupt handling) and software (e.g., read-copy-update). Vestiges of Sequent's innovations live on in the form of data clustering software from PolyServe (subsequently acquired by HP), various projects within OSDL, IBM contributions to the Linux kernel, and claims in the SCO v. IBM lawsuit."
30
31 2 - "i2 Limited was the UK-based arm of software company i2 Group which produced visual intelligence and investigative analysis software for military intelligence, law enforcement and commercial agencies. After a number of acquisitions, in 2011 it became part of IBM."
32
33 3 - "The International Technology Alliance in Distributed Analytics and Information Sciences (DAIS-ITA) is a research program initiated by the UK Ministry of Defence (United Kingdom) (MOD) and the US Army Research Laboratory (ARL), in September 2016. It is led by IBM Research in the U.S. and IBM Hursley in the UK. DAIS ITA is the second International Technology Alliance started by the two countries, succeeding the previous ten year alliance NIS-ITA, which was of similar nature."
34
35 4 - "The International Technology Alliance in Network and Information Sciences (NIS-ITA) was a research program initiated by the UK Ministry of Defence (United Kingdom) (MoD) and the US Army Research Laboratory (ARL), which was active for 10 years from May 2006 to May 2016. It was led by IBM Research in the U.S. and IBM Hursley in the UK. NIS ITA was the first International Technology Alliance started by the two countries."
36
37 5 - "Applix Inc. was a computer software company founded in 1983 based in Westborough, Massachusetts that published Applix TM1, a multi-dimensional online analytical processing (MOLAP) database server, and related presentation tools, including Applix Web and Applix Executive Viewer. Together, Applix TM1, Applix Web and Applix Executive Viewer were the three core components of the Applix Business Analytics Platform. (Executive Viewer was subsequently discontinued by IBM.)"
38
39 6 - "Ounce Labs (an IBM company) is a Waltham, Massachusetts-based security software vendor. The company was founded in 2002 and created a software analysis product that analyzes source code to identify and remove security vulnerabilities. The security software looks for a range of vulnerabilities that leave an application open to attack. Customers have included GMAC, Lockheed Martin, and the U.S. Navy. On July 28, 2009, Ounce was acquired by IBM, for an undisclosed sum, with the intention of integrating it into IBM's Rational Software business."
40
41 7 - "IBM Watson Health is a digital tool that helps clients facilitate medical research, clinical research, and healthcare solutions, through the use of artificial intelligence, data, analytics, cloud computing, and other advanced information technology. It is a division of the International Business Machines Corporation, (IBM), an American multinational information technology company headquartered in Armonk, New York."
42
43 8 - "International Business Machines Corporation (IBM) is an American multinational technology corporation headquartered in Armonk, New York, with operations in over 171 countries. The company began in 1911, founded in Endicott, New York by trust businessman Charles Ranlett Flint, as the Computing-Tabulating-Recording Company (CTR) and was renamed \"International Business Machines\" in 1924. IBM is incorporated in New York."
44
45 9 - "Microsoft Corporation is an American multinational technology corporation which produces computer software, consumer electronics, personal computers, and related services. Its best known software products are the Microsoft Windows line of operating systems, the Microsoft Office suite, and the Internet Explorer and Edge web browsers. Its flagship hardware products are the Xbox video game consoles and the Microsoft Surface lineup of touchscreen personal computers. Microsoft ranked No. 21 in the 2020 Fortune 500 rankings of the largest United States corporations by total revenue; it was the world's largest software maker by revenue as of 2016. It is considered one of the Big Five companies in the U.S. information technology industry, along with Amazon, Google (Alphabet), Apple, and Facebook ("
46
47 10 - "The CSS Working Group (Cascading Style Sheets Working Group) is a working group created by the World Wide Web Consortium (W3C) in 1997, to tackle issues that had not been addressed with CSS level 1. As of December 2019, the CSSWG had 142 members. The working group is co-chaired by and ."
48
49 11 - "The AMD Professional Gamers League (PGL), founded around 1997, was one of the first professional computer gaming eSports leagues. The PGL was run by Total Entertainment Network and was sponsored by AMD. The first professional tournament they held was for StarCraft in September 1997. The league was official unveiled at a press conference at Candlestick Park in San Francisco on November 3, 1997. It was sponsored by Microsoft, Nvidia, and Levi Strauss & Co. The organization raised over $1.2mil USD in sponsorship money."
50
51 12 - "Secure Islands Technologies Ltd. was an Israeli privately held technology company headquartered in Beit Dagan which was subsequently acquired by Microsoft. The company develops and markets Information Protection and Control (IPC) solutions."
52
53 13 - "Microsoft Innovation Centers (MICs) are local government organizations, universities, industry organizations, or software or hardware vendors who partner with Microsoft with a common goal to foster the growth of local software economies. These are state of the art technology facilities which are open to students, developers, IT professionals, entrepreneurs, startups and academic researchers. While each Center tunes its programs to local needs, they all provide similar content and services designed to accelerate technology advances and stimulate local software economies through skills and professional training, industry partnerships and innovation. As of 10 September 2010, there are 115 Microsoft Innovation Centers worldwide, most of which are open to the public. Recently it was reported th"
54
55 14 - "Press Play ApS was a Danish video game development studio based in central Copenhagen in Denmark. Since 2006, Press Play have released five titles, including the Max & the Magic Marker, Max: The Curse of Brotherhood and Kalimba. On November 10, 2016, Flashbulb acquired Press Play and its library of games to republish under the Flashbulb name including Kalimba, Tentacles: Enter the Mind, and Max: The Curse of Brotherhood."
56
57 8 9
58
59 - - - ENTITY TYPE: people - - -
60
61 SPARQL to get PERSON data for <http://dbpedia.org/resource/Bill_Gates>:
62
63 "SELECT DISTINCT ?label ?comment@@ (GROUP_CONCAT (DISTINCT ?birthplace; SEPARATOR=' | ') AS ?birthplace) @@ (GROUP_CONCAT (DISTINCT ?almamater; SEPARATOR=' | ') AS ?almamater) @@ (GROUP_CONCAT (DISTINCT ?spouse; SEPARATOR=' | ') AS ?spouse) { @@ <http://dbpedia.org/resource/Bill_Gates> <http://www.w3.org/2000/01/rdf-schema#comment> ?comment .@@
64 FILTER (lang(?comment) = 'en') . @@ OPTIONAL { <http://dbpedia.org/resource/Bill_Gates> <http://dbpedia.org/ontology/birthPlace> ?birthplace } . @@ OPTIONAL { <http://dbpedia.org/resource/Bill_Gates> <http://dbpedia.org/ontology/almaMater> ?almamater } . @@ OPTIONAL { <http://dbpedia.org/resource/Bill_Gates> <http://dbpedia.org/ontology/spouse> ?spouse } . @@ OPTIONAL { <http://dbpedia.org/resource/Bill_Gates> <http://www.w3.org/2000/01/rdf-schema#label> ?label .@@ FILTER (lang(?label) = 'en') } @@ } LIMIT 10@@"
65
66
67 label: Bill Gates
68
69 comment: William Henry Gates III (born October 28, 1955) is an American business magnate, software developer, investor, author, and philanthropist. He is a co-founder of Microsoft, along with his late childhood friend Paul Allen. During his career at Microsoft, Gates held the positions of chairman, chief executive officer (CEO), president and chief software architect, while also being the largest individual shareholder until May 2014. He is considered one of the best known entrepreneurs of the microcomputer revolution of the 1970s and 1980s.
70
71 birthplace: http://dbpedia.org/resource/Seattle | http://dbpedia.org/resource/Washington_(state)
72
73 almamater:
74
75 spouse: http://dbpedia.org/resource/Melinda_French_Gates
76
77 - - - ENTITY TYPE: companies - - -
78
79 SPARQL to get COMPANY data for <http://dbpedia.org/resource/IBM>:
80
81
82 "SELECT DISTINCT ?label ?comment (GROUP_CONCAT (DISTINCT ?industry; SEPARATOR=' | ') AS ?industry)@@ (GROUP_CONCAT (DISTINCT ?netIncome; SEPARATOR=' | ') AS ?netIncome)@@ (GROUP_CONCAT (DISTINCT ?numberOfEmployees; SEPARATOR=' | ') AS ?numberOfEmployees) {@@ <http://dbpedia.org/resource/IBM> <http://www.w3.org/2000/01/rdf-schema#comment> ?comment .@@
83 FILTER (lang(?comment) = 'en') .@@ OPTIONAL { <http://dbpedia.org/resource/IBM> <http://dbpedia.org/ontology/industry> ?industry } .@@ OPTIONAL { <http://dbpedia.org/resource/IBM> <http://dbpedia.org/ontology/netIncome> ?netIncome } .@@ OPTIONAL { <http://dbpedia.org/resource/IBM> <http://dbpedia.org/ontology/numberOfEmployees> ?numberOfEmployees } .@@ OPTIONAL { <http://dbpedia.org/resource/IBM> <http://www.w3.org/2000/01/rdf-schema#label> ?label . FILTER (lang(?label) = 'en') } @@ } LIMIT 30@@"
84
85
86 label: IBM
87
88 comment: International Business Machines Corporation (IBM) is an American multinational technology corporation headquartered in Armonk, New York, with operations in over 171 countries. The company began in 1911, founded in Endicott, New York by trust businessman Charles Ranlett Flint, as the Computing-Tabulating-Recording Company (CTR) and was renamed "International Business Machines" in 1924. IBM is incorporated in New York.
89
90 industry: http://dbpedia.org/resource/Artificial_intelligence | http://dbpedia.org/resource/Automation | http://dbpedia.org/resource/Blockchain | http://dbpedia.org/resource/Cloud_computing | http://dbpedia.org/resource/Computer_hardware | http://dbpedia.org/resource/Quantum_computing | http://dbpedia.org/resource/Robotics | http://dbpedia.org/resource/Software
91
92 net-income: 5.59E9
93
94 number-of-employees: 345900
95
96 SPARQL to get COMPANY data for <http://dbpedia.org/resource/Microsoft>:
97
98
99 "SELECT DISTINCT ?label ?comment (GROUP_CONCAT (DISTINCT ?industry; SEPARATOR=' | ') AS ?industry)@@ (GROUP_CONCAT (DISTINCT ?netIncome; SEPARATOR=' | ') AS ?netIncome)@@ (GROUP_CONCAT (DISTINCT ?numberOfEmployees; SEPARATOR=' | ') AS ?numberOfEmployees) {@@ <http://dbpedia.org/resource/Microsoft> <http://www.w3.org/2000/01/rdf-schema#comment> ?comment .@@
100 FILTER (lang(?comment) = 'en') .@@ OPTIONAL { <http://dbpedia.org/resource/Microsoft> <http://dbpedia.org/ontology/industry> ?industry } .@@ OPTIONAL { <http://dbpedia.org/resource/Microsoft> <http://dbpedia.org/ontology/netIncome> ?netIncome } .@@ OPTIONAL { <http://dbpedia.org/resource/Microsoft> <http://dbpedia.org/ontology/numberOfEmployees> ?numberOfEmployees } .@@ OPTIONAL { <http://dbpedia.org/resource/Microsoft> <http://www.w3.org/2000/01/rdf-schema#label> ?label . FILTER (lang(?label) = 'en') } @@ } LIMIT 30@@"
101
102
103 label: Microsoft
104
105 comment: Microsoft Corporation is an American multinational technology corporation which produces computer software, consumer electronics, personal computers, and related services. Its best known software products are the Microsoft Windows line of operating systems, the Microsoft Office suite, and the Internet Explorer and Edge web browsers. Its flagship hardware products are the Xbox video game consoles and the Microsoft Surface lineup of touchscreen personal computers. Microsoft ranked No. 21 in the 2020 Fortune 500 rankings of the largest United States corporations by total revenue; it was the world's largest software maker by revenue as of 2016. It is considered one of the Big Five companies in the U.S. information technology industry, along with Amazon, Google (Alphabet), Apple, and Facebook (
106
107 industry: http://dbpedia.org/resource/Cloud_computing | http://dbpedia.org/resource/Computer_hardware | http://dbpedia.org/resource/Consumer_electronics | http://dbpedia.org/resource/Corporate_venture_capital | http://dbpedia.org/resource/Internet | http://dbpedia.org/resource/Social_networking_service | http://dbpedia.org/resource/Software_development | http://dbpedia.org/resource/Video_game_industry
108
109 net-income: 6.06E10
110
111 number-of-employees: 182268
112
113 DISCOVERED RELATIONSHIP LINKS:
114
115 <http://dbpedia.org/resource/Bill_Gates>
116 <http://dbpedia.org/ontology/knownFor>
117 <http://dbpedia.org/resource/Microsoft> .
118
119
120 <http://dbpedia.org/resource/Microsoft>
121 <http://dbpedia.org/ontology/foundedBy>
122 <http://dbpedia.org/resource/Bill_Gates> .
123
124
125 <http://dbpedia.org/resource/Microsoft>
126 <http://dbpedia.org/property/founders>
127 <http://dbpedia.org/resource/Bill_Gates> .
128
129 "Enter entity names (people, places, companies, etc.":
On line 11 I input the test phrase “Bill Gates worked at Microsoft and his competitor was IBM”. In lines 13-21 the test program prints matching person entities from DBPedia, indexed starting at 0. On line 23 I entered 0 to choose just the first entity, “William Henry Gates III”.
The prompt on line 25 asks the user to enter the indices for the company DBPedia entities they want to use. These companies are listed in lines 27-55. On line 57 I entered “8 9” to select two entities to use.
Lines 61-64 show the automatically generated SPARQL query to get information about Bill Gates. This information is printed on lines 67-75. Lines 79-111 show the generated SPARQL queries and results for IBM and Microsoft, which we will not discuss further.
Lines 113-127 show the links discovered between the entities in the input text.
In the LispWorks CAPI user interface developed in the next chapter I use two text output stream window panes, one for the generated SPARQL and one for the results.
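The generated SPARQL in the listing above is printed as one long string containing @@ markers. I am assuming here that @@ stands for a line break in the generated query; this chapter does not say so, so treat that as a guess. Under that assumption, the following minimal sketch prints a query one clause per line. The function names split-on-marker and print-sparql-lines are made up for this example and are not part of the KGN libraries.

```lisp
;; Hypothetical helpers for illustration; not part of kgn-common or kgn-text-ui.
;; ASSUMPTION: "@@" in a generated query marks a line break.

(defun split-on-marker (text &optional (marker "@@"))
  "Split TEXT at each occurrence of MARKER, trim whitespace, drop empty pieces."
  (let ((pieces '())
        (start 0))
    (loop
      (let ((pos (search marker text :start2 start)))
        ;; When POS is NIL, SUBSEQ returns the rest of TEXT.
        (push (string-trim '(#\Space #\Tab #\Newline) (subseq text start pos))
              pieces)
        (if pos
            (setf start (+ pos (length marker)))
            (return))))
    ;; A trailing marker leaves an empty last piece; remove such pieces.
    (remove "" (nreverse pieces) :test #'string=)))

(defun print-sparql-lines (query &optional (stream *standard-output*))
  "Print QUERY to STREAM, one marker-delimited piece per line."
  (dolist (line (split-on-marker query))
    (format stream "~A~%" line)))

;; Example use:
;; (print-sparql-lines "SELECT ?s {@@ ?s ?p ?o .@@ } LIMIT 10@@")
```

A function like this could be passed wherever the text interface currently prints the raw query string, but that is a design choice and not something the library requires.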
Text User Interface Implementation
We will skip the kgn-text-ui.asd and package.lisp files for this library and look at src/kgn-text-ui/kgn-text-ui.lisp in its entirety. When entities are identified in the input text, we find candidate DBPedia entity URIs and present them to the user. Each DBPedia description is preceded by an index starting at 0. The user enters the indices of the entities to process further. For example, in the listing in the previous section I entered “8 9” to select two company URIs.
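Before reading the full listing, it may help to see the index selection idea in isolation. The following is a minimal, self-contained sketch that uses only standard Common Lisp. The names parse-index-line and select-by-indices are hypothetical and do not appear in the library, which uses myutils:tokenize-string instead.

```lisp
;; Hypothetical helpers for illustration; not part of kgn-text-ui.

(defun parse-index-line (line)
  "Return the non-negative integers found in LINE, in order of appearance."
  (let ((indices '())
        (start 0))
    (loop
      ;; Find the next digit; stop when there are no more.
      (let ((token-start (position-if #'digit-char-p line :start start)))
        (unless token-start
          (return (nreverse indices)))
        ;; PARSE-INTEGER also returns the position where parsing stopped.
        (multiple-value-bind (value next)
            (parse-integer line :start token-start :junk-allowed t)
          (push value indices)
          (setf start next))))))

(defun select-by-indices (items indices)
  "Return the elements of ITEMS at the valid positions listed in INDICES."
  (loop for i in indices
        when (< -1 i (length items))
          collect (nth i items)))

;; Example use:
;; (select-by-indices '("IBM" "Microsoft" "Kenexa") (parse-index-line "0 1"))
```

This sketch skips anything that is not a digit and ignores out-of-range indices. The code in the listing below is stricter about input: it calls parse-integer on every token, which signals an error when a token is not an integer.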
  1  (in-package #:kgn-text-ui)
  2
  3  (defun pprint-results (results)
  4    (dolist (result (car results))
  5      (terpri)
  6      (format t "~A:" (first result))
  7      (format t " ~A~%" (second result))))
  8
  9
 10  (defun multiple-selections (sel-list)
 11    (if (not (null sel-list))
 12        (let ()
 13          (pprint sel-list)
 14          (format t
 15                  "~%- - - - Enter zero or more indices for your desired selections:~%~%")
 16          (let ((count 0))
 17            (dolist (sel sel-list)
 18              (format t "~A - ~S ~%~%" count (cadr (assoc :comment (car sel))))
 19              (setf count (1+ count))))
 20          (let* ((line (read-line))
 21                 (indices
 22                   (if (> (length line) 0)
 23                       (mapcar
 24                        #'parse-integer
 25                        (myutils:tokenize-string line)))))
 26            (print indices)
 27            ;(dolist (index indices)
 28            ;  (setf ret (cons (nth index str-list)
 29            indices))))
 30
 31  ;; (kgn-text-ui::multiple-selections
 32  ;;    '("Option 1" "Option 2" "And yet another option 3"))
 33
 34
 35  (defun prompt-selection-list (a-list-of-choices)
 36    ;; e.g., '((:people (("11" "data1") ("22" "data2"))) (:places (("p1" "data3"))))
 37    (let (ret)
 38      (dolist (choice a-list-of-choices)
 39        (setf choice (remove-if #'null choice))
 40        (let* ((topic-type (car choice))
 41               (choice-list-full (rest choice))
 42               (choice-list (remove-duplicates
 43                             (map 'list #'(lambda (z)
 44                                            (list
 45                                             z
 46                                             (string-shorten
 47                                              (kgn-common:clean-comment
 48                                               (kgn-common:clean-comment (cadr z)))
 49                                              140 :first-remove-stop-words t)))
 50                                  ;; top level list flatten:
 51                                  (apply #'append choice-list-full))
 52                             :test #'equal)))
 53          (let (ret2
 54                (dialog-results (multiple-selections choice-list)))
 55            (dolist (index dialog-results)
 56              (setf ret2 (cons (nth index choice-list) ret2)))
 57            (if (> (length ret2) 0)
 58                (setf ret (cons (list topic-type (reverse ret2)) ret))))))
 59      (reverse ret)))
 60
 61  ;; (kgn-text-ui::prompt-selection-list
 62  ;;    '((:people (("11" "data1") ("22" "data2")))
 63  ;;      (:places (("p1" "data3") ("p2" "data4") ("p3" "data5")))))
 64  ;; (kgn-text-ui::prompt-selection-list
 65  ;;    (get-entity-data-helper "Bill Gates went to Seattle to Microsoft"))
 66
 67  (defun colorize-sparql (str &key (stream t))
 68    " this could be used to colorize text (as it is in kgn-capi-ui example)"
 69    ;;(declare (ignore message-stream))
 70    (declare (ignore stream))
 71    (format t "~%~S~%" str))
 72
 73  (defun get-string-from-user (text-prompt)
 74    (format t "~%~S:~%" text-prompt)
 75    (read-line))
 76
 77
 78  ;; Main function
 79
 80  (defun kgn-text-ui ()
 81    (let (prompt
 82          (message-stream t)
 83          (results-stream t))
 84      (loop
 85        while
 86        (>
 87         (length
 88          (setf prompt
 89                (get-string-from-user
 90                 "Enter entity names (people, places, companies, etc.")))
 91         0)
 92        do
 93        (let* ((entity-data (get-entity-data-helper prompt :message-stream t)))
 94          (let ((user-selections (prompt-selection-list entity-data)))
 95            (dolist (ev user-selections)
 96              (if (> (length (cadr ev)) 0)
 97                  (let ()
 98                    (terpri results-stream)
 99                    (format results-stream "- - - ENTITY TYPE: ~A - - -" (car ev))
100                    ;;(terpri results-stream)
101                    (dolist (uri (cadr ev))
102                      (setf uri (assoc :s (car uri)))
103                      (case (car ev)
104                        (:people
105                         (pprint-results
106                          (kgn-common:dbpedia-get-person-detail
107                           uri
108                           :message-stream message-stream
109                           :colorize-sparql-function #'colorize-sparql)))
110                        (:companies
111                         (pprint-results
112                          (kgn-common:dbpedia-get-company-detail uri
113                           :message-stream message-stream
114                           :colorize-sparql-function #'colorize-sparql)))
115                        (:countries
116                         (pprint-results
117                          (kgn-common:dbpedia-get-country-detail uri
118                           :message-stream message-stream
119                           :colorize-sparql-function #'colorize-sparql)))
120                        (:cities
121                         (pprint-results
122                          (kgn-common:dbpedia-get-city-detail uri
123                           :message-stream message-stream
124                           :colorize-sparql-function #'colorize-sparql)))
125                        (:products
126                         (pprint-results
127                          (kgn-common:dbpedia-get-product-detail uri
128                           :message-stream message-stream
129                           :colorize-sparql-function #'colorize-sparql)))))))))))))
130
131  ;; (kgn-text-ui:kgn-text-ui)
The utility function multiple-selections, listed in lines 10-29, displays a list of choices with a zero-based index for each item, reads one line of input, and returns the zero or more indices that the user entered. The function prompt-selection-list, listed in lines 35-59, calls multiple-selections once for each entity type and collects the selected items by type.
The commented-out code in lines 61-65 can be used to test these two functions.
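Functions that call read-line are awkward to test by hand. One option, sketched here under my own assumptions, is to rebind *standard-input* to a string stream so that the prompts are answered from a script. The functions ask-for-line and call-with-scripted-input are hypothetical and written only for this example; ask-for-line merely imitates the shape of get-string-from-user.

```lisp
;; Hypothetical helpers for illustration; not part of kgn-text-ui.

(defun ask-for-line (prompt)
  "Print PROMPT and return the next line from *standard-input* (\"\" at end of input)."
  (format t "~%~A:~%" prompt)
  (read-line *standard-input* nil ""))

(defun call-with-scripted-input (input-text function)
  "Call FUNCTION with *standard-input* bound to a stream reading INPUT-TEXT."
  ;; *STANDARD-INPUT* is a special variable, so this binding is dynamic and
  ;; is seen by every READ-LINE call made while FUNCTION runs.
  (with-input-from-string (*standard-input* input-text)
    (funcall function)))

;; Example use: answer two prompts in a row, one line each.
;; (call-with-scripted-input
;;  (format nil "0~%8 9~%")
;;  (lambda ()
;;    (list (ask-for-line "people") (ask-for-line "companies"))))
```

Because multiple-selections calls read-line with no arguments, it reads from *standard-input*, so the same wrapper should also be able to drive prompt-selection-list once the library is loaded. I have not shown that here because it requires the KGN libraries.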
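The main function kgn-text-ui is listed in lines 80-129. It loops until the user enters an empty line, and for each selected entity it calls the kgn-common detail function that matches the entity type.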
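The case form in the main function has five branches that differ only in which detail function they call. As a design alternative, and not how the library is written, the dispatch could be driven by a lookup table. The sketch below uses two stand-in functions, fake-person-detail and fake-company-detail, in place of the real kgn-common functions so that it runs without any dependencies; all names in it are hypothetical.

```lisp
;; Hypothetical sketch of table-driven dispatch; not part of kgn-text-ui.

;; Stand-ins for the kgn-common detail functions, so the example is runnable.
(defun fake-person-detail (uri)
  (list :person uri))

(defun fake-company-detail (uri)
  (list :company uri))

(defparameter *detail-functions*
  (list (cons :people #'fake-person-detail)
        (cons :companies #'fake-company-detail))
  "Alist mapping an entity type keyword to the function that fetches its details.")

(defun entity-detail (entity-type uri)
  "Call the handler registered for ENTITY-TYPE on URI; return NIL if there is none."
  (let ((handler (cdr (assoc entity-type *detail-functions*))))
    (when handler
      (funcall handler uri))))

;; Example use:
;; (entity-detail :people "http://dbpedia.org/resource/Bill_Gates")
```

With a table like this, supporting a new entity type means adding one entry instead of another case branch. The explicit case form in the listing has the advantage that every call is visible in one place.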
Wrap-up
In the previous chapter we implemented the Knowledge Graph Navigator library. Here we developed a text-based user interface. In the next chapter we use the library to develop a LispWorks-specific CAPI user interface.