Using Google’s Knowledge Graph APIs With LangChain
Google’s Knowledge Graph (KG) is a knowledge base that Google uses to serve relevant information in an info-box beside its search results. It lets the user see an instant answer at a glance. The data is generated automatically from a variety of sources, covering places, people, businesses, and more. In 2013 I worked at Google on an internal project that used their KG.
Google’s public Knowledge Graph Search API lets you find entities in the Google Knowledge Graph. The API uses standard schema.org types and is compliant with the JSON-LD specification. It supports entity search and lookup.
You can use the Knowledge Graph Search API to build applications that make use of Google’s Knowledge Graph. For example, you can use the API to build a search engine that returns results based on the entities in the Knowledge Graph.
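As a minimal sketch of the difference between entity search and lookup, the following builds request URLs without sending them. The helper name build_kg_url and the placeholder key are hypothetical. The "ids" and "types" request parameters are, to my knowledge, part of the public API, but check the current API reference before relying on them.

```python
from urllib.parse import urlencode

SERVICE_URL = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_url(api_key, query=None, ids=None, types=None, limit=10):
    """Build a request URL for a text search or an ID lookup.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    params = {"limit": limit, "key": api_key}
    if query is not None:
        params["query"] = query      # free-text entity search
    if ids:
        params["ids"] = list(ids)    # lookup by Knowledge Graph entity ID
    if types:
        params["types"] = list(types)  # restrict to schema.org types
    # doseq=True repeats a parameter once per list element
    return SERVICE_URL + "?" + urlencode(params, doseq=True)


# Search by name, restricted to people:
print(build_kg_url("YOUR_API_KEY", query="Mark Louis Watson", types=["Person"]))
# Look up a single entity by its ID:
print(build_kg_url("YOUR_API_KEY", ids=["/m/0b6_g82"]))
```

The entity ID used here is the one that appears in the sample response later in this chapter, with the "kg:" prefix removed.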
In the next chapter we also use the public KGs DBPedia and Wikidata. One limitation of Google’s KG APIs is that they are designed for entity lookup (people, places, organizations, etc.). With DBPedia and Wikidata you can use the SPARQL query language to find a wider range of information, such as relationships between entities. You can still use the Google KG APIs to find some entity relationships, e.g., all the movies directed by a particular director or all the books written by a particular author. You can also use the API to find information such as all the people who have worked on a particular movie or all the actors who have appeared in a particular TV show.
Setting Up To Access Google Knowledge Graph APIs
To get an API key for Google’s Knowledge Graph Search API, go to the Google API Console, enable the Google Knowledge Graph Search API, and create an API key to use in your project.
To create your application’s API key, follow these steps:
- Go to the API Console.
- From the projects list, select a project or create a new one.
- If the APIs & services page isn’t already open, open the left side menu and select APIs & services.
- On the left, choose Credentials.
- Click Create credentials and then select API key.
You can then use this API key to make requests to the Knowledge Graph Search APIs.
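When I use Google’s APIs I store the access key in the file ~/.google_api_key and read the key using:
from pathlib import Path

# strip() removes the trailing newline that most editors add to the file
api_key = open(str(Path.home()) + "/.google_api_key").read().strip()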
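If you prefer not to keep a key file, here is a minimal sketch that checks an environment variable first and falls back to the file. The function name load_api_key and the variable name GOOGLE_API_KEY are assumptions made for illustration and are not used elsewhere in this chapter.

```python
import os
from pathlib import Path


def load_api_key(env_var="GOOGLE_API_KEY",
                 key_file=Path.home() / ".google_api_key"):
    """Return the key from the environment, else from the key file.

    Both the variable name and this helper are hypothetical.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    # Fall back to the key file described above.
    return Path(key_file).read_text().strip()
```

Calling load_api_key() with no arguments returns the stripped key from whichever source is available, and raises FileNotFoundError if neither exists.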
You can also use environment variables to store access keys. Here is a code snippet for making an API call to get information about me:
import json
from urllib.parse import urlencode
from urllib.request import urlopen
from pathlib import Path
from pprint import pprint

# Read the API key, stripping any trailing newline
api_key = open(str(Path.home()) + "/.google_api_key").read().strip()
query = "Mark Louis Watson"
service_url = "https://kgsearch.googleapis.com/v1/entities:search"
params = {
    "query": query,
    "limit": 10,
    "indent": True,
    "key": api_key,
}
# Build the request URL, call the API, and parse the JSON-LD reply
url = service_url + "?" + urlencode(params)
response = json.loads(urlopen(url).read())
pprint(response)
The JSON-LD output would look like:
{'@context': {'@vocab': 'http://schema.org/',
              'EntitySearchResult': 'goog:EntitySearchResult',
              'detailedDescription': 'goog:detailedDescription',
              'goog': 'http://schema.googleapis.com/',
              'kg': 'http://g.co/kg',
              'resultScore': 'goog:resultScore'},
 '@type': 'ItemList',
 'itemListElement': [{'@type': 'EntitySearchResult',
                      'result': {'@id': 'kg:/m/0b6_g82',
                                 '@type': ['Thing', 'Person'],
                                 'description': 'Author',
                                 'name': 'Mark Louis Watson',
                                 'url': 'http://markwatson.com'},
                      'resultScore': 43}]}
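The fields of interest sit under itemListElement in the response. A minimal sketch of pulling them out might look like the following. The function name summarize_results is hypothetical, and the sample dictionary is a trimmed copy of the output above so the snippet runs without an API key.

```python
def summarize_results(response):
    """Return a (name, description, score) tuple for each entity found."""
    rows = []
    for element in response.get("itemListElement", []):
        result = element.get("result", {})
        rows.append((result.get("name"),
                     result.get("description"),
                     element.get("resultScore")))
    return rows


# Trimmed copy of the response shown above
sample_response = {
    "@type": "ItemList",
    "itemListElement": [
        {"@type": "EntitySearchResult",
         "result": {"@id": "kg:/m/0b6_g82",
                    "@type": ["Thing", "Person"],
                    "description": "Author",
                    "name": "Mark Louis Watson",
                    "url": "http://markwatson.com"},
         "resultScore": 43},
    ],
}

for name, description, score in summarize_results(sample_response):
    print(f"{name} ({description}), score {score}")
# prints: Mark Louis Watson (Author), score 43
```

Using .get() with defaults keeps the sketch from raising KeyError when a query matches no entities or a result lacks a description.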
To avoid repeating the code for getting entity information from the Google KG, I wrote the utility Google_KG_helper.py, which wraps the previous code in a small library:
"""Client for calling Knowledge Graph Search API."""

import json
from urllib.parse import urlencode
from urllib.request import urlopen
from pathlib import Path

# Read the API key, stripping any trailing newline
api_key = open(str(Path.home()) + "/.google_api_key").read().strip()


# Use the Knowledge Graph Search API to get information
# about a named entity:
def get_entity_info(entity_name):
    service_url = "https://kgsearch.googleapis.com/v1/entities:search"
    params = {
        "query": entity_name,
        "limit": 1,
        "indent": True,
        "key": api_key,
    }
    url = service_url + "?" + urlencode(params)
    response = json.loads(urlopen(url).read())
    return response


# Collect every 'name', 'description', and 'articleBody' value
# found anywhere in the nested JSON response:
def tree_traverse(a_dict):
    ret = []

    def recur(dict_2, a_list):
        if isinstance(dict_2, dict):
            for key, value in dict_2.items():
                if key in ['name', 'description', 'articleBody']:
                    a_list += [value]
                recur(value, a_list)
        if isinstance(dict_2, list):
            for x in dict_2:
                recur(x, a_list)

    recur(a_dict, ret)
    return ret


def get_context_text(entity_name):
    json_data = get_entity_info(entity_name)
    return ' '.join(tree_traverse(json_data))


if __name__ == "__main__":
    print(get_context_text("Bill Clinton"))
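To see what tree_traverse collects without calling the API, here is a small offline sketch. The function is repeated from the helper so the snippet runs on its own, and the hand-built sample dictionary is an assumption that mirrors the shape of the response shown earlier.

```python
def tree_traverse(a_dict):
    """Collect 'name', 'description' and 'articleBody' values, depth first."""
    ret = []

    def recur(node, a_list):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in ['name', 'description', 'articleBody']:
                    a_list += [value]
                recur(value, a_list)
        if isinstance(node, list):
            for x in node:
                recur(x, a_list)

    recur(a_dict, ret)
    return ret


# Hand-built sample in the shape of a Knowledge Graph response
sample = {
    "itemListElement": [
        {"result": {"@type": ["Thing", "Person"],
                    "description": "Author",
                    "name": "Mark Louis Watson"},
         "resultScore": 43},
    ],
}

print(tree_traverse(sample))            # ['Author', 'Mark Louis Watson']
print(' '.join(tree_traverse(sample)))  # Author Mark Louis Watson
```

Values come back in the order the keys appear in the JSON, which is why the description precedes the name in the joined context text.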
The main test script is in the file Google_Knowledge_Graph_Search.py:
"""Example of Python client calling the
Knowledge Graph Search API."""

from llama_index.core.schema import Document
from llama_index.core import VectorStoreIndex
import Google_KG_helper


def kg_search(entity_name, *questions):
    ret = ""
    context_text = Google_KG_helper.get_context_text(entity_name)
    print(f"Context text: {context_text}")
    # Build a vector index over the KG text for question answering
    doc = Document(text=context_text)
    index = VectorStoreIndex.from_documents([doc])
    # Create the query engine once and reuse it for every question
    query_engine = index.as_query_engine()
    for question in questions:
        response = query_engine.query(question)
        ret += f"QUESTION: {question}\nRESPONSE: {response}\n"
    return ret


if __name__ == "__main__":
    s = kg_search("Bill Clinton", "When was Bill president?")
    print(s)
The example output is:
$ python Google_Knowledge_Graph_Search.py
Context text: William Jefferson Clinton is an American politician who served as the 42nd president of the United States from 1993 to 2001. A member of the Democratic Party, he previously served as Governor of Arkansas from 1979 to 1981 and again from 1983 to 1992. 42nd U.S. President Bill Clinton
QUESTION: When was Bill president?
RESPONSE: Bill Clinton was president from 1993 to 2001.
Accessing Knowledge Graphs from Google, DBPedia, and Wikidata allows you to integrate real-world facts and knowledge with your applications. While I mostly work in the field of deep learning, I also frequently use Knowledge Graphs in my work and in my personal research. I think that you, dear reader, might find accessing highly structured data in KGs to be more reliable, and in many cases simpler, than using web scraping.