Understanding AI and Some Key Terminology

AI comes with a plethora of technology and terminology, much of it inscrutable to all but data scientists. Users of Chat AI don’t require an in-depth knowledge of AI terms or of the technical concepts involved. The system’s conversational nature allows intuitive interaction without specialized background knowledge of how things work. Focusing on what Chat AI can actually do is more important.
In preparing this book I’ve struggled with what would be the professional thing for me to do as an author of a book about AI. The conventional approach is to provide a short explanation of the science and a review of frequently-used terms.
I’m not going to do that.
I’m going to offer here a few external links to what I think are some reasonably comprehensible short descriptions of AI basics.
What’s the future of AI?: McKinsey & Co. (April 2024) has a good set of explainers.
Likewise Gartner’s Generative AI (undated) isn’t bad.
Futurepedia offers a not-bad summary of AI Fundamentals (December 2024).
Having disposed of the how-to, I’m now going to introduce some terms that I do think are valuable to understand. Not because you need to know them to use the software. Only that this set of terms points to some key aspects of how the current generation of AI actually operates.
My use case for tackling these terms and concepts is authors and publishers who (i) want to go a level deeper on AI, for whatever reason, or (ii) want to understand the context of the current criticisms of AI, or (iii) want to contribute to strategic discussions of how their colleagues or organizations should approach AI.
In other words, this is not what you need to know, but, rather, what you might like to know. Here they are, in non-alphabetical order:
Prompts and Prompting
You can open up Chat AI software and just type in a question (very much as you do currently for a Google search). For Chat AI that’s called a “prompt.” But “prompting” has developed into something more elaborate, a skill-set for how to structure your AI conversations to achieve optimal results. (Much more on this below.)
Large Language Model (LLM)
Large Language Models work by analyzing huge amounts of (mostly) written material, allowing them to predict what words or sentences should come next in a conversation or a piece of writing. They don’t ‘understand’ language in the human sense. Instead, they process text by breaking it down into smaller pieces (called tokens) and converting the tokens into numbers. They process the text as numbers and produce more numbers, which are then converted back into text on output. That’s an overly simplified explanation of why Chat AI does not ‘contain’ copyrighted work: it’s built with numbers that represent a vast abstraction from the underlying texts.
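For the curious, the tokens-into-numbers idea can be made concrete with a toy sketch. This is not how real tokenizers work (they use schemes like byte-pair encoding and split words into sub-word pieces); it simply illustrates the round trip from text to numbers and back:

```python
# Toy illustration only: split on whitespace and give each distinct
# token an integer ID, in order of first appearance. Real LLM
# tokenizers are far more sophisticated.

def build_vocab(text):
    """Map each distinct token to an integer ID."""
    vocab = {}
    for token in text.split():
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab

def encode(text, vocab):
    """Convert text into the numbers the model actually processes."""
    return [vocab[token] for token in text.split()]

def decode(ids, vocab):
    """Convert numeric output back into text."""
    reverse = {i: t for t, i in vocab.items()}
    return " ".join(reverse[i] for i in ids)

text = "the cat sat on the mat"
vocab = build_vocab(text)
ids = encode(text, vocab)
print(ids)                  # [0, 1, 2, 3, 0, 4]
print(decode(ids, vocab))   # the cat sat on the mat
```

Note that the model never sees the words themselves, only the IDs; everything it “knows” about language is encoded in the statistical relationships among those numbers.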
LLMs are trained on how language is typically used and then generate responses based on this understanding. We tend to underestimate just how predictable most language is. Chat AI can generate text that is (sometimes shockingly) similar to existing literature, but, by design, it doesn’t have the capability to retrieve specific excerpts or copies of copyrighted texts. (I know, many of you have heard about the New York Times lawsuit against OpenAI—the Times was able to get ChatGPT to regurgitate some portions of previously-published articles verbatim. That was a bug that’s been mostly fixed.)
I think it’s important to have at least a sense of the way LLMs work with language. This article on LinkedIn is about how AI handles translation, but it serves as a simple primer on the LLM process. A more comprehensive “jargon free” explanation can be found here.
Generative AI
The most important thing to understand about this term is the “generative” part. Generative AI generates new text; it doesn’t retrieve existing text.
Generative Pre-trained Transformer (GPT)
This, the nerdiest of the terminology here, describes a specific type of LLM developed by OpenAI. “Generative” indicates its ability to create text, “pre-trained” signifies that it has been trained on a large body of text data, and “transformer” refers to the neural-network architecture it’s built on. Knowing what GPT stands for is helpful only so that you understand what the GPT in ChatGPT represents.
ChatGPT
ChatGPT is the software you see; GPT is the model behind it. As noted above, ChatGPT is just one of several Chat AI online software systems with similar functionality.
One more term you’ll encounter frequently, though it’s unfamiliar to many:
Corpus
The dictionary definition of corpus is “a collection of written texts” (though, in fact, it’s not always text). The term refers to what GPTs are trained on: vast corpuses of (mostly) text. We’re told that the largest corpuses contain trillions of words, a scale that’s impossible for mere mortals to comprehend. You probably think of Wikipedia as enormous, containing a vast number of words. Yet there are a mere 4.5 billion words in Wikipedia; GPT-4 was trained on well over a trillion.
I think it’s important to consider this scale. Authors, understandably, are worried that the 75,000 words, plus or minus, in their book might have been sucked into a large language model. Perhaps they have (more below). But assuming this is the case, consider just how little any one book contributes to the magnitude of today’s large language models. It’s truly insignificant. Beyond insignificant. Even 10,000 books is chump change.
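The arithmetic is easy to check. This sketch uses the figures cited above (Wikipedia at roughly 4.5 billion words; a conservative floor of one trillion words for GPT-4’s training corpus, which was reportedly “well over” that) — these are illustrative round numbers, not exact counts:

```python
# Back-of-the-envelope scale comparison, using the figures from the text.
corpus_words = 1e12          # a conservative floor: "well over a trillion"
wikipedia_words = 4.5e9      # ~4.5 billion words in Wikipedia
book_words = 75_000          # a typical book, plus or minus

one_book_share = book_words / corpus_words
ten_thousand_books_share = 10_000 * book_words / corpus_words
wikipedia_share = wikipedia_words / corpus_words

print(f"One book:         {one_book_share:.8%} of the corpus")
print(f"10,000 books:     {ten_thousand_books_share:.3%}")
print(f"All of Wikipedia: {wikipedia_share:.2%}")
```

One book works out to well under a millionth of a percent of the corpus, and even ten thousand books amount to less than a tenth of a percent — which is the sense in which a single title is “beyond insignificant” at this scale.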