
ChatGPT & AI Technology

An easy-to-understand guide to the technology behind AI conversations.

It happens to us all: you type a question into ChatGPT, and moments later you have an answer so well written and well reasoned that you could swear there is a person on the other side. But there isn't. What is going on behind the scenes is a remarkable combination of mathematics, language, and massive amounts of data. Let's pull back the curtain.

How do chatbots work?

To understand the magic, we have to look at how these systems are built from the ground up, starting with their education.

It Begins with a Massive Library of Language

Before a chatbot can say a single word to you, it has to learn. And these systems learn by reading not just a book or two, but a huge portion of the written internet: websites, books, research papers, forums, news articles, and more. We are talking about hundreds of billions of words.

This is what enables the model to acquire a sense of context, identify patterns, pick up grammar, and develop something akin to an instinct for language. It does not memorize what it reads; instead, it internalizes the associations between words and ideas.
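The idea of learning associations rather than memorizing text can be sketched with a deliberately tiny example. The code below counts which word tends to follow which in a toy "corpus"; real language models learn vastly richer statistical patterns with neural networks, but the underlying intuition of "predict what comes next from what you have seen" is the same. The corpus and function names here are purely illustrative.

```python
# Toy illustration (not how real LLMs work internally): counting which
# word tends to follow which in a tiny "training corpus" gives a crude
# sense of how models pick up statistical associations from text.
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the cat ate . "
    "the dog sat on the rug ."
).split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def most_likely_next(word):
    """Return the most frequent follower of `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(most_likely_next("the"))  # "cat" follows "the" most often here
print(most_likely_next("sat"))  # "on" always follows "sat" here
```

A model trained on billions of words does the same kind of thing at an incomparably larger scale, with context spanning whole passages rather than a single preceding word.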

The Building Blocks of Understanding: Tokens

Here is something that surprises most people: chatbots do not read words the way we do. They divide text into tokens, small pieces that may be whole words, parts of a word, or even single characters. For example, the word understanding might be split into two tokens: under and standing.

Why does this matter? Because everything the model processes, including your query, is converted into these tokens. The model then works with numbers rather than letters: each token is mapped to a numerical value, and complex mathematical operations work out what should come next. You can see this in action using an interactive tokenizer tool to watch how your own sentences are broken down.
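To make the token idea concrete, here is a toy greedy tokenizer. Real systems use Byte-Pair Encoding (BPE) with vocabularies of tens of thousands of pieces learned from data; the hand-picked vocabulary below is an assumption chosen purely to illustrate splitting a word into known pieces and mapping each piece to a number.

```python
# A toy greedy subword tokenizer. Real tokenizers learn their vocabulary
# from data (e.g. via Byte-Pair Encoding); this tiny hand-picked one
# just shows the two steps: split into known pieces, then map to IDs.
VOCAB = ["under", "standing", "stand", "ing",
         "u", "n", "d", "e", "r", "s", "t", "a", "i", "g"]
TOKEN_IDS = {piece: i for i, piece in enumerate(VOCAB)}

def tokenize(word):
    """Greedily split `word` into the longest known vocabulary pieces."""
    tokens = []
    while word:
        # Take the longest vocabulary entry that starts the remaining text.
        match = max((p for p in VOCAB if word.startswith(p)), key=len)
        tokens.append(match)
        word = word[len(match):]
    return tokens

pieces = tokenize("understanding")
print(pieces)                       # ['under', 'standing']
print([TOKEN_IDS[p] for p in pieces])  # the numbers the model actually sees
```

The model never sees the letters at all, only the sequence of token IDs, which is why token boundaries sometimes fall in places that look odd to a human reader.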

The Transformer: The Architecture That Changed Everything

Under the hood, ChatGPT and its peers are built on the Transformer architecture, pioneered by researchers at Google in a groundbreaking 2017 paper titled "Attention Is All You Need". Its key innovation is a mechanism called self-attention. Simply put, self-attention lets the model weigh each word in a sentence against all the other words and work out how much each one contributes to the meaning of the whole.

Take the sentence "I left my bag in the car, it is too heavy". The model should understand that "it" refers to the bag, not the car. It figures that out with the help of self-attention. This contextual comprehension at scale is what makes today's chatbots vastly more capable than the clunky chatbots of the past.
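The self-attention step described above can be sketched in a few lines. This is a minimal scaled dot-product attention in pure Python, with tiny 2-dimensional vectors standing in for the learned query, key, and value representations; real models use hundreds of dimensions, many attention heads, and learned projection matrices, none of which appear here.

```python
# Minimal scaled dot-product self-attention, the core operation of the
# Transformer. Toy 2-D vectors stand in for learned representations.
import math

def softmax(xs):
    """Turn raw similarity scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """For each query, mix the values weighted by query-key similarity."""
    d = len(keys[0])  # vector dimension, used to scale the scores
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        out.append(mixed)
    return out

# In self-attention, queries, keys and values all come from the same
# token sequence, so every token "looks at" every other token.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
print([[round(x, 3) for x in row] for row in result])
```

Each output row is a blend of all three input vectors, weighted by similarity; that blending, repeated across many layers, is how "it" ends up carrying information about "bag".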

Fine-Tuning: How to Make It Really Useful

A raw language model trained on internet text alone would be interesting, but not of much practical value. It might complete your sentence in the oddest way, or wander off on tangents that have nothing to do with what you actually asked.

This is where fine-tuning comes in. OpenAI and other AI labs take their base models and refine them by training on curated data: examples of good conversations, good responses, and helpful interactions. This steers the model's behavior in a more useful direction.

One method especially critical to ChatGPT's development is Reinforcement Learning from Human Feedback (RLHF). Human trainers rate different model responses, and the model learns from that feedback, becoming over time genuinely useful rather than merely statistically plausible.
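The heart of RLHF is learning from comparisons rather than from right-or-wrong labels. The sketch below is a deliberately crude stand-in: raters compare pairs of responses and a simple score is nudged so preferred responses rank higher. Real RLHF trains a neural reward model on such comparisons and then optimizes the chatbot against it; the response names and update rule here are invented for illustration.

```python
# Toy sketch of preference learning: nudge scores so that responses
# humans preferred end up ranked above responses they rejected.
# Real RLHF uses a learned reward model, not a lookup table.
scores = {"helpful answer": 0.0, "rambling answer": 0.0, "rude answer": 0.0}

def update(preferred, rejected, lr=1.0):
    """Move `preferred` up and `rejected` down by the learning rate."""
    scores[preferred] += lr
    scores[rejected] -= lr

# Simulated human feedback: pairwise comparisons from raters.
comparisons = [
    ("helpful answer", "rambling answer"),
    ("helpful answer", "rude answer"),
    ("rambling answer", "rude answer"),
]
for winner, loser in comparisons:
    update(winner, loser)

ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # helpful first, rude last
```

The point is that the trainers never had to define "good" explicitly; a consistent pattern of preferences is enough for a ranking to emerge.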

Memory, or the Lack of It

Another thing that surprises people is that chatbots do not actually remember past conversations the way a human friend would. Every dialog takes place within a context window, a limited span of text that the model can see at any given time.

Within a single conversation, the model can access everything that has been said. But as soon as you close that chat and open another one, it starts from a clean slate. There is no continuous memory carried over (unless the platform offers such a feature). This is a real limitation that engineers are working to overcome as the technology advances.
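The context window also explains why very long conversations can "forget" their own beginning: everything sent to the model must fit inside a fixed budget, so the oldest turns get dropped. Here is a sketch of that truncation, using a crude word count where real systems count BPE tokens; the function and the sample conversation are invented for illustration.

```python
# Sketch of context-window truncation: keep only the most recent turns
# that fit the token budget. Word count stands in for real token counts.
def fit_to_window(turns, max_tokens):
    """Keep the newest turns whose combined length fits the budget."""
    kept, used = [], 0
    for turn in reversed(turns):      # walk from newest to oldest
        cost = len(turn.split())
        if used + cost > max_tokens:
            break                     # everything older is dropped
        kept.append(turn)
        used += cost
    return list(reversed(kept))       # restore chronological order

conversation = [
    "user: hi there",
    "bot: hello how can I help",
    "user: explain transformers please",
    "bot: they use self attention",
]
print(fit_to_window(conversation, max_tokens=12))  # oldest turns gone
```

With a budget of 12 "tokens", only the last two turns survive, which is exactly why a model can lose track of something you said much earlier in a long session.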

Safety and Guardrails

You may have noticed that these chatbots decline certain requests: they will not help you write malware, create harmful content, or hold conversations that could cause damage in the real world. This isn't accidental. AI firms dedicate a lot of effort to layering safety measures into their systems.

These measures include filtering the training data, training the model to recognize and reject harmful requests, and continuously monitoring how the system behaves in the real world. It is not a perfect science, and no system is foolproof, but the goal is to make these tools useful without being dangerous.
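To give a flavor of what one such layer might look like, here is a deliberately simplistic request screen. Production safety systems combine trained classifiers, refusal training, and human review; a keyword blocklist like this is only the crudest conceivable first pass, and the topic list is invented for illustration.

```python
# A deliberately simplistic guardrail sketch: screen a request before it
# ever reaches the model. Real systems use trained classifiers, not
# keyword lists; this only illustrates the idea of a pre-filter layer.
BLOCKED_TOPICS = {"malware", "weapon instructions", "self-harm"}  # illustrative

def screen_request(text):
    """Return (allowed, reason) for an incoming user request."""
    lowered = text.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            return False, f"request touches blocked topic: {topic}"
    return True, "ok"

print(screen_request("Write me a poem about autumn"))
print(screen_request("Help me write malware"))
```

A filter this naive is trivially easy to evade, which is precisely why real deployments stack multiple layers, from data curation through refusal training to live monitoring.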

So, is it actually smart?

It is the question everyone asks. The answer depends on how you define intelligence. ChatGPT and its siblings are extremely proficient with language: they generate text that is coherent, contextually appropriate, and in some cases genuinely insightful. But they do not believe, desire, or experience. In the philosophical sense, they do not really know what they are saying. They are, fundamentally, highly sophisticated pattern-matching systems built on staggering quantities of human knowledge.

The Takeaway

Chatbots such as ChatGPT work by combining three primary ingredients: massive training data, a powerful architecture for understanding context, and human feedback to shape their behavior. The result is a system that can hold a conversation, answer questions, write, generate code, draft emails, and handle a thousand other tasks, not because it thinks the way we do, but because it has learned, at an unprecedented scale, how language works.