Demystifying AI: Understanding the Future of Technology
This article explores the evolution of artificial intelligence, breaking down its stages from simple rule-based systems to advanced reasoning systems like ChatGPT. It explains how transformer models work and highlights the practical advantages of AI, encouraging readers to embrace these tools to enhance productivity and problem-solving in everyday tasks.
Logan Whiteley
7/27/2024 · 7 min read


Unless you have been living under a rock for the last couple of years, you have most likely heard the term artificial intelligence thrown around everywhere you look. Since OpenAI launched ChatGPT in November of 2022, talk of AI has been nearly inescapable. Everyone seems to have their own idea of what AI truly is and how it will affect our future. Some people picture Terminator-style T-800s, while others dream of Iron Man's Jarvis acting as their personal sidekick. With the industry changing and innovation happening on a daily basis, it's hard to get quality, up-to-date information on what is really going on in the AI space. There is a lot of misinformation about AI and its current capabilities. Allow me to be your guide through this whirlwind of technological innovation, because I believe it will prove quite fruitful to those who learn it early.
Let’s begin with the current state of AI. To demystify AI’s current capabilities, it’s important to understand the different stages of AI capability. The current literature breaks AI down into seven stages:
Rule-Based Systems: These AI systems are extremely simple and have been around for a very long time. They operate on a fixed set of rules and can only complete one type of task. A couple of examples would be a simple chatbot on your favorite website, or chess bots of varying skill levels.
Context-Aware and Retentive Systems: These are your more basic machine learning algorithms. They take data, analyze patterns, learn from them, and slowly improve at one task over time. A good example would be the spam filter on your email account.
Domain-Specific Expert Systems: These are complex systems designed to perform one task with high proficiency, demonstrating deep understanding of the domain they are built for. A couple of good examples are diagnostic tools in healthcare and autonomous vehicles.
Reasoning Systems: AI that doesn’t just respond to a set of commands but can also understand the context of a conversation. These models infer meaning from what they’re told and generate human language that sounds very natural. They can understand what you mean even if you don’t say it 100% correctly. If you tell them you need a coat, they know that it will be cold, and if you ask them to summarize an article, they can tell you the most important bits. The best example of this is ChatGPT.
Artificial General Intelligence (AGI): This is the stage where AI models finally start exhibiting human-like behavior. The AI will be able to understand, learn, and interact in natural ways, and will be able to perform any task you can think of with extreme proficiency. This is also the stage where some believe the AI could be considered self-aware or sentient.
Artificial Super Intelligence (ASI): This is the stage we commonly see in science fiction. Here, AI is so far beyond human intellect that it can solve problems humans can’t even conceive of. It is predicted that these systems would drive technological advancement at rates never seen before. This is also where many AI naysayers think we would begin losing control of these systems, since we could no longer comprehend what they are doing.
Singularity and Transcendence: This is the hypothetical final stage of AI evolution, where technology snowballs so far so fast that massive changes to humanity as we know it become inevitable. People argue about what the outcome would be, with predictions ranging from a perfected society to the complete annihilation of humanity.
Spooky, right? Let me help temper some of those feelings. We are currently sitting at stage 4 with frontier models like OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet. Even the most optimistic estimates put AGI at least a few years away. When I first heard this, my thought was: ok great, so I basically only have a few more years until robots take over my job. What helped tame these pessimistic thoughts was really understanding the basics of how these models work and what advantages they can provide me on a day-to-day basis.
So how do these models work? The explosion of these Large Language Models (LLMs) happened after Google researchers introduced the transformer architecture in 2017. This method takes data as input and breaks it down into smaller pieces called tokens, which are represented by numbers. The model then takes these tokens and learns how they relate to each other through immense repetition.
To simplify, let's use two sentences as an example:
The cat can jump really high.
To be agile one must be able to jump really well.
Here’s how the transformer model processes these sentences:
Tokenization: The model breaks down the sentences into tokens, which can be words or subwords. For simplicity, let's assign a number to each word:
Sentence 1: The(1), cat(2), can(3), jump(4), really(5), high(6)
Sentence 2: To(7), be(8), agile(9), one(10), must(11), be(12), able(13), to(14), jump(15), really(16), well(17)
Attention Mechanism: The transformer uses an attention mechanism to focus on important words and their relationships. It recognizes that "jump" appears in both sentences and starts making connections:
It sees that "cat" (2) and "high" (6) in sentence 1 are related to "jump" (4).
It also sees that "agile" (9) and "well" (17) in sentence 2 are related to "jump" (15).
Contextual Understanding: The model understands that "jump" in both sentences refers to the same action. It then relates "cat" to "agile" and "high" to "well".
Generating New Sentences: Using these relationships, the model can generate new, coherent sentences. If you ask it to describe the cat, it might combine the related concepts and say:
"The cat is agile and can jump high."
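To make the walkthrough above concrete, here is a small Python sketch of the same idea. This is a deliberately toy illustration, not how a real transformer is implemented: word-level IDs stand in for subword tokens, and simple sentence co-occurrence stands in for learned attention weights.

```python
# Toy sketch of the walkthrough above. Step 1 mirrors the numbering in
# the example (each word gets the next integer ID, counting up across
# both sentences). Steps 2-3 stand in for attention with co-occurrence:
# words that share a sentence with "jump" are treated as related to it.
from collections import defaultdict

sentences = [
    "The cat can jump really high",
    "To be agile one must be able to jump really well",
]

# Step 1: tokenization -- assign each word the next integer ID.
tokens = []
next_id = 1
for sentence in sentences:
    for word in sentence.split():
        tokens.append((word, next_id))
        next_id += 1
print(" ".join(f"{w}({i})" for w, i in tokens))
# The(1) cat(2) can(3) jump(4) ... jump(15) really(16) well(17)

# Steps 2-3: relationship building -- words in the same sentence become
# associated, so "jump" links "cat"/"high" with "agile"/"well".
related = defaultdict(set)
for sentence in sentences:
    words = [w.lower() for w in sentence.split()]
    for w in words:
        related[w].update(x for x in words if x != w)

print(sorted(related["jump"]))  # includes "cat", "high", "agile", "well"
```

A real model does this with billions of learned weights over vectors rather than hard-coded sets, but the intuition of building relationships between tokens is the same.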
In essence, transformer models ingest vast amounts of data and use the attention mechanism to identify relationships between tokens. By learning these patterns, they can generate new text that makes sense based on the context they’ve learned. Despite their impressive capabilities, transformer models are not magic; they work through a process of trial and error, refining their understanding over countless iterations. On top of all that, LLMs can only follow very specific instructions. These models have no free will or autonomy. They require a driver (you) to give them direction and purpose.
If this doesn’t help calm the nerves slightly, allow me to tell you how this can be a huge advantage to you. Without a doubt, the earlier you start becoming proficient with AI, the better off you will be. First off, you are seeing the power of LLMs like ChatGPT while reading this article. I came up with the example above and wrote it down to the best of my ability. I admittedly thought that first iteration was pretty good, but things got a little messy with organization and clarity, specifically when referring to the word-and-number pairs. I knew this could be confusing, so I put my draft into ChatGPT and asked it to improve it in any way it saw fit. The result is what you read above. At the end of this post, I will paste the original explanation so that you can see the difference.
Let’s say you are at work and you have a process that you absolutely hate doing. It’s not hard, but it is a boring and repetitive Excel routine. You know there has to be a faster way to do it, but you don’t have the technical skills to automate the task or simply don’t know where to begin. LLMs have you more than covered. All you need to do is open your LLM of choice, type out your process in plain English as descriptively as you can manage, and be amazed as it spits out a complete script in seconds. You don’t know how to run the code? No problem; just tell the AI that, and it will give you instructions as detailed as you need to get everything working. The AI finishes guiding you through the setup, you run the code for the first time, and it’s a total success. The dopamine you get from doing this is on another level. You have made a task you used to dread completely trivial, and your boss is impressed by your newly increased productivity. This is the power of AI, and this is why you need to start now. You live in a world where, as long as you can describe your problem and type in a chat window, there is a good chance AI can help you solve it.
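To give a flavor of what that looks like, here is a hypothetical example of the kind of script an LLM might generate from a plain-English description of a repetitive spreadsheet chore: merging a folder of monthly CSV reports into one file and totaling an amount column. The file pattern and column name here are invented for illustration, not from any real workflow.

```python
# Hypothetical LLM-generated helper: merge every CSV report matching a
# pattern into one combined file and return the total of the amount
# column. The pattern and "Amount" column name are made-up examples.
import csv
import glob

def combine_reports(pattern, out_path, amount_col="Amount"):
    """Merge all CSVs matching `pattern` into `out_path`; return the
    sum of `amount_col` across every row."""
    rows, total = [], 0.0
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                rows.append(row)
                total += float(row[amount_col])
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    return total

# Usage: total = combine_reports("reports/*.csv", "combined.csv")
```

A task that might take an hour of copy-pasting every month becomes one command, and if anything in the script confuses you, you can paste the error message or the code right back into the chat and ask for an explanation.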
The craziest part about all of this is that most AI tools are free to varying degrees and accessible to everyone. This means there really is no excuse. Be the person who takes the initiative; you are almost guaranteed to benefit from the outcome. Anyone who has spent any amount of time in the corporate world knows there are A LOT of inefficient processes. Everyone suspects there is a better way to do them, but nothing changes, because that’s how the task has always been done. Those are process improvements ripe for the picking, and AI will be the tool that helps you do it. If you don’t, I can assure you someone else will, and they will gobble up all of the low-hanging fruit.
Whiteley World Blog,
Logan Whiteley
P.S. Below is the original description I wrote to explain how LLMs work. It’s not bad, but it is definitely a little bit confusing and lacks polish. Instead of writing and rewriting this section, a quick ChatGPT prompt made both my life and yours a little bit easier.
The explosion of these Large Language Models (LLMs) happened after the invention of the transformer architecture by Google. This method takes data as an input and breaks it down into smaller pieces called tokens that are represented by a numbers. The model then take these tokens and learn how they relate to each other through immense repetition. To simplify it further, let’s say you gave the model two seemingly unrelated sentences:
The cat can jump really high.
To be agile one must be able to jump really well.
In each sentence the model will assign a each set of characters a number. To keep it simple will assign a number to each word, counting up from one. The model will see that the words cat(2), jump(4), and high(6) in sentence one go together and then see that agile(9), jump(15), and well(17) in sentence two also go together. From there it then will start to look for relationships between the two sentences. Since the word jump appears in both sentences, so it can now start making connections. The model then thinks if jump(4) and jump(15) are the same, then maybe you can make a group with cat(2), agile(9), jump(4), and high(6). So when you ask the model to describe the cat. It is able to say a new third sentence:
The cat is agile and can jump high.
Now this is very simplified and there is definitely more to it than just that, but in essence that is really all the model is doing. Its ingesting incomprehensible amounts of data, and then performing this routine on all of it until it has relations between everything that it has learned.