Everything You Wanted to Know About AI Agents
Curious about AI agents and how they’re shaping the future of technology? In this blog, we’ll break down everything you need to know, from what AI agents are to how they work, their real-world applications, and what makes them such a powerful tool in today’s digital world.
Why Most Explanations of AI Agents Fail:
Most explanations of AI agents are either too technical or too basic. This post is meant for people like me: you have zero technical background, but you use AI tools regularly, and you want to learn just enough about AI agents to see how they affect you. We'll follow a simple one-two-three learning path, building on concepts you already understand, like chatbots, then moving on to AI workflows, and finally AI agents, all while using examples you'll actually encounter in real life. And believe me when I tell you that those intimidating terms you see everywhere, like RAG or ReAct, are a lot simpler than you think. Let's get started.
Level 1: LLMs:
Kicking things off at level one: large language models. Popular AI chatbots like ChatGPT, Google Gemini, and Claude are applications built on top of large language models (LLMs), and they're fantastic at generating and editing text. Here's a simple visualization: you, the human, provide an input, and the LLM produces an output based on its training data. For example, if I ask ChatGPT to draft an email requesting a coffee chat, my prompt is the input, and the resulting email, which is way more polite than I would ever be in real life, is the output. So far, so good, right? Simple stuff.
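The Level 1 pattern above can be sketched in a few lines of Python. This is a toy illustration, not a real API: `call_llm` is a made-up stand-in for a chatbot service like ChatGPT, and it returns a canned reply so the example runs on its own.

```python
# Level 1 sketch: human input -> LLM -> text output.
# call_llm is a hypothetical stand-in for a real chatbot API;
# here it returns a canned reply so the example is self-contained.

def call_llm(prompt: str) -> str:
    """Pretend LLM: in reality this would call a model API."""
    if "coffee chat" in prompt:
        return ("Hi! I'd love to grab a coffee and hear about your work. "
                "Would next week suit you?")
    return "I can help you draft that."

def draft_email(request: str) -> str:
    # The prompt is the input; the generated email is the output.
    return call_llm(f"Draft a polite email: {request}")

email = draft_email("requesting a coffee chat")
print(email)
```

The shape is the whole point: one input goes in, one output comes back, and nothing happens until you ask.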
LLM Limitations in Real Life:
But what if I asked ChatGPT when my next coffee chat is? Even without seeing the response, both you and I know ChatGPT is going to fail, because it doesn't know that information. It doesn't have access to my calendar. This highlights two key traits of large language models. First, despite being trained on vast amounts of data, they have limited knowledge of proprietary information, like our personal information or internal company data. Second, LLMs are passive: they wait for our prompt and then respond. Keep these two traits in mind moving forward.
Level 2: AI Workflows:
Moving on to level two: AI workflows. Let's build on our example. What if I, a human, told the LLM, "Every time I ask about a personal event, perform a search query and fetch data from my Google Calendar before responding"? With this logic implemented, the next time I ask, "When is my coffee chat with Elon Husky?" I'll get the correct answer, because the LLM will first go into my Google Calendar to find that information.
AI Workflows Can Be Tricky:
But here's where it gets tricky. What if my next follow-up question is, "What will the weather be like that day?" The LLM will now fail to answer the query, because the path we told it to follow is to always search my Google Calendar, which has no information about the weather. This is a fundamental trait of AI workflows: they can only follow predefined paths set by humans. And if you want to get technical, this path is also called the control logic.
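Here is a minimal sketch of that control logic in Python. Everything is hypothetical: the "calendar" is a small dictionary standing in for Google Calendar, and the routing rule is hard-coded by a human, which is exactly why the weather question falls through.

```python
# AI workflow sketch: a human-written rule (the "control logic") says
# every question must go through the calendar first. The calendar is a
# toy in-memory dict standing in for Google Calendar.

CALENDAR = {"coffee chat with Elon Husky": "Friday at 3 pm"}

def search_calendar(question: str) -> str:
    """Return the matching event, or an empty string if nothing matches."""
    for event, when in CALENDAR.items():
        if event.split(" with ")[0] in question.lower():
            return f"Your {event} is {when}."
    return ""

def workflow(question: str) -> str:
    # Predefined path: ALWAYS search the calendar. That rule is also the
    # weakness: the calendar knows nothing about the weather.
    answer = search_calendar(question)
    return answer or "Sorry, my calendar has no answer for that."

print(workflow("When is my coffee chat with Elon Husky?"))
print(workflow("What will the weather be like that day?"))
```

The first call succeeds; the second fails, not because the model is dumb, but because a human hard-wired a path that never checks a weather source.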
Expanding AI Workflows with Multiple Steps:
Pushing my example further, what if I added more steps to the workflow by allowing the LLM to access the weather via an API, and then, just for fun, used a text-to-audio model to speak the answer? "The weather forecast for seeing Elon Husky is sunny with a chance of being a good boy." Here's the thing: no matter how many steps we add, this is still just an AI workflow. Even if there were hundreds or thousands of steps, as long as a human is the decision maker, there is no AI agent involved.
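A longer chain can be sketched the same way. Each function below is a made-up stub for a real tool (calendar lookup, weather API, text-to-audio), and the key detail is that the sequence itself was wired by a human.

```python
# Multi-step AI workflow sketch: more steps, still a predefined path.
# Every step is a hypothetical stub; a human, not a model, chose this
# exact calendar -> weather -> speech sequence.

def get_event_date(question: str) -> str:
    return "Friday"                      # pretend Google Calendar lookup

def get_weather(date: str) -> str:
    return "sunny"                       # pretend weather API call

def speak(text: str) -> str:
    return f"[audio] {text}"             # pretend text-to-audio model

def workflow(question: str) -> str:
    # Step 1 -> step 2 -> step 3, hard-wired. A thousand more steps
    # would not change the category: no LLM ever chooses the path.
    date = get_event_date(question)
    weather = get_weather(date)
    return speak(f"The forecast for {date} is {weather}.")

print(workflow("What will the weather be like for my coffee chat?"))
```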
Understanding RAG in AI Workflows:
Pro tip: retrieval augmented generation, or RAG, is a fancy term that's thrown around a lot. In simple terms, RAG is a process that helps AI models look things up before they answer, like accessing my calendar or the weather service. Essentially, RAG is just a type of AI workflow.
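The "look things up before answering" idea fits in a few lines. This is a toy sketch, not a production RAG system: the knowledge base is two strings, retrieval is simple keyword overlap instead of vector search, and `generate` is a made-up stand-in for an LLM call.

```python
# RAG sketch: (1) retrieve relevant snippets, (2) hand them to the
# model together with the question. All components here are toys.

DOCS = [
    "Calendar: coffee chat with Elon Husky on Friday at 3 pm.",
    "Weather service: Friday will be sunny.",
]

def retrieve(question: str) -> list:
    # Real systems use vector search; keyword overlap shows the idea.
    words = set(question.lower().split())
    return [d for d in DOCS if words & set(d.lower().split())]

def generate(prompt: str) -> str:
    return f"Answer based on: {prompt}"   # hypothetical LLM call

def rag_answer(question: str) -> str:
    context = " ".join(retrieve(question))
    return generate(f"{context} Question: {question}")

print(rag_answer("When is my coffee chat"))
```

The retrieval step is what makes this "augmented": the model answers from fetched context, not just from its training data.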
Real World Example of an AI Workflow:
By the way, I have a free AI toolkit that cuts through the noise and helps you master essential AI tools and workflows; I'll leave a link to it down below. Here's a real-world example. Following Helena Louu's amazing tutorial, I created a simple AI workflow using make.com. First, I'm compiling links to news articles in a Google Sheet. Second, I'm using Perplexity to summarize those news articles. Then, using a prompt I wrote, I'm asking Claude to draft a LinkedIn post and an Instagram post. Finally, I can schedule this workflow to run automatically every day at 8 a.m.
Why This Is Still an AI Workflow:
As you can see, this is an AI workflow because it follows a predefined path set by me: step one, do this; step two, do this; step three, do this; and remember to run daily at 8 a.m. One last thing: if I test this workflow and don't like the final output, say the LinkedIn post isn't funny enough (and I'm naturally hilarious, right?), I'd have to manually go back and rewrite the prompt for Claude. This trial-and-error iteration is currently being done by me, a human. Keep that in mind moving forward.
Level 3: AI Agents:
All right, level three: AI agents. Continuing the make.com example, let's break down what I've been doing so far as the human decision maker. To create social media posts based on news articles, I need to do two things. First, reason, or think about the best approach: compile the news articles, then summarize them, and then write the final posts. Second, take action using tools: find and link those news articles in Google Sheets, use Perplexity for real-time summarization, and then use Claude for copywriting.
How AI Agents Replace the Human Decision Maker:
So here is the most important sentence in this entire post: the one massive change that has to happen for this AI workflow to become an AI agent is for me, the human decision maker, to be replaced by an LLM. In other words, the AI agent must reason: "What's the most efficient way to compile these news articles? Should I copy and paste each article into a Word document? No, it's probably easier to compile links to those articles and then use another tool to fetch the data. Yes, that makes more sense." The AI agent must also act, that is, do things via tools: "Should I use Microsoft Word to compile links? No, inserting links directly into rows is way more efficient. What about Excel? Hmm, the user has already connected their Google account with make.com, so Google Sheets is the better option."
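That shift, an LLM instead of a human picking the next step, can be sketched like this. Everything here is hypothetical: `choose_tool` is a stand-in for a real reasoning call to a model, and the two "tools" are toy functions.

```python
# Level 3 sketch: an LLM (mocked here) replaces the human decision
# maker. choose_tool stands in for a reasoning step where the model
# itself decides which tool to use next.

TOOLS = {
    "sheets": lambda task: f"links compiled in Google Sheets for {task}",
    "word":   lambda task: f"articles pasted into Word for {task}",
}

def choose_tool(task: str) -> str:
    # Pretend LLM reasoning: "inserting links into rows is more
    # efficient, and the user already connected Google Sheets."
    return "sheets"

def agent_step(task: str) -> str:
    tool = choose_tool(task)      # reason: which tool should I use?
    return TOOLS[tool](task)      # act: use it

print(agent_step("daily news roundup"))
```

In a workflow, the human writes `tool = "sheets"` into the program; in an agent, the model makes that call at runtime.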
React Framework and Key Traits of AI Agents:
Pro tip: because of this, the most common configuration for AI agents is the ReAct framework. All AI agents must reason and act, hence ReAct. Sounds simple once we break it down, right? A third key trait of AI agents is their ability to iterate. Remember when I had to manually rewrite the prompt to make the LinkedIn post funnier? I, the human, would probably need to repeat that iterative process a few times to get something I'm happy with. An AI agent can do the same thing autonomously. In our example, the AI agent would autonomously add another LLM to critique its own output: "Okay, I've drafted V1 of a LinkedIn post. How do I make sure it's good? Oh, I know, I'll add another step where an LLM critiques the post based on LinkedIn best practices, and repeat until all the best-practices criteria are met." After a few cycles of that, we have the final output. That was a hypothetical example.
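The draft-critique-repeat loop can be sketched as follows. Both "models" are made-up toy functions: the writer gets funnier on later attempts, and the critic just checks for a joke, standing in for a real best-practices review.

```python
# Agent-style iteration sketch: a writer LLM drafts, a critic LLM
# scores, and the loop repeats until the draft passes or a retry
# limit is hit. Both "models" here are hypothetical toys.

def write_post(topic: str, attempt: int) -> str:
    draft = f"LinkedIn post about {topic} (v{attempt})"
    return draft + " with a joke" if attempt >= 2 else draft

def critique(draft: str) -> bool:
    # Pretend best-practices check: the post must be funny.
    return "joke" in draft

def iterate(topic: str, max_rounds: int = 5) -> str:
    for attempt in range(1, max_rounds + 1):
        draft = write_post(topic, attempt)
        if critique(draft):       # observe the interim result
            return draft          # criteria met: final output
    return draft                  # give up after max_rounds

print(iterate("AI agents"))
```

The retry limit matters: without it, an agent that never satisfies its own critic would loop forever.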
Real-world AI Agent Example:
So let's move on to a real-world AI agent example. Andrew is a preeminent figure in AI, and he created a demo website that illustrates how an AI agent works. I'll link the full video down below. When I search for a keyword like "skier," the AI vision agent in the background first reasons about what a skier looks like: a person on skis going really fast in the snow, for example. Then it acts by looking through clips of video footage, trying to identify what it thinks a skier is, indexing that clip, and returning it to us. Although this might not feel impressive, remember that an AI agent did all of that instead of a human reviewing the footage beforehand, manually identifying the skier, and adding tags like "skier," "mountain," "ski," and "snow." The programming is obviously a lot more technical and complicated than what we see on the front end, but that's the point of the demo: the average user like me wants a simple app that just works, without having to understand what's going on in the back end.
Build Your Own Basic AI Agent:
Speaking of examples, I'm also building my very own basic AI agent using n8n. Let me know in the comments what type of AI agent you'd like me to make a tutorial on next.
Summary of the Three Levels:
To wrap up, here's a simplified visualization of the three levels we covered today. Level one: we provide an input, and the LLM responds with an output. Easy. Level two, AI workflows: we provide an input and tell the LLM to follow a predefined path that may involve retrieving information from external tools. The key trait here is that a human programs the path for the LLM to follow. Level three, AI agents: the agent receives a goal, and the LLM reasons about how best to achieve it, takes action using tools to produce an interim result, observes that result, decides whether further iterations are required, and produces a final output that achieves the initial goal. The key trait here is that the LLM is the decision maker in the workflow.
Next Steps and Closing:
If you found this helpful, you might want to learn how to build a prompts database in Notion next. See you in the next post. In the meantime, have a great one.
Conclusion:
AI agents take automation to the next level by combining reasoning, action, and iteration, replacing the human decision maker in workflows. Unlike basic LLMs or predefined AI workflows, AI agents can independently achieve goals, adapt to new information, and refine their outputs. Understanding these three levels, LLMs, AI workflows, and AI agents, helps you see the real power of AI in practical applications and prepares you to leverage these tools for your own projects.
FAQs:
1. What is an AI agent?
An AI agent is a system that can reason, take action using tools, and iteratively refine outputs to achieve a goal without human intervention.
2. How do AI agents differ from LLMs?
Unlike LLMs, which passively generate responses, AI agents actively make decisions, execute tasks, and iterate autonomously.
3. What is an AI workflow?
An AI workflow follows a predefined path set by humans, using LLMs to fetch or process information step by step.
4. What is retrieval augmented generation (RAG)?
RAG is a workflow process where an AI model looks up external information before responding, which improves the accuracy of its answers.
5. How do AI agents iterate better than humans?
They can automatically review and improve their outputs multiple times using reasoning and feedback loops without manual intervention.
6. Can I build my own AI agent?
Yes. With tools like n8n or make.com, you can create basic AI agents to automate tasks and workflows.
