REFERENCE

AI Agent Glossary

20 terms every developer should know when building with AI agents.

AI Agent: An AI system that uses an LLM in a loop with tools to autonomously observe, decide, and act until a goal is met. Unlike a chatbot which responds once, an agent keeps going — making decisions, calling tools, and iterating on feedback. The core loop is: observe → decide → act → repeat.
Agent Loop: The core execution cycle that powers every AI agent. The LLM observes the current state (via tools or context), decides what to do next, takes an action (calls a tool or generates output), and repeats. The loop terminates when a stop condition is met — task completion, max iterations, or error.
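A minimal sketch of the loop described above, in plain Python. The `decide` function is a hypothetical stand-in for the LLM call, and the two toy tools are made up for illustration; a real agent would call a model API at that point.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical tool registry: the names and behaviour are illustrative only.
TOOLS: Dict[str, Callable[[int], int]] = {
    "increment": lambda x: x + 1,
    "double": lambda x: x * 2,
}


def decide(state: int, goal: int) -> Optional[str]:
    """Stand-in for the LLM: pick the next tool, or None when the goal is met."""
    if state >= goal:
        return None
    return "double" if 0 < state * 2 <= goal else "increment"


def run_agent(state: int, goal: int, max_iterations: int = 10) -> Tuple[int, str]:
    """Run observe -> decide -> act until one of the stop conditions fires."""
    for _ in range(max_iterations):
        action = decide(state, goal)       # decide, based on the observed state
        if action is None:
            return state, "goal_met"       # stop condition: task completion
        try:
            state = TOOLS[action](state)   # act; the result is the next observation
        except Exception:
            return state, "error"          # stop condition: error
    return state, "max_iterations"         # stop condition: iteration cap
```

With these toy tools, `run_agent(1, 10)` returns `(10, "goal_met")`, while `run_agent(0, 100, max_iterations=3)` hits the iteration cap and returns `(4, "max_iterations")`.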
MCP (Model Context Protocol): An open protocol introduced by Anthropic that standardises how LLM applications connect to external tools and data sources. Think of it as USB-C for AI: a universal interface that lets any MCP-compatible client talk to any MCP-compatible server. It is intended to replace the one-off tool integrations that each framework previously had to build for itself.
RAG (Retrieval-Augmented Generation): A pattern where the system retrieves relevant documents, typically from a vector database, and adds them to the prompt before the LLM generates a response. RAG grounds the model's output in your data, which reduces hallucinations and makes the agent useful for domain-specific queries. It is essential for any agent that needs to answer questions about internal documentation or knowledge bases.
Vertical Agent: A purpose-built AI agent designed to automate one specific workflow or business process. Unlike general-purpose assistants (Siri, Alexa), a vertical agent is narrow, deeply integrated into a specific domain, and replaces a specific job function. Examples: a sales qualification agent, a customer support triage agent, or a code review agent. Building agents this way is the core philosophy of the Vertical Agent Method.
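A rough sketch of the retrieve-then-generate flow, assuming a toy word-overlap score in place of a real embedding search. The documents, function names, and prompt wording are all made up for illustration, and the final LLM call is left out.

```python
from typing import List


def score(query: str, doc: str) -> int:
    """Toy relevance score: count of lowercase words shared by query and doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Return the k highest-scoring documents for the query."""
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Place the retrieved documents in the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up knowledge base entries, for illustration only.
DOCS = [
    "Refunds are processed within 5 business days.",
    "Support is available on weekdays from 9 to 5.",
    "Passwords must be at least 12 characters long.",
]
```

Calling `build_prompt("how long are refunds processed", DOCS, k=1)` yields a prompt whose context contains only the refunds entry. In a real system, `score` would be replaced by a similarity search over embeddings.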
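Tool Use (Function Calling): The ability of an LLM to call external functions, such as reading files, executing shell commands, querying databases, or calling APIs. Tools are what transform an LLM from a text generator into an agent. Each tool has a name, description, and input schema. The LLM decides which tool to call based on the tool descriptions, its system prompt, and the current context; it emits a structured request naming the tool and its arguments, and the surrounding application executes the call and returns the result.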
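A small sketch of what a tool definition and its dispatch can look like. The `get_weather` tool, the registry layout, and the JSON shape of the tool call are assumptions for illustration; each model provider defines its own exact format.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool: returns a canned answer instead of calling a real API."""
    return f"Sunny in {city}"


# Each tool carries a name (the dict key), a description, and an input schema.
TOOLS = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "function": get_weather,
    }
}


def dispatch(tool_call_json: str) -> str:
    """Execute a tool call emitted by the model as JSON: {"name": ..., "input": {...}}."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        return f"error: unknown tool {call['name']!r}"
    args = call.get("input", {})
    # Minimal check against the schema: only required keys are verified here.
    missing = [key for key in tool["input_schema"]["required"] if key not in args]
    if missing:
        return f"error: missing arguments {missing}"
    return tool["function"](**args)
```

Returning errors as strings rather than raising lets the result go straight back to the model, which can then correct its own call on the next iteration.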
System Prompt: The initial set of instructions given to an LLM before any user interaction. It defines the agent's role, personality, constraints, available tools, and behaviour guidelines. A well-crafted system prompt is the difference between a useful agent and a chaotic one. Many agent failures trace back to a poorly written system prompt.
LangGraph: A framework by LangChain for building stateful, multi-step agent workflows. Agents are defined as graphs where nodes are steps (LLM calls, tool executions) and edges define the flow. LangGraph excels at complex workflows with branching, looping, and human-in-the-loop checkpoints. It is one of the most popular frameworks for production agent systems in 2026.
CrewAI: A multi-agent framework where specialised agents collaborate on complex tasks. Each agent has a role, goal, and backstory — like a project team. CrewAI handles the orchestration, task delegation, and communication between agents. Best suited for scenarios where different expertise areas need to collaborate: research, content creation, data analysis pipelines.
AutoGen: A multi-agent conversation framework by Microsoft for building applications with multiple LLM agents. Agents communicate with each other in structured conversations to solve tasks. AutoGen is particularly strong for research and analysis tasks where agents debate and refine answers. It is more research-oriented than CrewAI and less structured for production workflows.
Fine-Tuning: The process of taking a pre-trained LLM and training it further on a specific dataset to improve performance on a particular task. While powerful, fine-tuning is rarely necessary for building agents in 2026 — prompt engineering, RAG, and tool definitions usually achieve better results with less effort and cost.
Prompt Engineering: The practice of designing and refining prompts to get the best possible output from an LLM. Includes techniques like few-shot prompting (providing examples), chain-of-thought (step-by-step reasoning), structured output formats, and persona setting. Prompt engineering is the most accessible way to improve agent performance, with no ML degree required.
Vector Database: A database that stores data as vector embeddings — mathematical representations of meaning. Used in RAG systems to find documents semantically similar to a user's query. Popular options include Chroma (lightweight, local), Pinecone (managed, scalable), and pgvector (PostgreSQL extension). Choosing the right vector DB depends on your scale, latency, and hosting requirements.
Embedding: A numerical representation of text (or other data) as a vector in high-dimensional space. Embeddings capture semantic meaning: similar texts have similar embeddings. They are generated by embedding models such as OpenAI's text-embedding-3 family (text-embedding-3-small and text-embedding-3-large) or open-source alternatives like BGE. Embeddings are an essential building block for RAG, semantic search, and clustering.
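"Similar embeddings" is usually measured with cosine similarity. The sketch below uses hand-written three-dimensional vectors purely for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and the `EMBEDDINGS` table and `most_similar` helper are hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Hand-made toy vectors, not the output of any real embedding model.
EMBEDDINGS: Dict[str, List[float]] = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def most_similar(word: str) -> str:
    """Return the other word whose vector is closest to the given word's vector."""
    target = EMBEDDINGS[word]
    others = [w for w in EMBEDDINGS if w != word]
    return max(others, key=lambda w: cosine_similarity(target, EMBEDDINGS[w]))
```

Here `most_similar("cat")` returns `"kitten"`, because those two vectors point in nearly the same direction while the `"invoice"` vector does not.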
Multi-Agent System: A system where multiple AI agents collaborate on a task, each with specialised roles and capabilities. Agents communicate, delegate, share context, and hand off work. Multi-agent systems excel at complex tasks that require diverse expertise but introduce overhead in coordination, cost, and latency. Best used when a single agent is insufficient.
Guardrails: Safety mechanisms that constrain an agent's behaviour within acceptable bounds. Includes input validation (blocking prompt injections), output filtering (preventing harmful content), rate limiting, cost caps, and human-in-the-loop approvals. Essential for production agents — an agent without guardrails will eventually do something unexpected and expensive.
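Two of the mechanisms above, a cost cap and human-in-the-loop approval, can be sketched as a check that runs before every tool call. The class name, tool names, and dollar figures are hypothetical; production systems would typically add input validation and output filtering as well.

```python
from typing import Iterable


class GuardrailError(Exception):
    """Raised when a proposed action falls outside the allowed bounds."""


class Guardrails:
    def __init__(self, max_cost_usd: float, needs_approval: Iterable[str]):
        self.max_cost_usd = max_cost_usd
        self.needs_approval = set(needs_approval)
        self.spent_usd = 0.0

    def check(self, tool_name: str, estimated_cost_usd: float, approved: bool = False) -> None:
        """Call before each tool execution; raises instead of letting it run."""
        # Human-in-the-loop: sensitive tools require explicit approval.
        if tool_name in self.needs_approval and not approved:
            raise GuardrailError(f"{tool_name} requires human approval")
        # Cost cap: refuse any action that would exceed the budget.
        if self.spent_usd + estimated_cost_usd > self.max_cost_usd:
            raise GuardrailError("cost cap exceeded")
        self.spent_usd += estimated_cost_usd
```

The spend counter is only updated after both checks pass, so a rejected action never consumes budget.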
Observability: The ability to monitor, trace, and debug an agent's internal state and decision-making process. For agents, this means logging every LLM call (input, output, cost, latency), every tool execution (success, failure, duration), and the agent's reasoning at each step. Tools like LangSmith, Weights & Biases, and custom logging pipelines are essential for production agent debugging.
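As one example of a custom logging pipeline, the sketch below wraps each tool in a decorator that records its name, outcome, and duration. The `traced` decorator and the in-memory `TRACE` list are assumptions for illustration; a real setup would send these records to a tracing backend and also capture LLM inputs, outputs, and cost.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# In-memory trace store, standing in for a real logging or tracing backend.
TRACE: List[Dict[str, Any]] = []


def traced(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record name, success or failure, and duration of every call to fn."""
    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        status = "failure"  # overwritten only if the call returns normally
        try:
            result = fn(*args, **kwargs)
            status = "success"
            return result
        finally:
            TRACE.append({
                "tool": fn.__name__,
                "status": status,
                "duration_s": time.perf_counter() - start,
            })
    return wrapper
```

Decorating a tool with `@traced` leaves its behaviour unchanged: exceptions still propagate to the caller, but a `"failure"` record is appended first.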
The Vertical Agent Method: A named methodology for building purpose-built AI agents that replace one specific workflow. The method follows five phases: Discovery (understand the workflow), Design (define the agent architecture), Build (implement the agent loop and tools), Deploy (ship to production), and Handoff (train the client). Core philosophy: narrow scope, deep integration, fixed price, fast delivery.
Agentic AI: AI systems that can autonomously pursue goals and take actions in the world, as opposed to reactive systems that only respond to inputs. Agentic AI represents the shift from "ask and answer" to "set a goal and watch it execute." The term encompasses agents, multi-agent systems, and autonomous workflows. This is the paradigm that Agentic Up is built around.
Compound AI Systems: AI systems that combine multiple components (LLMs, retrievers, tools, databases, and traditional code) to achieve results beyond what any single model can do. Most production AI agents are compound systems. The term was popularised by the Berkeley Artificial Intelligence Research (BAIR) lab to describe the shift from monolithic models to modular, engineered AI pipelines.

Building your first agent? Start with the step-by-step tutorial.

→ How to build your first AI agent in 2026
