What tools do I need to build an AI agent?

You need an LLM API key (Anthropic or OpenAI), Python 3.11+, and basic Python knowledge. No machine learning experience required — you're orchestrating API calls, not training models.

Do I need to know machine learning to build an AI agent?

No. Building an AI agent in 2026 doesn't require ML knowledge. You're orchestrating LLM API calls, not training models, so the work is closer to API integration than ML engineering.

How long does it take to build a basic agent?

You can have a working agent in 10 minutes with the 20-line loop shown in this tutorial. A production-ready agent with proper error handling and monitoring takes a few days.

Should I use a framework or build from scratch?

Build from scratch first to understand the loop. Once you understand the fundamentals, frameworks like LangGraph and CrewAI add value for complex multi-agent workflows.

How much does it cost to run an AI agent?

A single agent run with Claude Sonnet costs about ₹30–₹80 ($0.36–$0.96). Production deployment including hosting and inference runs around ₹500/month ($6) for a personal tool.

How to build your first AI agent in 2026 (tutorial)

A practical tutorial on building your first AI agent — choosing the right tools, setting up the agent loop, adding tools, and deploying it.

The Anthropic tool use documentation defines the same agent loop pattern — model, tools, and execution loop — as the foundation of production AI agents. This tutorial follows that exact architecture.

TL;DR: An agent is just an LLM in a loop with tools, and a roughly 20-line Python loop is all you need to get started. This tutorial builds a code review agent from scratch, covering the core loop, tool definitions, system prompts, and deployment. No ML experience required, just API orchestration.

The ReAct paper (Yao et al., 2022) first formalised this LLM-in-a-loop pattern, showing it dramatically outperforms standard prompting on tasks that require external knowledge and multi-step reasoning.

You’ve used ChatGPT. You’ve maybe used Claude or Copilot to help you code. But building an agent — something that takes actions on its own, loops on feedback, and makes decisions — feels like a different skill entirely.

It’s not. An agent is just an LLM in a loop with tools. That’s it. The magic is in the loop design, not the model.

This tutorial walks through building your first agent: a code review agent that reads files, analyzes them, and produces reports. By the end, you’ll have something running on your machine that does real work.

Key takeaways:

An agent is just an LLM in a loop with tools — the magic is in the loop design, not the model

Building from scratch teaches you the fundamentals before you layer on framework abstractions

A working code review agent can be built in roughly 100 lines of Python, much of it tool schemas and the system prompt

The hardest part of production agents isn’t the loop — it’s reliability, cost control, and scope management

What is an agent, really?

Here’s the simplest definition I can give:

An agent = LLM + loop + tools

The LLM makes decisions (what to do next)
The loop keeps it going until some condition is met
The tools let it interact with the world (read files, run commands, call APIs)

A chatbot is an LLM that responds once. An agent is an LLM that keeps going — observing, deciding, acting, and repeating until the job is done.
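To make the chatbot-versus-agent distinction concrete, here is a minimal sketch of the observe, decide, act cycle with a scripted stand-in for the LLM, so it runs without an API key. The names `scripted_model`, `TOOLS`, and `toy_agent` are invented for illustration; the real loop later in this tutorial replaces the scripted function with an API call.

```python
# Toy agent: a scripted "model" stands in for the LLM (an assumption for illustration).

def scripted_model(observations):
    """Decide the next action from what has been observed so far."""
    if not observations:
        return {"action": "add", "args": (2, 3)}  # first turn: ask for a tool
    return {"action": "finish", "answer": f"The sum is {observations[-1]}"}

# The tools the agent is allowed to call
TOOLS = {"add": lambda a, b: a + b}

def toy_agent(model, max_turns=5):
    observations = []
    for _ in range(max_turns):                      # the loop
        decision = model(observations)              # decide
        if decision["action"] == "finish":
            return decision["answer"]
        tool = TOOLS[decision["action"]]
        observations.append(tool(*decision["args"]))  # act, then observe
    return "Max turns reached without completion."

print(toy_agent(scripted_model))  # The sum is 5
```

A model that never finishes hits the `max_turns` cap instead of running forever, which is the same safety valve the real loop uses.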

This is what I call the Vertical Agent Method — build narrow, purpose-built agents that replace one specific workflow, not general-purpose assistants. Our code review agent is a perfect example: it does one thing (review code) and is designed specifically for that workflow. The focus is what makes it reliable.

Choosing your stack

In 2026, you have three main approaches to building agents:

Use an agentic IDE (Claude Code, Cursor) — great for coding tasks, limited customization
Use a framework (LangGraph, CrewAI) — good for complex workflows, but adds abstraction overhead
Build from scratch — full control, minimal dependencies, you understand every line

For this tutorial, we’re building from scratch. Not because frameworks are bad, but because you need to understand the loop before you let a framework manage it for you.

What you’ll need

Python 3.11+
An API key from Anthropic or OpenAI (I’ll use Anthropic’s Claude because, in my experience, it handles tool use well)
Basic Python knowledge

The core agent loop

Here’s the simplest agent loop that actually works:

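from anthropic import Anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = Anthropic()

def run_agent(system_prompt, messages, tools, max_turns=10):
    # Copy so the caller's list is not mutated
    messages = list(messages)

    for turn in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            # The Messages API takes the system prompt as a top-level
            # parameter, not as a message with role "system"
            system=system_prompt,
            messages=messages,
            tools=tools,
        )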

        messages.append({"role": "assistant", "content": response.content})

        # Check if the model wants to use a tool
        tool_uses = [b for b in response.content if b.type == "tool_use"]
        if not tool_uses:
            # No tool calls: join the text blocks into the final response
            return "".join(b.text for b in response.content if b.type == "text")

        # Execute each tool and send all results back in one user message
        tool_results = []
        for tool_use in tool_uses:
            result = execute_tool(tool_use.name, tool_use.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": str(result),
            })
        messages.append({"role": "user", "content": tool_results})

    return "Max turns reached without completion."

That’s the entire loop: about twenty lines of logic. The agent gets a system prompt, a list of messages, and a set of tools. It calls the LLM, checks whether the model requested a tool, executes the tool if so, and feeds the result back. It repeats until the LLM gives a final answer or max_turns runs out.
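It can help to see what the `messages` list looks like after one tool round trip. The sketch below builds that history by hand with plain dictionaries. The tool call ID, the file path, and the helper name `tool_round_trip` are made up for illustration; in the real loop the assistant content comes from `response.content` rather than hand-written dicts.

```python
def tool_round_trip(tool_use_id, name, tool_input, result):
    """Return the two messages that one tool call adds to the history."""
    assistant_msg = {
        "role": "assistant",
        "content": [{
            "type": "tool_use",
            "id": tool_use_id,
            "name": name,
            "input": tool_input,
        }],
    }
    user_msg = {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,  # must match the tool_use id above
            "content": str(result),
        }],
    }
    return [assistant_msg, user_msg]

history = [{"role": "user", "content": "Review the code in ./src"}]
history += tool_round_trip(
    "toolu_example", "read_file", {"path": "./src/app.py"}, "print('hi')"
)

for message in history:
    print(message["role"])
```

The roles alternate user, assistant, user: the tool result travels back to the model as a user message, which is why the loop appends it with `"role": "user"`.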

Adding tools

Tools are just functions with descriptions. The LLM decides which to call based on the description. Here are the tools for our code review agent:

tools = [
    {
        "name": "read_file",
        "description": "Read the contents of a file at the given path",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path to the file"}
            },
            "required": ["path"]
        }
    },
    {
        "name": "list_directory",
        "description": "List files in a directory",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Directory path"}
            },
            "required": ["path"]
        }
    },
    {
        "name": "run_command",
        "description": "Run a shell command and get its output",
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "Command to run"}
            },
            "required": ["command"]
        }
    }
]

And the corresponding execute function:

import os
import subprocess

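def execute_tool(name, args):
    """Dispatch a tool call by name and return its result as a string."""
    if name == "read_file":
        with open(args["path"], "r") as f:
            return f.read()
    elif name == "list_directory":
        return "\n".join(os.listdir(args["path"]))
    elif name == "run_command":
        # Caution: shell=True runs whatever command the model asks for.
        # Only use this in a sandbox or on code you trust.
        result = subprocess.run(
            args["command"], shell=True, capture_output=True, text=True, timeout=30
        )
        return f"STDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}"
    # Tell the model the tool name was not recognized so it can recover
    return f"Unknown tool: {name}"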

The system prompt

The system prompt is where you shape the agent’s behavior. For a code review agent:

system_prompt = """You are a code review agent. Your job is to analyze code and produce a structured review.

For each file you review:
1. Read the file
2. Analyze it for bugs, style issues, and security concerns
3. Note any patterns that could be improved

When you have reviewed all relevant files, produce a final report with:
- Summary of findings (bullet points)
- Critical issues (must fix)
- Warnings (should fix)
- Suggestions (nice to have)

Be thorough but practical. Not every style preference is a bug.
"""

Putting it together

review = run_agent(
    system_prompt=system_prompt,
    messages=[{"role": "user", "content": "Review the code in /path/to/project"}],
    tools=tools,
    max_turns=25,
)
print(review)

When you run this, the agent will typically:

List the directory to understand the project
Read each relevant file
Run linting or type-checking commands
Produce a structured review

What can go wrong

Building agents from scratch means you encounter every edge case personally. Here are the ones that hit me first:

Infinite loops. The agent keeps calling tools without converging. Fix: set max_turns and log tool call counts.
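One way to log tool call counts is to tally calls per tool and stop once any single tool crosses a limit. This is a sketch under assumptions: `ToolCallLog`, `record`, and the default limit of 20 are hypothetical names and values, not part of the loop above.

```python
import logging
from collections import Counter

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

class ToolCallLog:
    """Counts tool calls per tool name and flags runaway loops."""

    def __init__(self, per_tool_limit=20):
        self.counts = Counter()
        self.per_tool_limit = per_tool_limit

    def record(self, tool_name):
        """Record one call; return False once the tool exceeds its limit."""
        self.counts[tool_name] += 1
        log.info("tool=%s calls=%d", tool_name, self.counts[tool_name])
        return self.counts[tool_name] <= self.per_tool_limit

calls = ToolCallLog(per_tool_limit=2)
for name in ["read_file", "read_file", "read_file"]:
    if not calls.record(name):
        print(f"Stopping: {name} called {calls.counts[name]} times")
```

Inside the agent loop, you would call `record(tool_use.name)` before executing each tool and return early when it comes back `False`.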

Blowing through tokens. Reading a 5,000-line file fills the context window fast. Fix: truncate file reads to the first 200 lines, or read specific sections.
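A truncating reader for the fix above might look like the following. The function name `read_file_truncated` and the wording of the truncation marker are assumptions; the 200-line default comes from the suggestion in the text.

```python
import itertools

def read_file_truncated(path, max_lines=200):
    """Return at most max_lines lines, with a marker if the file was cut off."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        lines = list(itertools.islice(f, max_lines))
        # If one more line can be read, the file is longer than the limit
        truncated = next(f, None) is not None
    text = "".join(lines)
    if truncated:
        text += f"[truncated after {max_lines} lines]\n"
    return text
```

The marker matters: without it the model has no way to know it saw only part of the file, and may report on the file as if it were complete.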

Tool call failures. The agent tries to read a file that doesn’t exist. Fix: wrap tool execution in try/except and return a helpful error message.
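The try/except fix can be a thin wrapper around whatever tool dispatcher you use. In this sketch, `safe_execute` and `demo_tool` are hypothetical names and the error wording is only a suggestion; the idea is that the model receives an error string it can react to instead of the run crashing.

```python
def safe_execute(execute, name, args):
    """Run a tool and turn exceptions into text the model can act on."""
    try:
        return execute(name, args)
    except FileNotFoundError as e:
        return f"Error: file not found: {e.filename}. Use list_directory to check the path."
    except KeyError as e:
        return f"Error: missing required argument {e} for tool '{name}'."
    except Exception as e:  # last resort so one bad call does not end the run
        return f"Error: {type(e).__name__}: {e}"

def demo_tool(name, args):
    """Stand-in for execute_tool that only knows how to read a file."""
    with open(args["path"], "r") as f:
        return f.read()

print(safe_execute(demo_tool, "read_file", {"path": "/no/such/file.py"}))
print(safe_execute(demo_tool, "read_file", {}))
```

In the agent loop, this would replace the direct call, for example `safe_execute(execute_tool, tool_use.name, tool_use.input)`.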

Cost surprises. One agent run with Sonnet on a medium project costs about ₹30–₹80. For a demo it’s fine. For production, add cost tracking.

# Simple cost tracker (prices in USD as quoted in this tutorial; check current pricing)
cost_per_input_token = 3e-06    # $0.003 per 1K input tokens (Sonnet)
cost_per_output_token = 15e-06  # $0.015 per 1K output tokens

def track_cost(usage):
    """Return the USD cost of one API call, given response.usage."""
    input_cost = usage.input_tokens * cost_per_input_token
    output_cost = usage.output_tokens * cost_per_output_token
    return input_cost + output_cost
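A per-call cost is most useful when it is summed across turns and checked against a cap. The sketch below assumes the same per-token rates as the tracker above; `CostBudget`, the $0.05 limit, and the `fake_usage` object (a stand-in for `response.usage`) are all hypothetical.

```python
from types import SimpleNamespace

INPUT_COST = 3e-06    # USD per input token, same rate as the tracker above
OUTPUT_COST = 15e-06  # USD per output token

class CostBudget:
    """Accumulates spend across API calls and reports when a cap is exceeded."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def add(self, usage):
        """Add one response's usage; return True while still under budget."""
        self.spent_usd += (
            usage.input_tokens * INPUT_COST + usage.output_tokens * OUTPUT_COST
        )
        return self.spent_usd <= self.limit_usd

budget = CostBudget(limit_usd=0.05)
fake_usage = SimpleNamespace(input_tokens=10_000, output_tokens=1_000)
for turn in range(3):
    under = budget.add(fake_usage)
    print(f"turn {turn}: ${budget.spent_usd:.3f} spent, under budget: {under}")
```

Each simulated turn costs about $0.045, so the second turn already crosses the cap. In the real loop you would call `budget.add(response.usage)` after each API call and stop when it returns `False`.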

Deploying it

For a personal agent, the simplest deployment is a CLI script. But if you want it running as a service:

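# review_server.py: accepts requests and runs reviews
from fastapi import FastAPI
from pydantic import BaseModel

# Assumes the agent code above lives in agent.py (a hypothetical module name)
from agent import run_agent, system_prompt, tools

app = FastAPI()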

class ReviewRequest(BaseModel):
    repo_path: str
    depth: str = "standard"  # quick, standard, thorough

@app.post("/review")
def run_review(req: ReviewRequest):
    result = run_agent(
        system_prompt=system_prompt,
        messages=[{"role": "user", "content": f"Review {req.repo_path}"}],
        tools=tools,
        max_turns=50 if req.depth == "thorough" else 15,
    )
    return {"review": result}

Deploy this on Railway or Fly.io, and you have a code review API for roughly ₹500/month including inference costs at personal-use volumes.
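For the simpler CLI route mentioned at the start of this section, a thin argparse wrapper is enough. In this sketch the flag names and the `build_parser`, `turns_for`, and `main` helpers are hypothetical, and the call to `run_agent` is left as a comment because it depends on the definitions earlier in this tutorial.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(
        description="Run the code review agent on a project"
    )
    parser.add_argument("repo_path", help="Path to the project to review")
    parser.add_argument(
        "--depth",
        choices=["quick", "standard", "thorough"],
        default="standard",
        help="How many turns the agent may take",
    )
    return parser

def turns_for(depth):
    """Mirror the server's rule: 50 turns for thorough, 15 otherwise."""
    return 50 if depth == "thorough" else 15

def main(argv=None):
    args = build_parser().parse_args(argv)
    max_turns = turns_for(args.depth)
    print(f"Reviewing {args.repo_path} with up to {max_turns} turns")
    # With the tutorial's definitions in scope, the review itself would be:
    # review = run_agent(system_prompt,
    #                    [{"role": "user", "content": f"Review {args.repo_path}"}],
    #                    tools, max_turns)
    # print(review)
    return max_turns

# Example invocation with explicit arguments instead of sys.argv
main(["./my_project", "--depth", "thorough"])
```

In a real script you would call `main()` with no arguments so that it reads from the command line.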

What’s next

This agent is basic. It has no memory across sessions, no caching, no parallel tool execution, no streaming. But it works. It reviews real code and produces useful reports.

The step from “prompting an LLM” to “building an agent” is smaller than most developers think. You’re not writing a new paradigm — you’re putting a loop around something you already know how to use.

The hard part is what comes after: making the agent reliable, cost-effective, and actually useful in production. That’s where the next 40 hours go. But the first hour — the one where you write the loop and watch it work — is the most important one. It proves the concept is real.

Try it yourself

Copy the 20-line loop above, pick a tool (even a simple one like read_file), and run it. If you have an Anthropic API key, you'll have a working agent in 10 minutes. The gap between theory and practice in agents is mostly just running the loop once.
