Home / AI Arena / Building AI Agents / Introduction to AI Agents

Introduction to AI Agents

4 min read ai-arena LangChain LangGraph Python

All code for this tutorial series is on GitHub: github.com/achintmehta/langchain

What problem does LangChain solve?

When you call an LLM directly, whether via the OpenAI API, a local model, or anything else, you get back a string. That is fine for simple one-shot tasks, but most real applications need more than that. You might want to:

Feed the model a document that doesn't fit in its context window, so you need to break it up, embed it, and retrieve only the relevant parts.
Chain several LLM calls together, first classify the question, then route it to a specialist prompt, then format the output.
Let the model decide which tool to call (a web search, a database query, a calculator) and loop until it is satisfied with the result.
Pause execution and ask a human to approve a tool call before it runs.

Doing any of that from scratch means writing a lot of plumbing: message format adapters, retry logic, streaming handlers, token counters, caching layers. LangChain packages all of that plumbing so you can focus on what your application actually does.

LangChain vs LangGraph

LangChain is the foundational library. It gives you:

Unified interfaces for talking to different LLM providers (OpenAI, Anthropic, local models via llama.cpp or LM Studio, all through the same API).
A composable pipeline syntax called LCEL (LangChain Expression Language) that lets you wire prompts, models, and output parsers together with the | operator.
A large ecosystem of integrations: document loaders, text splitters, embedding models, vector databases, tools, and more.

LangGraph sits on top of LangChain and handles the harder problem of control flow. It lets you model your application as a graph, nodes do work (call an LLM, run a tool, check a condition), edges define what happens next, and conditional edges let the LLM decide the path at runtime. This is what you use when your application needs loops, branching, parallel steps, or checkpointing.

If LangChain is the toolkit, LangGraph is the architecture. Most serious LLM applications end up needing both.

Using a local LLM

The examples in the companion repository are written to run against a local LLM server rather than a cloud API. There are two common options:

llama-server (from the llama.cpp project) exposes an OpenAI-compatible HTTP API on port 8080 by default:

llama-server -m /path/to/your/model.gguf --port 8080 --host 0.0.0.0

LM Studio is a GUI application that lets you download and run models, then click "Start Server" to expose the same OpenAI-compatible API on port 1234.

The examples target one or the other. Because both speak the OpenAI protocol, you only need to change the base_url in your code to switch between them, or to switch to the actual OpenAI API by removing the base_url entirely.

Setting up your environment

Python 3.11 or later is recommended. Create a virtual environment and install the dependencies:

python -m venv venv
source venv/bin/activate       # on Windows: venv\Scripts\activate

pip install langchain \
            langchain-openai \
            langchain-community \
            langchain-huggingface \
            langgraph \
            langchain-postgres \
            sentence-transformers

Clone the repository to follow along with the code examples:

git clone https://github.com/achintmehta/langchain.git
cd langchain

Your first LLM call

simple_example.py is the simplest possible program, it creates a ChatOpenAI client pointed at a local server and sends one message:

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-needed",
    model="llama"
)

response = llm.invoke([HumanMessage(content="What is the capital of France?")])
print(response.content)

A few things to notice here. First, the api_key is set to "not-needed", local servers don't authenticate, but the LangChain client requires the parameter to be present. Second, invoke takes a list of messages, not a plain string. This is because modern LLMs are chat models that expect a conversation history, not a raw completion prompt. Third, the response is an AIMessage object; .content gives you the text.

That is really all there is to the basic call. Everything else LangChain provides is building on top of this foundation, composing calls, managing prompts, hooking in retrieval, routing between agents, and so on.

What's next

The rest of this tutorial series goes through each major concept in turn:

Core LangChain Concepts, ChatOpenAI vs OpenAI, LCEL pipelines, and invoke/batch/stream
Chunking Strategies, how to break documents into pieces suitable for retrieval
Embeddings and Vector Databases, turning chunks into vectors and storing them for similarity search
RAG Strategies, patterns for retrieval-augmented generation
Query Transformation, making queries smarter before retrieval
Cognitive Architectures with LangGraph, building agents, multi-agent systems, and human-in-the-loop workflows
Guardrails, validating inputs and outputs to keep applications safe