LangGraph Deep Dive: State Machines, Tools, and Human-in-the-Loop
LangGraph solves a specific problem that other agent frameworks don't: cycles. Most orchestration tools are DAGs (directed acyclic graphs). They flow in one direction. But real agents need loops. They need to call an LLM, check the result, maybe call a tool, check again, decide if more information is needed, loop back, and eventually exit.
That loop is the hard part. LangGraph makes it straightforward by representing your agent as a state machine. Nodes do work. Edges define transitions. State tracks everything the agent knows. Conditional edges let the agent decide where to go next based on what it's learned.
This guide teaches LangGraph from first principles. We'll start with the core concepts, build increasingly complex examples, and finish with a complete research agent that searches the web, evaluates its findings, and iterates until it has enough information to answer. Working code at every step.
Core Concepts: State, Nodes, Edges
LangGraph models agent workflows as graphs. Three components:
State: A dictionary that holds everything the agent knows. Every node reads from and writes to this shared state.
Nodes: Functions that do work. Call an LLM, execute a tool, process data. Each node receives the current state and returns updates.
Edges: Connections between nodes. They define which node runs next. Edges can be unconditional (always follow this path) or conditional (decide based on state).
Let's see the simplest possible example:
from langgraph.graph import StateGraph, START, END
from typing import TypedDict
# 1. Define state schema
class AgentState(TypedDict):
    messages: list
# 2. Define a node (just a function)
def greet(state: AgentState) -> dict:
    return {"messages": state["messages"] + ["Hello!"]}
# 3. Build the graph
graph = StateGraph(AgentState)
graph.add_node("greet", greet)
graph.add_edge(START, "greet")
graph.add_edge("greet", END)
# 4. Compile and run
app = graph.compile()
result = app.invoke({"messages": []})
print(result)
# {'messages': ['Hello!']}
That's the complete structure. Everything else builds on this pattern.
Understanding State
State is the memory of your agent. It persists across nodes and gets updated as the agent works. You define its shape with TypedDict:
from typing import TypedDict, List, Optional
class ResearchState(TypedDict):
    query: str                  # User's question
    search_queries: List[str]   # Generated search terms
    search_results: List[dict]  # Raw results from web
    sources: List[str]          # Cited URLs
    answer: Optional[str]       # Final response
    iteration: int              # Loop counter
Every field you need to track goes here. Nodes read what they need, compute something, and return updates.
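One detail worth making concrete: a node returns only the keys it changed, and LangGraph merges that partial update into the full state. A sketch of the merge, using a hypothetical `bump_iteration` node and plain dicts (no LangGraph required):

```python
# Hypothetical node: reads what it needs, returns only the key it changed.
def bump_iteration(state: dict) -> dict:
    return {"iteration": state["iteration"] + 1}

# LangGraph merges the partial update into the full state, roughly like:
state = {"query": "What is LangGraph?", "iteration": 0, "answer": None}
state = {**state, **bump_iteration(state)}
print(state["iteration"])  # 1 - the other keys are untouched
```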
State Updates: Overwrite vs Append
By default, returning a key overwrites it:
def update_answer(state):
    return {"answer": "New answer"}  # Replaces previous value
For lists where you want to append (like message history), use the Annotated pattern with operator.add:
from typing import Annotated
import operator
class ChatState(TypedDict):
    messages: Annotated[list, operator.add]  # Appends instead of replaces
Now returning {"messages": [new_message]} adds to the list instead of replacing it.
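You can see what the reducer does without building a graph: LangGraph merges updates by calling `reducer(existing_value, update)`, and with `operator.add` on lists that is plain concatenation:

```python
import operator

# LangGraph merges state updates by calling reducer(existing, update).
# With operator.add on a list field, that is simple concatenation:
existing = [{"role": "user", "content": "Hi"}]
update = [{"role": "assistant", "content": "Hello!"}]

merged = operator.add(existing, update)
print(len(merged))  # 2 - the update was appended, not overwritten
```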
The add_messages Helper
For chat-style agents, LangGraph provides add_messages:
from langgraph.graph.message import add_messages
from typing import Annotated
class State(TypedDict):
    messages: Annotated[list, add_messages]
This reducer appends new messages, replaces existing ones when their IDs match (which streaming and message edits rely on), and coerces plain dicts into the LangChain message objects downstream components expect.
Nodes: Where Work Happens
A node is a function that:
- Receives the current state
- Does some computation
- Returns a dictionary of state updates
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
def call_llm(state: AgentState) -> dict:
    """Node that calls the LLM"""
    messages = state["messages"]
    response = llm.invoke(messages)
    return {"messages": [response]}

def check_result(state: AgentState) -> dict:
    """Node that evaluates the response"""
    last_message = state["messages"][-1]
    # Some evaluation logic
    return {"needs_more_info": len(last_message.content) < 100}
Nodes can do anything: LLM calls, API requests, database queries, calculations. The only requirement is they take state and return updates.
Adding Nodes to the Graph
graph = StateGraph(AgentState)
graph.add_node("call_llm", call_llm)
graph.add_node("check_result", check_result)
graph.add_node("search_web", search_web)
Node names are strings. Use them to define edges.
Edges: Control Flow
Edges define how the agent moves between nodes.
Unconditional Edges
Always go from A to B:
graph.add_edge("call_llm", "check_result") # After LLM, always check
graph.add_edge(START, "call_llm") # Start at call_llm
graph.add_edge("format_answer", END) # End after formatting
Conditional Edges
Choose the next node based on state:
def route_after_check(state: AgentState) -> str:
    """Decide where to go based on state"""
    if state.get("needs_more_info"):
        return "search_web"
    return "format_answer"
graph.add_conditional_edges(
    "check_result",       # Source node
    route_after_check,    # Routing function
    {
        "search_web": "search_web",        # If function returns "search_web"
        "format_answer": "format_answer"   # If function returns "format_answer"
    }
)
The routing function receives state and returns a string matching one of the destination keys. This is how agents make decisions.
The Complete Pattern
from langgraph.graph import StateGraph, START, END
graph = StateGraph(AgentState)
# Add all nodes
graph.add_node("generate_query", generate_query)
graph.add_node("search", search)
graph.add_node("evaluate", evaluate)
graph.add_node("answer", answer)
# Entry point
graph.add_edge(START, "generate_query")
# Unconditional edges
graph.add_edge("generate_query", "search")
graph.add_edge("search", "evaluate")
# Conditional edge (the loop!)
graph.add_conditional_edges(
    "evaluate",
    should_continue,
    {
        "continue": "generate_query",  # Loop back
        "finish": "answer"             # Exit
    }
)
# Exit point
graph.add_edge("answer", END)
# Compile
app = graph.compile()
This creates a loop: generate query → search → evaluate → (continue? loop back : answer).
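The snippet above assumes a `should_continue` routing function. A minimal sketch (the `needs_more_info`, `iteration`, and `max_iterations` fields are hypothetical, not fixed names) might look like:

```python
# Hypothetical routing function for the loop above.
def should_continue(state: dict) -> str:
    # Hard stop: never loop past the iteration budget
    if state.get("iteration", 0) >= state.get("max_iterations", 3):
        return "finish"
    # Keep researching while the evaluator flags missing info
    if state.get("needs_more_info"):
        return "continue"
    return "finish"

print(should_continue({"iteration": 3, "needs_more_info": True}))  # finish
print(should_continue({"iteration": 1, "needs_more_info": True}))  # continue
```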
Building a ReAct Agent
The ReAct pattern (Reasoning + Acting) is the most common agent architecture. The agent:
- Thinks about what to do
- Decides whether to use a tool
- If yes, executes the tool
- Loops back to think about the result
- Eventually responds
Define Tools
from langchain_core.tools import tool
@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    # In production, use Tavily, SerpAPI, etc.
    return f"Search results for: {query}"

@tool
def get_weather(location: str) -> str:
    """Get current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"
tools = [search_web, get_weather]
Bind Tools to LLM
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
llm_with_tools = llm.bind_tools(tools)
Define the Agent Node
def agent(state: AgentState) -> dict:
    """The reasoning node - decides what to do next"""
    messages = state["messages"]
    response = llm_with_tools.invoke(messages)
    return {"messages": [response]}
Define the Tool Executor Node
from langgraph.prebuilt import ToolNode
tool_node = ToolNode(tools)
ToolNode handles parsing tool calls from the LLM response and executing them.
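Conceptually, ToolNode does something like the following. This is a simplified plain-Python sketch of the idea, not the real implementation; the `registry` dict and message shape are illustrative assumptions:

```python
# Simplified sketch of what ToolNode does: look up each requested tool,
# run it with the parsed arguments, and emit one tool message per call.
def run_tool_calls(tool_calls: list, registry: dict) -> list:
    messages = []
    for call in tool_calls:
        fn = registry[call["name"]]
        output = fn(**call["args"])
        messages.append({
            "role": "tool",
            "content": str(output),
            "tool_call_id": call["id"],
        })
    return messages

registry = {"get_weather": lambda location: f"Weather in {location}: 72°F"}
calls = [{"name": "get_weather", "args": {"location": "Tokyo"}, "id": "call_1"}]
print(run_tool_calls(calls, registry)[0]["content"])  # Weather in Tokyo: 72°F
```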
Define the Router
def should_use_tool(state: AgentState) -> str:
    """Check if the last message has tool calls"""
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "tools"
    return "end"
Wire It Together
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from typing import Annotated, TypedDict
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
# Build graph
graph = StateGraph(AgentState)
graph.add_node("agent", agent)
graph.add_node("tools", tool_node)
# Edges
graph.add_edge(START, "agent")
graph.add_conditional_edges(
    "agent",
    should_use_tool,
    {"tools": "tools", "end": END}
)
graph.add_edge("tools", "agent") # After tool, back to agent
app = graph.compile()
Run It
result = app.invoke({
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}]
})

for message in result["messages"]:
    print(f"{message.type}: {message.content}")
The agent will:
- Receive the question
- Decide to call get_weather
- Execute the tool
- Loop back with the result
- Formulate the final answer
- Exit
The Prebuilt ReAct Agent
LangGraph provides create_react_agent for common cases:
from langgraph.prebuilt import create_react_agent
app = create_react_agent(llm, tools)
result = app.invoke({
    "messages": [{"role": "user", "content": "Search for LangGraph tutorials"}]
})
This handles all the wiring automatically. Use it when the default ReAct pattern fits. Build custom graphs when you need more control.
Persistence: Memory That Survives
By default, state only exists during a single invocation. For multi-turn conversations or long-running tasks, you need persistence.
Checkpointers
A checkpointer saves state after every node execution:
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver() # In-memory (dev only)
app = graph.compile(checkpointer=checkpointer)
For production:
from langgraph.checkpoint.postgres import PostgresSaver
checkpointer = PostgresSaver.from_conn_string("postgresql://...")
app = graph.compile(checkpointer=checkpointer)
Thread IDs
Each conversation gets a thread_id. Pass it in the config:
config = {"configurable": {"thread_id": "user-123-conversation-1"}}
# First message
result = app.invoke(
    {"messages": [{"role": "user", "content": "Hi, I'm Alice"}]},
    config
)

# Second message - agent remembers the first
result = app.invoke(
    {"messages": [{"role": "user", "content": "What's my name?"}]},
    config
)
# Agent knows it's Alice because state was persisted
Time Travel
With checkpoints, you can inspect or revert to any previous state:
# Get all checkpoints for a thread
history = list(app.get_state_history(config))

# Load a specific checkpoint (checkpoint_id goes in the config)
old_state = app.get_state(
    {"configurable": {"thread_id": "...", "checkpoint_id": "abc123"}}
)

# Resume from that point
result = app.invoke(
    {"messages": [{"role": "user", "content": "Try something different"}]},
    {"configurable": {"thread_id": "...", "checkpoint_id": "abc123"}}
)
This is invaluable for debugging and error recovery.
Human-in-the-Loop
Real agents need human oversight. LangGraph provides two mechanisms: breakpoints and the interrupt function.
Static Breakpoints
Pause before or after specific nodes:
app = graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["execute_dangerous_action"],
    interrupt_after=["generate_plan"]
)
The graph pauses, saves state, and waits. You inspect, approve, and resume:
config = {"configurable": {"thread_id": "task-1"}}
# Run until interrupt
result = app.invoke({"messages": [...]}, config)
# Graph pauses after "generate_plan"
# Check what the agent wants to do
print(result["plan"])
# Resume execution
result = app.invoke(None, config) # None continues from checkpoint
Dynamic Interrupts with interrupt()
For more control, pause from within a node:
from langgraph.types import interrupt, Command
def review_action(state: AgentState) -> dict:
    """Pause and ask human for approval"""
    proposed_action = state["proposed_action"]

    # This pauses execution and returns to the caller
    human_decision = interrupt({
        "question": f"Approve this action? {proposed_action}",
        "options": ["approve", "reject", "edit"]
    })

    if human_decision == "reject":
        return {"status": "cancelled"}
    return {"approved": True}
Resume with Command:
from langgraph.types import Command
# Initial run (pauses at interrupt)
result = app.invoke({"messages": [...]}, config)
# Check what it's asking
print(result["__interrupt__"]) # Shows the interrupt payload
# Resume with human decision
result = app.invoke(
    Command(resume="approve"),  # Pass the decision
    config
)
The key insight: interrupt() saves the entire execution state. The agent can wait minutes, hours, or days. When you resume, it continues exactly where it left off.
Practical Example: Approve Before Write
def write_file(state: AgentState) -> dict:
    """Write to file with human approval"""
    filename = state["filename"]
    content = state["content"]

    # Ask for approval
    decision = interrupt({
        "action": "write_file",
        "filename": filename,
        "content_preview": content[:500],
        "message": "Approve this file write?"
    })

    if decision != "approve":
        return {"status": "write_cancelled"}

    # Proceed with write
    with open(filename, "w") as f:
        f.write(content)
    return {"status": "file_written", "path": filename}
This pattern works for any risky operation: database writes, API calls, emails, payments.
Building a Research Agent
Let's build something real: an agent that researches a topic by searching the web, evaluating results, and iterating until it has enough information.
State Definition
from typing import TypedDict, List, Optional, Annotated
from langgraph.graph.message import add_messages
class ResearchState(TypedDict):
    messages: Annotated[list, add_messages]
    query: str                  # Original question
    search_queries: List[str]   # Generated search terms
    search_results: List[dict]  # Results from web searches
    sources: List[str]          # URLs for citations
    gaps: List[str]             # Identified knowledge gaps
    iteration: int              # Current loop count
    max_iterations: int         # Limit to prevent infinite loops
    final_answer: Optional[str]
    _sufficient: bool           # Set by evaluate, read by the router
Node 1: Generate Search Queries
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
def generate_queries(state: ResearchState) -> dict:
    """Generate search queries based on the question and any gaps"""
    query = state["query"]
    gaps = state.get("gaps", [])

    prompt = f"""Generate 3 search queries to research this question:

Question: {query}
{"Knowledge gaps to address: " + str(gaps) if gaps else ""}

Return queries as a JSON list of strings."""

    response = llm.invoke([{"role": "user", "content": prompt}])

    # Parse response (in production, use structured output)
    import json
    queries = json.loads(response.content)
    return {"search_queries": queries}
Node 2: Execute Web Search
from langchain_community.tools import TavilySearchResults
search_tool = TavilySearchResults(max_results=3)
def search_web(state: ResearchState) -> dict:
    """Execute searches for all queries"""
    queries = state["search_queries"]
    all_results = []
    sources = []

    for query in queries:
        results = search_tool.invoke(query)
        for r in results:
            all_results.append({
                "query": query,
                "title": r.get("title", ""),
                "content": r.get("content", ""),
                "url": r.get("url", "")
            })
            if r.get("url"):
                sources.append(r["url"])

    return {
        "search_results": state.get("search_results", []) + all_results,
        "sources": list(set(state.get("sources", []) + sources))
    }
Node 3: Evaluate and Reflect
def evaluate_results(state: ResearchState) -> dict:
    """Analyze results and identify gaps"""
    query = state["query"]
    results = state["search_results"]
    iteration = state.get("iteration", 0)

    results_text = "\n\n".join([
        f"Source: {r['url']}\n{r['content'][:500]}"
        for r in results[-9:]  # Last 9 results (3 queries × 3 results)
    ])

    prompt = f"""Evaluate if we have enough information to answer this question:

Question: {query}

Search Results:
{results_text}

Respond with JSON:
{{
  "sufficient": true/false,
  "gaps": ["list of missing information if not sufficient"],
  "summary": "brief summary of what we know"
}}"""

    response = llm.invoke([{"role": "user", "content": prompt}])

    import json
    evaluation = json.loads(response.content)

    return {
        "gaps": evaluation.get("gaps", []),
        "iteration": iteration + 1,
        "_sufficient": evaluation["sufficient"]  # Used for routing
    }
Node 4: Generate Final Answer
def generate_answer(state: ResearchState) -> dict:
    """Synthesize research into final answer"""
    query = state["query"]
    results = state["search_results"]
    sources = state["sources"]

    results_text = "\n\n".join([
        f"[{i+1}] {r['content'][:1000]}"
        for i, r in enumerate(results[:12])
    ])

    prompt = f"""Based on this research, answer the question.

Question: {query}

Research:
{results_text}

Requirements:
- Provide a comprehensive answer
- Include citations using [1], [2], etc.
- Be factual and balanced"""

    response = llm.invoke([{"role": "user", "content": prompt}])

    # Add source list
    source_list = "\n\nSources:\n" + "\n".join([
        f"[{i+1}] {url}" for i, url in enumerate(sources[:12])
    ])

    return {"final_answer": response.content + source_list}
Routing Function
def should_continue_research(state: ResearchState) -> str:
    """Decide whether to continue searching or finalize"""
    # Check if we have enough info
    if state.get("_sufficient", False):
        return "answer"

    # Check iteration limit
    if state.get("iteration", 0) >= state.get("max_iterations", 3):
        return "answer"

    # Check if we have gaps to address
    if state.get("gaps"):
        return "search_more"

    return "answer"
Assemble the Graph
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
# Build graph
graph = StateGraph(ResearchState)
# Add nodes
graph.add_node("generate_queries", generate_queries)
graph.add_node("search", search_web)
graph.add_node("evaluate", evaluate_results)
graph.add_node("answer", generate_answer)
# Entry
graph.add_edge(START, "generate_queries")
# Flow
graph.add_edge("generate_queries", "search")
graph.add_edge("search", "evaluate")
# Conditional: continue or finish
graph.add_conditional_edges(
    "evaluate",
    should_continue_research,
    {
        "search_more": "generate_queries",  # Loop back
        "answer": "answer"                  # Exit to answer
    }
)
# Exit
graph.add_edge("answer", END)
# Compile with persistence
checkpointer = MemorySaver()
app = graph.compile(checkpointer=checkpointer)
Run the Research Agent
config = {"configurable": {"thread_id": "research-1"}}
result = app.invoke({
    "messages": [],
    "query": "What are the latest developments in quantum computing?",
    "search_queries": [],
    "search_results": [],
    "sources": [],
    "gaps": [],
    "iteration": 0,
    "max_iterations": 3,
    "final_answer": None
}, config)
print(result["final_answer"])
The agent will:
- Generate initial search queries
- Search the web
- Evaluate if the results are sufficient
- If not, identify gaps and search again
- Loop up to 3 times
- Generate final answer with citations
Visualizing the Graph
from IPython.display import Image, display
display(Image(app.get_graph().draw_mermaid_png()))
This outputs a flowchart showing nodes and edges, helpful for debugging.
Streaming
For long-running agents, stream intermediate results:
# Stream all state updates
for chunk in app.stream(initial_state, config):
    print(chunk)

# Stream specific events
for event in app.stream(initial_state, config, stream_mode="updates"):
    node_name = list(event.keys())[0]
    node_output = event[node_name]
    print(f"{node_name}: {node_output}")
This lets you show progress to users as the agent works.
Production Patterns
Error Handling
Wrap node logic in try/except and update state with error info:
def search_with_retry(state: ResearchState) -> dict:
    """Search with error handling"""
    try:
        results = search_tool.invoke(state["search_queries"][0])
        return {"search_results": results, "error": None}
    except Exception as e:
        return {
            "search_results": [],
            "error": str(e),
            "retry_count": state.get("retry_count", 0) + 1
        }
Add conditional edges for error recovery:
def route_after_search(state):
    if state.get("error") and state.get("retry_count", 0) < 3:
        return "retry"
    elif state.get("error"):
        return "fallback"
    return "continue"
Parallel Execution
LangGraph can run independent nodes in parallel. When using conditional edges with Send, you can spawn multiple branches:
from langgraph.types import Send
def distribute_searches(state):
    """Create parallel search tasks"""
    queries = state["search_queries"]
    return [Send("search_single", {"query": q}) for q in queries]
graph.add_conditional_edges("generate_queries", distribute_searches)
Each Send creates a parallel branch. Results merge back into state.
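The worker node targeted by Send isn't shown above. A sketch of what it could look like (the stubbed search and the `search_results` key are assumptions for illustration):

```python
# Hypothetical worker node for one fan-out branch. Each Send delivers its own
# payload as this node's input state; a real version would call a search API.
def search_single(state: dict) -> dict:
    query = state["query"]
    hits = [{"query": query, "content": f"stubbed result for {query}"}]
    # Return via a key with an appending reducer (e.g. operator.add)
    # so parallel branches merge instead of overwriting each other.
    return {"search_results": hits}

print(search_single({"query": "langgraph send"})["search_results"][0]["query"])
```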
Subgraphs
For complex agents, compose smaller graphs:
# Define a subgraph
search_graph = StateGraph(SearchState)
# ... add nodes and edges ...
search_app = search_graph.compile()

# Use it as a node in parent graph
def search_node(state):
    result = search_app.invoke({"query": state["query"]})
    return {"search_results": result["results"]}

parent_graph.add_node("search", search_node)
This keeps code modular and testable.
Common Pitfalls
1. Forgetting the Checkpointer for Interrupts
Interrupts require persistence. Without a checkpointer, the state is lost:
# WRONG - interrupts won't work
app = graph.compile()
# RIGHT
app = graph.compile(checkpointer=MemorySaver())
2. Infinite Loops
Always include exit conditions:
def should_continue(state):
    if state["iteration"] >= state["max_iterations"]:
        return "end"  # Force exit
    # ... rest of logic
3. State Schema Mismatches
Nodes must return keys that exist in your state schema:
class State(TypedDict):
    messages: list

def bad_node(state):
    return {"invalid_key": "value"}  # Will fail!
4. Not Handling Tool Errors
Tools can fail. Handle it:
def execute_tool(state):
    try:
        result = tool.invoke(state["tool_input"])
        return {"tool_result": result}
    except Exception as e:
        return {"tool_error": str(e), "tool_result": None}
When to Use LangGraph
Good fit:
- Agents that need loops (search → evaluate → search again)
- Multi-step workflows with branching
- Human-in-the-loop requirements
- Long-running tasks needing persistence
- Complex tool orchestration
Maybe overkill:
- Simple LLM calls without tools
- Linear pipelines (A → B → C)
- Stateless request/response patterns
For simpler cases, LangChain's basic chains or direct API calls might be cleaner. LangGraph shines when your agent logic is genuinely complex.
Multi-Agent Systems
When one agent isn't enough, LangGraph supports multi-agent architectures.
Supervisor Pattern
One agent coordinates multiple specialist agents:
from typing import Literal
from langgraph.types import Command
class SupervisorState(TypedDict):
    messages: Annotated[list, add_messages]
    next_agent: str

def supervisor(state: SupervisorState) -> Command:
    """Decide which agent should handle the task"""
    messages = state["messages"]

    prompt = """You are a supervisor managing these agents:
- researcher: searches the web for information
- analyst: analyzes data and provides insights
- writer: writes reports and summaries

Based on the conversation, which agent should act next?
Or respond FINISH if the task is complete.
Respond with just the agent name or FINISH."""

    response = llm.invoke([{"role": "system", "content": prompt}] + messages)
    next_agent = response.content.strip().lower()

    if next_agent == "finish":
        return Command(goto=END)
    return Command(goto=next_agent)
# Build the graph
graph = StateGraph(SupervisorState)
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher_agent)
graph.add_node("analyst", analyst_agent)
graph.add_node("writer", writer_agent)
# All agents report back to supervisor
graph.add_edge("researcher", "supervisor")
graph.add_edge("analyst", "supervisor")
graph.add_edge("writer", "supervisor")
# Supervisor routes to agents
graph.add_edge(START, "supervisor")
Hierarchical Teams
For complex tasks, nest supervisors:
Top Supervisor
├── Research Team Supervisor
│ ├── Web Researcher
│ └── Document Analyst
└── Writing Team Supervisor
├── Copywriter
└── Editor
Each team is its own subgraph. The top supervisor delegates to team supervisors.
Message Passing Between Agents
Agents communicate through the shared state:
class MultiAgentState(TypedDict):
    messages: Annotated[list, add_messages]
    research_findings: List[str]
    analysis_results: dict
    draft: str
    feedback: List[str]

def researcher(state):
    # Do research
    findings = perform_research(state["messages"][-1].content)
    return {"research_findings": findings}

def analyst(state):
    # Analyze the research
    findings = state["research_findings"]
    analysis = analyze(findings)
    return {"analysis_results": analysis}

def writer(state):
    # Write based on analysis
    analysis = state["analysis_results"]
    draft = write_report(analysis)
    return {"draft": draft}
Each agent reads from what previous agents wrote and adds its contribution.
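Stripped of the graph machinery, that handoff is just each node layering its keys onto shared state. A runnable miniature with the undefined helpers (`perform_research`, `analyze`, `write_report`) stubbed out:

```python
# Miniature of the researcher -> analyst -> writer handoff, with the
# research/analysis/writing helpers replaced by stubs.
def researcher(state: dict) -> dict:
    return {"research_findings": ["finding A", "finding B"]}

def analyst(state: dict) -> dict:
    return {"analysis_results": {"num_findings": len(state["research_findings"])}}

def writer(state: dict) -> dict:
    return {"draft": f"Report covering {state['analysis_results']['num_findings']} findings"}

state: dict = {}
for node in (researcher, analyst, writer):
    state.update(node(state))  # each agent adds its contribution

print(state["draft"])  # Report covering 2 findings
```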
Advanced Tool Patterns
Tools That Modify State
Sometimes tools need to update agent state directly:
from langgraph.types import Command
def counter_tool(state):
    """Tool that increments a counter in state"""
    current = state.get("tool_call_count", 0)

    # Return Command to update state AND specify next node
    return Command(
        update={"tool_call_count": current + 1},
        goto="agent"  # Go back to agent after
    )
Dynamic Tool Loading
Load tools based on context:
def select_tools(state):
    """Dynamically select which tools to make available"""
    topic = state.get("topic", "general")

    if topic == "coding":
        return [code_executor, linter, git_tool]
    elif topic == "research":
        return [web_search, arxiv_search, wiki_search]
    else:
        return [web_search, calculator]

def agent_with_dynamic_tools(state):
    tools = select_tools(state)
    llm_with_tools = llm.bind_tools(tools)
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}
Tool Approval Workflow
Require human approval for sensitive tools:
def execute_tool_with_approval(state):
    tool_call = state["pending_tool_call"]

    # Check if this tool needs approval
    sensitive_tools = ["delete_file", "send_email", "make_payment"]

    if tool_call["name"] in sensitive_tools:
        # Interrupt for approval
        decision = interrupt({
            "tool": tool_call["name"],
            "args": tool_call["args"],
            "message": "This action requires approval"
        })
        if decision != "approve":
            return {"tool_result": "Action cancelled by user"}

    # Execute the tool
    result = execute(tool_call)
    return {"tool_result": result}
Debugging and Observability
LangSmith Integration
Track every step of your agent:
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-key"
# Your graph runs as normal, but everything is traced
result = app.invoke({"messages": [...]})
LangSmith shows:
- Every node execution
- Token usage per LLM call
- Tool inputs and outputs
- State changes at each step
- Latency breakdown
Inspecting State
Check state at any point:
# Get current state for a thread
state = app.get_state(config)
print(state.values)  # Current state values
print(state.next)    # Next node(s) to execute

# Get state history (each snapshot records its step in metadata)
for state in app.get_state_history(config):
    print(f"Step {state.metadata['step']}: {state.next}")
Custom Logging
Add logging inside nodes:
import logging
logger = logging.getLogger("research_agent")
def search_web(state):
    logger.info(f"Searching for: {state['search_queries']}")
    results = perform_search(state["search_queries"])
    logger.info(f"Found {len(results)} results")
    logger.debug(f"Results: {results}")
    return {"search_results": results}
Performance Optimization
Caching
Cache expensive operations:
from functools import lru_cache
@lru_cache(maxsize=100)
def cached_search(query: str) -> str:
    return search_tool.invoke(query)

def search_node(state):
    results = []
    for query in state["queries"]:
        result = cached_search(query)  # Uses cache if available
        results.append(result)
    return {"results": results}
Batching LLM Calls
When possible, batch multiple calls:
def evaluate_multiple(state):
    """Evaluate all results in one LLM call instead of many"""
    results = state["search_results"]

    prompt = f"""Evaluate each of these search results:

{json.dumps(results, indent=2)}

For each result, provide:
- relevance (1-10)
- key_facts extracted

Return as JSON array."""

    response = llm.invoke([{"role": "user", "content": prompt}])
    evaluations = json.loads(response.content)
    return {"evaluations": evaluations}
Limiting Context Size
For long conversations, trim history:
from langchain_core.messages import trim_messages
def agent(state):
    messages = state["messages"]

    # Keep only recent messages + system message
    trimmed = trim_messages(
        messages,
        max_tokens=4000,
        strategy="last",
        token_counter=llm,
        include_system=True
    )

    response = llm.invoke(trimmed)
    return {"messages": [response]}
Testing LangGraph Agents
Unit Testing Nodes
Test nodes in isolation:
import pytest
def test_generate_queries():
    state = {
        "query": "What is quantum computing?",
        "gaps": []
    }
    result = generate_queries(state)

    assert "search_queries" in result
    assert len(result["search_queries"]) >= 1
    assert all(isinstance(q, str) for q in result["search_queries"])

def test_evaluate_with_sufficient_info():
    state = {
        "query": "Simple question",
        "search_results": [{"url": "https://example.com", "content": "Complete answer..."}],
        "iteration": 0
    }
    result = evaluate_results(state)
    assert result["_sufficient"] is True
Integration Testing
Test the full graph:
def test_research_agent_completes():
    app = build_research_agent()
    result = app.invoke({
        "query": "What is Python?",
        "max_iterations": 2,
        # ... other initial state
    })

    assert result["final_answer"] is not None
    assert len(result["sources"]) > 0
    assert result["iteration"] <= 2

def test_research_agent_handles_no_results():
    # Mock search to return empty
    from unittest.mock import patch

    with patch.object(search_tool, "invoke", return_value=[]):
        result = app.invoke({...})

    # Should still produce an answer (possibly "not found")
    assert result["final_answer"] is not None
Testing Interrupts
def test_interrupt_pauses_execution():
    graph = build_agent_with_approval()  # returns the uncompiled graph
    app = graph.compile(checkpointer=MemorySaver())
    config = {"configurable": {"thread_id": "test-1"}}

    # Run until interrupt
    result = app.invoke(
        {"task": "delete important file"},
        config
    )

    # Should be interrupted
    assert "__interrupt__" in result

    # Resume with rejection
    result = app.invoke(Command(resume="reject"), config)

    # File should not be deleted
    assert result["status"] == "cancelled"
Summary
LangGraph models agents as state machines:
- State: Everything the agent knows, updated as it works
- Nodes: Functions that do work and update state
- Edges: Transitions between nodes, conditional or fixed
- Checkpointers: Persistence for long-running tasks
- Interrupts: Human oversight at any point
The power comes from conditional edges creating loops. The agent can reason, act, observe, and decide whether to continue or exit. This matches how humans actually solve problems: try something, check if it worked, adjust, repeat.
Start with the prebuilt create_react_agent for simple tool use. Graduate to custom graphs when you need specific control over the flow. Add persistence when conversations span multiple sessions. Add interrupts when humans need to stay in the loop.
For teams building production agents that need reliable evaluation, Prem Studio provides fine-tuning and evaluation tools that integrate with agentic workflows built on any framework.