Agentic AI Frameworks: Mastering LangChain/LangGraph for Smart Agents
1. Introduction to Agentic AI
The world of Artificial Intelligence is evolving at an unprecedented pace. We’re moving beyond simple chatbots and static question-answering systems towards intelligent entities that can think, plan, use tools, and even collaborate to achieve complex goals. This is the realm of Agentic AI.
1.1. What are AI Agents?
Imagine a digital assistant that doesn’t just answer your questions but understands your intent, plans a series of steps to achieve it, uses tools (like searching the web or interacting with an API) to gather information or perform actions, and learns from its experiences. That’s an AI agent.
Unlike a single LLM call that gives a one-off response, an AI agent operates in a deliberative loop:
- Perceive: Takes in information (user input, tool outputs).
- Reason: Interprets the information, plans, and decides on the next step.
- Act: Executes a chosen action (e.g., uses a tool, asks for clarification, generates a final answer).
- Observe: Receives feedback from the action.
- Reflect: Uses observations to refine its understanding and plan for the next iteration.
This continuous cycle allows agents to tackle multi-step problems that a single LLM call cannot.
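The loop above can be sketched in plain Python without any framework. This is only an illustration: fake_reasoner is a hypothetical stand-in for the LLM, and the lookup tool and its hard-coded data are assumptions made so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM: it decides the next step from the history.
def fake_reasoner(history: List[str]) -> Tuple[str, str]:
    """Return (action, argument). Uses the tool once, then finishes."""
    observations = [h for h in history if h.startswith("observation: ")]
    if not observations:
        return "lookup", "capital of France"
    return "finish", observations[-1].removeprefix("observation: ")

# Hypothetical tool table: tool name -> callable taking one string.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda query: {"capital of France": "Paris"}.get(query, "unknown"),
}

def run_agent(user_input: str, max_steps: int = 5) -> str:
    history = [f"user: {user_input}"]               # Perceive
    for _ in range(max_steps):
        action, argument = fake_reasoner(history)   # Reason
        if action == "finish":
            return argument
        result = TOOLS[action](argument)            # Act
        history.append(f"observation: {result}")    # Observe; the next pass reflects on it
    return "Stopped: step limit reached."

print(run_agent("What is the capital of France?"))
```

The max_steps guard is a design choice worth keeping in any real loop: it stops an agent that never decides it is finished.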
1.2. Key Components of an AI Agent
Every effective AI agent, regardless of its complexity, relies on four fundamental components working together:
1.2.1. Large Language Model (LLM)
The brain of the agent. The LLM is responsible for:
- Understanding: Interpreting user queries and task descriptions.
- Reasoning: Generating thoughts, plans, and decisions.
- Action Selection: Deciding which tool to use, if any.
- Output Generation: Crafting the final response or an intermediate thought.
1.2.2. Memory
Memory gives the agent context and enables stateful interactions. It’s how an agent “remembers” past conversations, previous observations, intermediate results, or learned facts. Without memory, each interaction would be isolated.
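As a rough picture of what a conversation memory does, the sketch below keeps a bounded buffer of recent messages and renders them as text for the next prompt. The class name BufferMemory and its methods are hypothetical and are not part of LangChain.

```python
from collections import deque

class BufferMemory:
    """Keeps only the most recent messages; older ones fall off the front."""

    def __init__(self, max_messages: int = 4):
        self._messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self._messages.append((role, content))

    def as_prompt(self) -> str:
        """Render the stored messages as text to prepend to the next prompt."""
        return "\n".join(f"{role}: {content}" for role, content in self._messages)

memory = BufferMemory(max_messages=4)
memory.add("user", "Hi, my name is Alex.")
memory.add("assistant", "Nice to meet you, Alex.")
memory.add("user", "What was my name again?")
print(memory.as_prompt())
```

Capping the buffer is one simple way to keep the prompt within the model's context window; summarizing older turns is another.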
1.2.3. Tools
Tools are the arms and legs of the agent. They extend the LLM’s capabilities beyond its training data, allowing it to:
- Access real-time information (e.g., web search, stock prices).
- Perform calculations (e.g., a calculator tool).
- Interact with external systems (e.g., APIs, databases, UI automation).
- Execute code (e.g., a Python interpreter).
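Conceptually, a tool is a named function plus a description the LLM can read when choosing what to call. The following framework-free sketch illustrates that idea; SimpleTool, describe_tools, and call_tool are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class SimpleTool:
    name: str
    description: str
    func: Callable[[str], str]

def word_count(text: str) -> str:
    return str(len(text.split()))

def shout(text: str) -> str:
    return text.upper()

TOOLS: Dict[str, SimpleTool] = {
    t.name: t
    for t in [
        SimpleTool("word_count", "Counts the words in a text.", word_count),
        SimpleTool("shout", "Upper-cases a text.", shout),
    ]
}

def describe_tools() -> str:
    """Text an LLM would be shown so it can pick a tool by name."""
    return "\n".join(f"{t.name}: {t.description}" for t in TOOLS.values())

def call_tool(name: str, tool_input: str) -> str:
    """Dispatch by name; an unknown name becomes an error message, not a crash."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'."
    return tool.func(tool_input)

print(describe_tools())
print(call_tool("word_count", "agents use tools"))
```

Returning an error string for an unknown tool lets the model see the mistake as an observation and try again.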
1.2.4. Agent Executor / Planning
This is the orchestrator that drives the agent’s deliberative loop. It’s responsible for:
- Managing the sequence of thoughts, actions, and observations.
- Invoking the LLM for reasoning and action selection.
- Calling the appropriate tools.
- Handling tool outputs and feeding them back to the LLM.
- Deciding when the task is complete.
These components are what agentic AI frameworks like LangChain, LangGraph, and CrewAI help us manage and build upon.
2. Getting Started with LangChain: Building Your First Agent
LangChain is a leading framework for developing LLM-powered applications. It provides a structured way to combine LLMs with other components, especially tools and memory, to create intelligent agents.
2.1. Setting Up Your Environment
Let’s prepare our Python environment.
2.1.1. Installation
We recommend using a virtual environment to manage dependencies.
# 1. Create a virtual environment
python -m venv agentic_env
# 2. Activate the virtual environment
# On macOS/Linux:
source agentic_env/bin/activate
# On Windows:
# .\agentic_env\Scripts\activate
# 3. Install core LangChain packages and a general LLM provider (e.g., OpenAI)
pip install langchain langchain-openai python-dotenv
# 4. For web search tool, install duckduckgo-search
pip install -U langchain-community duckduckgo-search
# 5. For LangGraph (later in the document)
pip install langgraph
# 6. For CrewAI (later in the document)
pip install crewai
2.1.2. API Keys (for Hosted LLMs)
If you plan to use hosted LLMs like OpenAI’s GPT models, you’ll need an API key. Store it securely in a .env file in your project’s root directory.
Create a file named .env:
OPENAI_API_KEY="your_openai_api_key_here"
# ANTHROPIC_API_KEY="your_anthropic_api_key_here" # If using Anthropic
In your Python code, you’ll load this file:
import os
from dotenv import load_dotenv
load_dotenv() # This loads variables from .env into your environment
# openai_api_key = os.getenv("OPENAI_API_KEY") # You usually don't need to explicitly assign, LangChain will find it.
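A missing key usually surfaces later as an authentication error deep inside a request, so it can help to fail fast at startup. The helper below is a small sketch using only the standard library; require_env and DEMO_API_KEY are hypothetical names, and the demo value is set in code purely so the snippet runs on its own.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value

# Demo value so the sketch runs standalone; in a real script you would call
# load_dotenv() first and then require_env("OPENAI_API_KEY").
os.environ.setdefault("DEMO_API_KEY", "demo-value")
require_env("DEMO_API_KEY")
print("DEMO_API_KEY is set")
```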
2.1.3. Local LLM Integration with Ollama
For privacy, cost savings, or offline capabilities, running LLMs locally with Ollama is excellent.
Step 1: Install Ollama. Download and install Ollama from its official website: https://ollama.com/
Step 2: Pull an LLM model. After installation, open your terminal and pull a model. Llama 3 is a good all-rounder.
ollama pull llama3
# You can also pull other models like 'mistral', 'gemma', etc.
Step 3: Install LangChain Ollama Integration
pip install langchain-community
Now you can use Ollama models in LangChain:
from langchain_community.chat_models import ChatOllama
# Ensure 'llama3' is pulled and running in Ollama (ollama run llama3 in a separate terminal)
local_llm = ChatOllama(model="llama3", temperature=0)
# Test it
# print(local_llm.invoke("What is the capital of Canada?"))
Throughout this document, I’ll generally use ChatOpenAI for simplicity, but you can almost always swap it with ChatOllama by uncommenting the relevant lines and ensuring Ollama is running.
2.2. Hands-on: Basic LLM Interaction (Hello, LangChain!)
Let’s start with the absolute basics: making a single call to an LLM using LangChain. This demonstrates the LLM and Prompt components.
Scenario: Ask an LLM a simple question and get a response.
basic_llm_interaction.py
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI # For OpenAI
# from langchain_community.chat_models import ChatOllama # For Ollama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# Load environment variables from .env file (for OpenAI API key)
load_dotenv()
def run_basic_llm_interaction():
    print("--- Running Basic LLM Interaction ---")
    # 1. Initialize the LLM
    # Using OpenAI's GPT-3.5-turbo. Make sure OPENAI_API_KEY is set in .env.
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # --- To use a local Ollama model instead, uncomment the following line ---
    # Make sure Ollama is installed and 'llama3' is pulled (ollama pull llama3)
    # And run 'ollama run llama3' in a separate terminal before executing this script.
    # llm = ChatOllama(model="llama3", temperature=0)
    # -------------------------------------------------------------------------
    # 2. Define a simple prompt template
    # This guides the LLM on how to respond.
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a concise and helpful AI assistant."),
        ("user", "{input}")
    ])
    # 3. Define an output parser
    # Converts the LLM's raw output into a simple string.
    output_parser = StrOutputParser()
    # 4. Create a simple chain
    # Chains connect components: Prompt -> LLM -> Output Parser
    chain = prompt | llm | output_parser
    # 5. Invoke the chain with a user question
    question = "What is the capital of France?"
    print(f"\nUser Question: {question}")
    response = chain.invoke({"input": question})
    print(f"AI Response: {response}")
    # Another example
    question_two = "What is the primary benefit of using Python for data science?"
    print(f"\nUser Question: {question_two}")
    response_two = chain.invoke({"input": question_two})
    print(f"AI Response: {response_two}")
if __name__ == "__main__":
    run_basic_llm_interaction()
To Run:
- Save the code as basic_llm_interaction.py.
- Make sure your .env file is set up (if using OpenAI) or Ollama is running (if using Ollama).
- Execute from your terminal:
python basic_llm_interaction.py
Expected Output (similar to):
--- Running Basic LLM Interaction ---
User Question: What is the capital of France?
AI Response: The capital of France is Paris.
User Question: What is the primary benefit of using Python for data science?
AI Response: Its extensive ecosystem of libraries (like NumPy, Pandas, Scikit-learn, TensorFlow, PyTorch) and its readability.
Key Takeaway: Even simple LLM interactions benefit from LangChain’s modular Prompt, LLM, and Output Parser components, which can be easily chained together using the | operator.
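To see why the | operator can chain components, it helps to know that Python lets a class define __or__. The sketch below is not LangChain's implementation; Step is a hypothetical class, and the "LLM" is a fake that upper-cases its input so the example runs offline.

```python
class Step:
    """A minimal composable unit: wraps a function and supports `a | b`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `self | other` builds a new Step that runs self, then feeds other.
        return Step(lambda value: other.invoke(self.invoke(value)))

prompt = Step(lambda data: f"Answer concisely: {data['input']}")
fake_llm = Step(lambda text: {"content": text.upper()})  # stand-in for a model call
parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"input": "hello"}))  # ANSWER CONCISELY: HELLO
```

Each stage only needs to agree on the shape of the value it receives from the previous one, which is what makes the pieces swappable.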
2.3. Hands-on: Building an Agent with Built-in Tools (Web Search)
This is where the power of agents begins! We’ll enable our LLM to use a web search tool to answer questions it doesn’t know off-hand. This introduces the Tools and Agent Executor concepts.
Scenario: Create an agent that can answer current event questions by performing a web search.
web_search_agent.py
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun # Our web search tool
from langchain.agents import AgentExecutor, create_react_agent
from langchain import hub # To pull standard agent prompts
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# Load environment variables
load_dotenv()
def run_web_search_agent():
    print("--- Running Web Search Agent ---")
    # 1. Initialize the LLM (for reasoning and answering)
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # llm = ChatOllama(model="llama3", temperature=0) # For Ollama
    # 2. Define the tools the agent can use
    # DuckDuckGoSearchRun is a simple, no-API-key-needed search tool.
    search_tool = DuckDuckGoSearchRun(name="duckduckgo_search")
    tools = [search_tool]
    # 3. Get the standard ReAct agent prompt from LangChain Hub
    # This prompt instructs the LLM on how to "Think", "Act", "Observe", and "Final Answer".
    prompt = hub.pull("hwchase17/react")
    # 4. Create the agent
    # `create_react_agent` is a factory function that sets up the LLM, tools, and prompt
    # to follow the ReAct (Reasoning and Acting) pattern.
    agent = create_react_agent(llm, tools, prompt)
    # 5. Create the Agent Executor
    # The AgentExecutor is the runtime that drives the agent's decision-making loop.
    # `verbose=True` shows the agent's internal thoughts and actions, which is crucial for understanding!
    # `handle_parsing_errors=True` allows the agent to try and correct its output if it makes a mistake.
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True
    )
    # 6. Invoke the agent with questions
    print("\n--- Question 1: Current Event ---")
    question_1 = "What is the current capital of Australia?"
    print(f"User: {question_1}")
    response_1 = agent_executor.invoke({"input": question_1})
    print(f"\nAgent's Final Answer: {response_1['output']}")
    print("\n--- Question 2: General Knowledge (should not need tool) ---")
    question_2 = "What is the chemical symbol for water?"
    print(f"User: {question_2}")
    response_2 = agent_executor.invoke({"input": question_2})
    print(f"\nAgent's Final Answer: {response_2['output']}")
if __name__ == "__main__":
    run_web_search_agent()
To Run:
- Save the code as web_search_agent.py.
- Ensure prerequisites are installed (langchain-community, duckduckgo-search).
- Make sure your .env file is set up (if using OpenAI) or Ollama is running.
- Execute:
python web_search_agent.py
Expected Output (showing agent’s thought process for Question 1):
--- Running Web Search Agent ---
--- Question 1: Current Event ---
User: What is the current capital of Australia?
> Entering new AgentExecutor chain...
Thought: I need to find the current capital of Australia. This is a factual question that might require up-to-date information, so I should use a search tool.
Action: duckduckgo_search
Action Input: current capital of Australia
Observation: The capital of Australia is Canberra.
Thought: I now know the final answer.
Final Answer: The current capital of Australia is Canberra.
Agent's Final Answer: The current capital of Australia is Canberra.
--- Question 2: General Knowledge (should not need tool) ---
User: What is the chemical symbol for water?
> Entering new AgentExecutor chain...
Thought: The user is asking for the chemical symbol for water. This is a common knowledge question that I should be able to answer directly without needing to use any tools.
Final Answer: The chemical symbol for water is H2O.
Agent's Final Answer: The chemical symbol for water is H2O.
Key Takeaway: The AgentExecutor brings the LLM and tools to life. The verbose=True output clearly shows the LLM’s internal “Thought” process, its “Action” (tool call), the “Action Input,” and the “Observation” (tool’s output), which is then used to form the “Final Answer.” The agent intelligently decides when to use a tool and when to answer from its internal knowledge.
2.4. Hands-on: Creating and Using Custom Tools (Simple Calculator)
While built-in tools are handy, the real power comes from giving your agents access to your custom functions and APIs.
Scenario: Create an agent that can perform basic arithmetic using custom add and multiply tools.
custom_calculator_agent.py
import json
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain.tools import tool # Decorator to easily create tools
from langchain.agents import AgentExecutor, create_react_agent
from langchain import hub
# Load environment variables
load_dotenv()
# The ReAct agent hands each tool its "Action Input" as ONE string, so the tools
# below accept a JSON string and parse the two operands themselves.
def _parse_operands(tool_input: str) -> tuple[int, int]:
    """Parse a JSON object like {"a": 1, "b": 2} into two integers."""
    data = json.loads(tool_input)
    return int(data["a"]), int(data["b"])
# 1. Define custom tools using the @tool decorator
@tool
def add(tool_input: str) -> int:
    """Adds two integers. Input must be a JSON object such as {"a": 1, "b": 2}."""
    a, b = _parse_operands(tool_input)
    print(f"\n--- Tool Call: add({a}, {b}) ---")
    return a + b
@tool
def multiply(tool_input: str) -> int:
    """Multiplies two integers. Input must be a JSON object such as {"a": 1, "b": 2}."""
    a, b = _parse_operands(tool_input)
    print(f"\n--- Tool Call: multiply({a}, {b}) ---")
    return a * b
def run_custom_calculator_agent():
    print("--- Running Custom Calculator Agent ---")
    # Initialize LLM
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # llm = ChatOllama(model="llama3", temperature=0) # For Ollama
    # 2. Bundle the custom tools
    tools = [add, multiply]
    # 3. Get the ReAct prompt
    prompt = hub.pull("hwchase17/react")
    # 4. Create the agent
    agent = create_react_agent(llm, tools, prompt)
    # 5. Create the Agent Executor
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True
    )
    # 6. Invoke the agent with mathematical questions
    print("\n--- Question 1: Simple Addition ---")
    question_1 = "What is 1500 plus 2345?"
    print(f"User: {question_1}")
    response_1 = agent_executor.invoke({"input": question_1})
    print(f"\nAgent's Final Answer: {response_1['output']}")
    print("\n--- Question 2: Multi-step calculation ---")
    question_2 = "Calculate (75 multiplied by 12) then add 50."
    print(f"User: {question_2}")
    response_2 = agent_executor.invoke({"input": question_2})
    print(f"\nAgent's Final Answer: {response_2['output']}")
    print("\n--- Question 3: Non-math question (should not use tool) ---")
    question_3 = "What color is the sky on a clear day?"
    print(f"User: {question_3}")
    response_3 = agent_executor.invoke({"input": question_3})
    print(f"\nAgent's Final Answer: {response_3['output']}")
if __name__ == "__main__":
    run_custom_calculator_agent()
To Run:
- Save the code as custom_calculator_agent.py.
- Execute:
python custom_calculator_agent.py
Expected Output (showing tool calls):
--- Running Custom Calculator Agent ---
--- Question 1: Simple Addition ---
User: What is 1500 plus 2345?
> Entering new AgentExecutor chain...
Thought: The user is asking to add two numbers. I have an `add` tool that can perform this operation.
Action: add
Action Input: {"a": 1500, "b": 2345}
--- Tool Call: add(1500, 2345) ---
Observation: 3845
Thought: I have successfully added the numbers. I now know the final answer.
Final Answer: 3845
Agent's Final Answer: 3845
--- Question 2: Multi-step calculation ---
User: Calculate (75 multiplied by 12) then add 50.
> Entering new AgentExecutor chain...
Thought: The user wants me to perform two mathematical operations: first multiplication, then addition. I should start with multiplication using the `multiply` tool.
Action: multiply
Action Input: {"a": 75, "b": 12}
--- Tool Call: multiply(75, 12) ---
Observation: 900
Thought: I have multiplied 75 by 12, which resulted in 900. Now I need to add 50 to this result using the `add` tool.
Action: add
Action Input: {"a": 900, "b": 50}
--- Tool Call: add(900, 50) ---
Observation: 950
Thought: I have successfully performed both operations. I now know the final answer.
Final Answer: 950
Agent's Final Answer: 950
--- Question 3: Non-math question (should not use tool) ---
User: What color is the sky on a clear day?
> Entering new AgentExecutor chain...
Thought: The user is asking a general knowledge question about the color of the sky. This does not require any of my math tools. I can answer this directly.
Final Answer: On a clear day, the sky is typically blue.
Agent's Final Answer: On a clear day, the sky is typically blue.
Key Takeaway: The @tool decorator simplifies tool creation. The LLM, guided by the ReAct prompt, intelligently chained multiple tool calls (multiply then add) to solve a multi-step problem. The agent’s ability to reason about tool selection is evident.
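For intuition about what a tool decorator has to collect, the sketch below derives a name, description, and argument list from an ordinary function using the standard inspect module. It is an illustration only and not how LangChain's @tool is implemented; simple_tool and the tool_* attributes are hypothetical.

```python
import inspect

def simple_tool(func):
    """Attach a name, description, and argument types derived from the function."""
    signature = inspect.signature(func)
    func.tool_name = func.__name__
    func.tool_description = inspect.getdoc(func) or ""
    func.tool_args = {
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in signature.parameters.items()
    }
    return func

@simple_tool
def multiply(a: int, b: int) -> int:
    """Multiplies two integers together and returns their product."""
    return a * b

print(multiply.tool_name, multiply.tool_args)
print(multiply.tool_description)
```

This is why a clear docstring and type hints matter: they are the only information the model gets about when and how to call the function.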
2.5. Hands-on: Adding Conversational Memory to Your Agent
For an agent to participate in a coherent conversation, it needs memory to remember previous interactions.
Scenario: Build a conversational agent that remembers the user’s name and previous questions, and can still use tools for current information.
conversational_agent_with_memory.py
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun
from langchain.agents import AgentExecutor, create_react_agent
from langchain import hub
from langchain.memory import ConversationBufferMemory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import AIMessage, HumanMessage
# Load environment variables
load_dotenv()
def run_conversational_agent_with_memory():
    print("--- Running Conversational Agent with Memory ---")
    # 1. Initialize the LLM
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # llm = ChatOllama(model="llama3", temperature=0) # For Ollama
    # 2. Define tools (web search for current info)
    search_tool = DuckDuckGoSearchRun(name="duckduckgo_search")
    tools = [search_tool]
    # 3. Get a conversational ReAct agent prompt from LangChain Hub
    # "hwchase17/react-chat" is the ReAct variant that includes a {chat_history} slot.
    # (The "react-chat-json" prompt targets a different agent constructor.)
    prompt = hub.pull("hwchase17/react-chat")
    # 4. Create the agent
    # We use `create_react_agent` with our conversational prompt.
    agent = create_react_agent(llm, tools, prompt)
    # 5. Create the Agent Executor
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True
    )
    # 6. Initialize ConversationBufferMemory
    # This memory stores raw conversational messages.
    # `memory_key="chat_history"` tells the prompt where to find the history.
    # `return_messages=True` ensures messages are returned as actual message objects.
    memory = ConversationBufferMemory(
        memory_key="chat_history",
        return_messages=True
    )
    # 7. Wrap the agent executor with RunnableWithMessageHistory
    # This runnable manages the memory for different chat sessions.
    # It dynamically loads and saves chat history for a given 'session_id'.
    # The factory must return a chat message history object, so we hand over
    # the history stored inside the memory (`memory.chat_memory`).
    conversational_agent = RunnableWithMessageHistory(
        agent_executor,
        lambda session_id: memory.chat_memory, # Provides the message history for a given session ID
        input_messages_key="input", # Key for current user input
        history_messages_key="chat_history" # Key where agent expects history in prompt
    )
    # 8. Engage in a conversation
    session_id = "user_session_abc123" # A unique ID for this conversation session
    print("\n--- Turn 1 ---")
    user_input_1 = "Hi, my name is Alex. What is the tallest building in the world currently?"
    print(f"User: {user_input_1}")
    response_1 = conversational_agent.invoke(
        {"input": user_input_1},
        config={"configurable": {"session_id": session_id}}
    )
    print(f"\nAgent's Final Answer: {response_1['output']}")
    print("\n--- Turn 2 (Testing memory and tool use) ---")
    user_input_2 = "Okay, and what was my name again? Also, what's the capital of Japan?"
    print(f"User: {user_input_2}")
    response_2 = conversational_agent.invoke(
        {"input": user_input_2},
        config={"configurable": {"session_id": session_id}}
    )
    print(f"\nAgent's Final Answer: {response_2['output']}")
    print("\n--- Turn 3 (Memory check, no tool needed) ---")
    user_input_3 = "Thanks! Could you remind me what the tallest building is?"
    print(f"User: {user_input_3}")
    response_3 = conversational_agent.invoke(
        {"input": user_input_3},
        config={"configurable": {"session_id": session_id}}
    )
    print(f"\nAgent's Final Answer: {response_3['output']}")
if __name__ == "__main__":
    run_conversational_agent_with_memory()
To Run:
- Save as conversational_agent_with_memory.py.
- Execute:
python conversational_agent_with_memory.py
Expected Output (showing both memory recall and tool use):
--- Running Conversational Agent with Memory ---
--- Turn 1 ---
User: Hi, my name is Alex. What is the tallest building in the world currently?
> Entering new AgentExecutor chain...
Thought: The user is asking about the tallest building in the world. This is a factual question that requires current information, so I should use a search tool.
Action: duckduckgo_search
Action Input: tallest building in the world currently
Observation: The tallest building in the world is the Burj Khalifa, located in Dubai, United Arab Emirates. It stands at a height of 828 meters (2,717 feet).
Thought: I have found the tallest building in the world. I should now answer the question.
Final Answer: The tallest building in the world currently is the Burj Khalifa, located in Dubai, United Arab Emirates.
Agent's Final Answer: The tallest building in the world currently is the Burj Khalifa, located in Dubai, United Arab Emirates.
--- Turn 2 (Testing memory and tool use) ---
User: Okay, and what was my name again? Also, what's the capital of Japan?
> Entering new AgentExecutor chain...
Thought: The user is asking two questions: one about their name and another about the capital of Japan. I can answer the name question from memory. For the capital of Japan, I can answer directly from my knowledge base.
Final Answer: Your name is Alex. The capital of Japan is Tokyo.
Agent's Final Answer: Your name is Alex. The capital of Japan is Tokyo.
--- Turn 3 (Memory check, no tool needed) ---
User: Thanks! Could you remind me what the tallest building is?
> Entering new AgentExecutor chain...
Thought: The user is asking to be reminded of the tallest building. I have already provided this information in our previous conversation, so I can retrieve it from memory and provide the answer directly without needing to use a tool again.
Final Answer: You previously asked about the tallest building in the world, and I told you it is the Burj Khalifa, located in Dubai, United Arab Emirates.
Agent's Final Answer: You previously asked about the tallest building in the world, and I told you it is the Burj Khalifa, located in Dubai, United Arab Emirates.
Key Takeaway: ConversationBufferMemory combined with RunnableWithMessageHistory allows the agent to maintain context over multiple turns. The agent intelligently uses its memory for personal information and general knowledge, while still using tools for current events. The chat_history is passed into the prompt, enabling the LLM to access the full conversation context.
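The example above returns the same history object for every session_id, which is fine for a single conversation. A multi-user application would typically keep one history per session. The following framework-free sketch shows that idea; get_session_history and chat_turn are hypothetical helpers, and the "agent" is a stub that just reports how many earlier turns it can see.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One message list per session ID, created on first use.
_SESSIONS: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

def get_session_history(session_id: str) -> List[Tuple[str, str]]:
    return _SESSIONS[session_id]

def chat_turn(session_id: str, user_input: str) -> str:
    history = get_session_history(session_id)
    # Stub agent: a real one would place `history` into the prompt.
    reply = f"(seen {len(history) // 2} earlier turns) You said: {user_input}"
    history.append(("user", user_input))
    history.append(("assistant", reply))
    return reply

print(chat_turn("alex", "Hi, my name is Alex."))
print(chat_turn("alex", "What was my name again?"))
print(chat_turn("sam", "Hello from a different session."))
```

Because each session has its own list, the third call starts from an empty history even though the first two calls have already run.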
3. Advanced Agentic Patterns and LangChain Techniques
Moving beyond basic agents, we explore patterns and techniques that make agents more intelligent, robust, and capable of solving complex, multi-step problems.
3.1. Understanding the ReAct Pattern
Theory: The ReAct (Reasoning and Acting) pattern is a powerful paradigm where the LLM interleaves natural language “Thoughts” (reasoning) with “Actions” (tool use). This cycle enables:
- Decomposition: Breaking down complex problems into smaller steps.
- Dynamic Planning: Adapting the plan based on tool outputs.
- Self-Correction: Using observations to fix errors.
- Transparency: Making the agent’s decision-making process visible.
The hwchase17/react prompt used earlier explicitly guides the LLM to follow this Thought-Action-Observation loop.
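Because ReAct output is plain text, the runtime has to parse each model reply to decide whether to call a tool or stop. The sketch below shows one simplified way to do that with a regular expression. It is an assumption-laden illustration, not LangChain's parser, and parse_react_step is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Matches "Action: <tool>" followed on the next line by "Action Input: <input>".
ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.*)", re.DOTALL)

def parse_react_step(text: str) -> Tuple[str, str, Optional[str]]:
    """Return ("final", answer, None) or ("action", tool_name, tool_input)."""
    if "Final Answer:" in text:
        return ("final", text.split("Final Answer:", 1)[1].strip(), None)
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError("No Action or Final Answer found in the model output.")
    return ("action", match.group(1).strip(), match.group(2).strip())

step = "Thought: I should search.\nAction: duckduckgo_search\nAction Input: current capital of Australia"
print(parse_react_step(step))

done = "Thought: I now know the final answer.\nFinal Answer: Canberra."
print(parse_react_step(done))
```

When parsing fails, a runtime can feed the error back to the model as an observation, which is the kind of recovery the handle_parsing_errors option is meant to enable.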
3.2. Hands-on: Customizing Agent Prompts for Specific Behavior
The prompt is the most direct way to control your agent’s persona, behavior, and output format.
Scenario: Create an agent that acts as a “Marketing Idea Generator.” It should always use a web search tool for inspiration and provide its final answer as a creative marketing campaign idea, including a slogan and target audience.
custom_prompt_agent.py
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import ChatPromptTemplate
# Load environment variables
load_dotenv()
def run_custom_prompt_agent():
    print("--- Running Custom Prompt Marketing Agent ---")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7) # Slightly higher temp for creativity
    # llm = ChatOllama(model="llama3", temperature=0.7) # For Ollama
    search_tool = DuckDuckGoSearchRun(name="web_search")
    tools = [search_tool]
    # 1. Define a CUSTOM Prompt Template
    # We combine a custom system message with the standard ReAct scaffolding.
    # `create_react_agent` requires the prompt to expose {tools}, {tool_names},
    # and {agent_scratchpad}; it raises a ValueError if any of them is missing.
    # Note the trailing spaces: adjacent string literals are joined as-is.
    custom_system_message = (
        "You are an expert Marketing Idea Generator. Your goal is to brainstorm innovative marketing campaigns. "
        "Always use the 'web_search' tool to gather inspiration or current trends related to the product/service. "
        "When providing a final answer, it MUST be formatted as a creative marketing campaign idea, including:\n"
        "1. Campaign Name\n2. Slogan\n3. Target Audience\n4. Key Concept (at least 3 sentences)\n"
        "Be creative and concise.\n\n"
        "You have access to the following tools:\n{tools}\n\n"
        "Use the following format:\n"
        "Thought: reason about what to do next\n"
        "Action: the action to take, one of [{tool_names}]\n"
        "Action Input: the input to the action\n"
        "Observation: the result of the action\n"
        "... (Thought/Action/Action Input/Observation can repeat)\n"
        "Thought: I now know the final answer\n"
        "Final Answer: the campaign idea in the required format"
    )
    # The ReAct agent renders its scratchpad (earlier Thought/Action/Observation
    # steps) as plain text, so it is a string variable in the user message.
    prompt = ChatPromptTemplate.from_messages([
        ("system", custom_system_message),
        ("user", "{input}\n\n{agent_scratchpad}"), # The scratchpad is where the ReAct loop unfolds
    ])
    # 2. Create the agent with the custom prompt
    agent = create_react_agent(llm, tools, prompt)
    # 3. Create the Agent Executor
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True
    )
    # 4. Invoke the agent
    print("\n--- Generating Marketing Idea ---")
    product_idea = "A new line of eco-friendly, compostable phone cases."
    print(f"User: Generate a marketing campaign for: {product_idea}")
    response = agent_executor.invoke({"input": f"Generate a marketing campaign for: {product_idea}"})
    print(f"\nAgent's Final Answer:\n{response['output']}")
if __name__ == "__main__":
    run_custom_prompt_agent()
To Run:
- Save as custom_prompt_agent.py.
- Execute:
python custom_prompt_agent.py
Expected Output (notice the required output format):
--- Running Custom Prompt Marketing Agent ---
--- Generating Marketing Idea ---
User: Generate a marketing campaign for: A new line of eco-friendly, compostable phone cases.
> Entering new AgentExecutor chain...
Thought: The user wants a marketing campaign for eco-friendly, compostable phone cases. I need to gather inspiration and current trends related to eco-friendly products and phone cases using the web_search tool.
Action: web_search
Action Input: marketing trends eco-friendly compostable phone cases
Observation: ... (search results about sustainable tech, eco-friendly branding, etc.) ...
Thought: I have gathered some information on marketing trends for eco-friendly products. Now I need to brainstorm a creative campaign name, slogan, target audience, and key concept based on this and my knowledge.
Final Answer:
Campaign Name: EarthGuard Cases
Slogan: Protect Your Phone, Preserve Our Planet.
Target Audience: Environmentally-conscious consumers aged 18-45, tech enthusiasts looking for sustainable alternatives, individuals who value product lifecycle and brand ethics.
Key Concept: The campaign will emphasize the dual protection offered: superior safeguarding for their valuable smartphone and a tangible contribution to environmental preservation. We will highlight the innovative material science behind the compostable design, showing how the product seamlessly integrates into a sustainable lifestyle without compromising on quality or style. This campaign will leverage digital channels, influencer partnerships focusing on sustainability, and visually compelling content demonstrating the product's full lifecycle from use to compost.
> Finished chain.
Agent's Final Answer:
Campaign Name: EarthGuard Cases
Slogan: Protect Your Phone, Preserve Our Planet.
Target Audience: Environmentally-conscious consumers aged 18-45, tech enthusiasts looking for sustainable alternatives, individuals who value product lifecycle and brand ethics.
Key Concept: The campaign will emphasize the dual protection offered: superior safeguarding for their valuable smartphone and a tangible contribution to environmental preservation. We will highlight the innovative material science behind the compostable design, showing how the product seamlessly integrates into a sustainable lifestyle without compromising on quality or style. This campaign will leverage digital channels, influencer partnerships focusing on sustainability, and visually compelling content demonstrating the product's full lifecycle from use to compost.
Key Takeaway: By crafting a detailed system message, we effectively “programmed” the agent’s persona, steered it toward a specific tool, and specified the format of its final output. Prompt engineering is the most direct lever you have over an agent’s behavior.
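Since a prompt steers the model rather than guaranteeing a format, it can be useful to check the final answer before using it downstream. The sketch below looks for the four labels requested in the system message; missing_fields is a hypothetical helper, and the simple "label followed by a colon" check is an assumption about how the model formats its answer.

```python
from typing import List

REQUIRED_FIELDS = ["Campaign Name", "Slogan", "Target Audience", "Key Concept"]

def missing_fields(answer: str) -> List[str]:
    """Return the required labels that do not appear as 'Label:' in the answer."""
    lowered = answer.lower()
    return [field for field in REQUIRED_FIELDS if f"{field.lower()}:" not in lowered]

sample = (
    "Campaign Name: EarthGuard Cases\n"
    "Slogan: Protect Your Phone, Preserve Our Planet.\n"
    "Target Audience: Environmentally-conscious consumers.\n"
)
print(missing_fields(sample))  # ['Key Concept']
```

A non-empty result could trigger a retry with a reminder of the required format.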
3.3. Hands-on: Robust Custom Tools with Pydantic Inputs and Async Execution
For more complex tools, especially those interacting with APIs, structured input (using Pydantic) and asynchronous execution are critical for robustness and performance.
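To illustrate the asynchronous half of that claim, the sketch below runs two simulated network calls concurrently with asyncio.gather and records the order of events. The function fake_api_call and the delays are hypothetical; the point is only that the second call starts while the first is still waiting.

```python
import asyncio
from typing import List

async def fake_api_call(name: str, delay: float, log: List[str]) -> str:
    log.append(f"start {name}")
    await asyncio.sleep(delay)  # yields control while "waiting on the network"
    log.append(f"end {name}")
    return f"{name} done"

async def main() -> List[str]:
    log: List[str] = []
    await asyncio.gather(
        fake_api_call("email", 0.05, log),
        fake_api_call("search", 0.01, log),
    )
    return log

events = asyncio.run(main())
print(events)  # ['start email', 'start search', 'end search', 'end email']
```

With blocking calls the two waits would add up; with async tools the total time is roughly that of the slowest call.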
Scenario: Create an agent that can send a “simulated” email using a tool that requires structured input (recipient, subject, body) and is implemented asynchronously to mimic a network call.
advanced_custom_tools_agent.py
import asyncio

from dotenv import load_dotenv
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.messages import SystemMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from pydantic import BaseModel, Field  # For structured tool inputs

# Load environment variables
load_dotenv()


# 1. Define a Pydantic model for structured tool input
class SendEmailInput(BaseModel):
    recipient_email: str = Field(description="The email address of the recipient.")
    subject: str = Field(description="The subject line of the email.")
    body_content: str = Field(description="The main content of the email.")


# 2. Define an ASYNCHRONOUS tool with Pydantic input
@tool("send_email_async", args_schema=SendEmailInput)
async def send_email_tool_async(
    recipient_email: str,
    subject: str,
    body_content: str,
) -> str:
    """
    Sends a simulated email to a specified recipient with a given subject and body.
    This tool is asynchronous to simulate network latency.
    """
    print("\n--- Async Tool Call: send_email_tool_async ---")
    print(f" Attempting to send email to: {recipient_email}")
    print(f" Subject: {subject}")
    print(f" Body (first 50 chars): {body_content[:50]}...")
    await asyncio.sleep(2)  # Simulate network delay
    if recipient_email.endswith("@example.com"):
        return f"Simulated email sent successfully to {recipient_email} with subject '{subject}'."
    # Simulate a validation error
    return f"Error: Invalid recipient email domain for {recipient_email}. Only @example.com allowed for simulation."


def run_advanced_custom_tools_agent():
    print("--- Running Agent with Advanced Custom Tools ---")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # llm = ChatOllama(model="llama3", temperature=0)  # For Ollama (needs a model and integration with tool-calling support)
    tools = [send_email_tool_async]

    # Custom prompt to guide the agent to use the email tool and extract parameters.
    # Each sentence ends with a space so the concatenated string stays readable.
    custom_system_message = SystemMessage(
        content=(
            "You are an email automation assistant. Your primary task is to send emails using the 'send_email_async' tool. "
            "Carefully extract the recipient's email, subject, and body from the user's request. "
            "If any information is missing, ask for clarification. "
            "If the email sending fails, report the error to the user. "
            "Always respond with a concise confirmation or error message."
        )
    )
    prompt = ChatPromptTemplate.from_messages([
        custom_system_message,
        ("user", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ])

    # A tool-calling agent is used here because the tool takes several structured
    # arguments and the prompt passes agent_scratchpad as a list of messages.
    # create_react_agent expects a text prompt containing {tools} and {tool_names}
    # and hands each tool a single string input.
    agent = create_tool_calling_agent(llm, tools, prompt)

    # Note: For async agents, you typically use `ainvoke`
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True,
    )

    async def interact_with_agent():
        print("\n--- Interaction 1: Valid Email Request ---")
        user_input_1 = "Please send an email to alice@example.com. The subject should be 'Project Status Update'. The body content is 'Hi Alice, the project is on track for Q4. Best, Bob.'"
        print(f"User: {user_input_1}")
        response_1 = await agent_executor.ainvoke({"input": user_input_1})
        print(f"\nAgent's Final Answer: {response_1['output']}")

        print("\n--- Interaction 2: Invalid Email Domain ---")
        user_input_2 = "Can you send a quick note to charlie@bad-domain.net with subject 'Quick Chat' and body 'When are you free?'"
        print(f"User: {user_input_2}")
        response_2 = await agent_executor.ainvoke({"input": user_input_2})
        print(f"\nAgent's Final Answer: {response_2['output']}")

        print("\n--- Interaction 3: Missing Information ---")
        user_input_3 = "Send an email to sarah@example.com about a meeting."
        print(f"User: {user_input_3}")
        response_3 = await agent_executor.ainvoke({"input": user_input_3})
        print(f"\nAgent's Final Answer: {response_3['output']}")

    # Run the asynchronous interactions
    asyncio.run(interact_with_agent())


if __name__ == "__main__":
    run_advanced_custom_tools_agent()
To Run:
- Save as advanced_custom_tools_agent.py.
- Execute: python advanced_custom_tools_agent.py
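Expected Output (illustrative; demonstrates structured parsing, async delay, and error handling. The exact trace format depends on the agent type and LangChain version, and the model's wording will vary):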
--- Running Agent with Advanced Custom Tools ---
--- Interaction 1: Valid Email Request ---
User: Please send an email to alice@example.com. The subject should be 'Project Status Update'. The body content is 'Hi Alice, the project is on track for Q4. Best, Bob.'
> Entering new AgentExecutor chain...
Thought: The user wants to send an email. I have a `send_email_async` tool that can do this. I need to extract the `recipient_email`, `subject`, and `body_content` from the user's request.
Action: send_email_async
Action Input: {"recipient_email": "alice@example.com", "subject": "Project Status Update", "body_content": "Hi Alice, the project is on track for Q4. Best, Bob."}
--- Async Tool Call: send_email_tool_async ---
Attempting to send email to: alice@example.com
Subject: Project Status Update
Body (first 50 chars): Hi Alice, the project is on track for Q4. Best, Bob.
Observation: Simulated email sent successfully to alice@example.com with subject 'Project Status Update'.
Thought: I have successfully sent the email.
Final Answer: Email sent to alice@example.com successfully.
Agent's Final Answer: Email sent to alice@example.com successfully.
--- Interaction 2: Invalid Email Domain ---
User: Can you send a quick note to charlie@bad-domain.net with subject 'Quick Chat' and body 'When are you free?'
> Entering new AgentExecutor chain...
Thought: The user wants to send an email. I have the `send_email_async` tool for this. I need to extract the recipient email, subject, and body content from the request.
Action: send_email_async
Action Input: {"recipient_email": "charlie@bad-domain.net", "subject": "Quick Chat", "body_content": "When are you free?"}
--- Async Tool Call: send_email_tool_async ---
Attempting to send email to: charlie@bad-domain.net
Subject: Quick Chat
Body (first 50 chars): When are you free?
Observation: Error: Invalid recipient email domain for charlie@bad-domain.net. Only @example.com allowed for simulation.
Thought: The `send_email_async` tool reported an error: "Invalid recipient email domain". I need to inform the user about this error.
Final Answer: I couldn't send the email because of an invalid recipient email domain for charlie@bad-domain.net. Only @example.com is allowed for this simulated email service.
Agent's Final Answer: I couldn't send the email because of an invalid recipient email domain for charlie@bad-domain.net. Only @example.com is allowed for this simulated email service.
--- Interaction 3: Missing Information ---
User: Send an email to sarah@example.com about a meeting.
> Entering new AgentExecutor chain...
Thought: The user wants to send an email to sarah@example.com about a meeting. I have the recipient email and a general topic for the subject, but the body content is missing. I should ask the user for the body content before attempting to send the email.
Final Answer: I can send an email to sarah@example.com with the subject "Meeting". What would you like the body of the email to say?
Agent's Final Answer: I can send an email to sarah@example.com with the subject "Meeting". What would you like the body of the email to say?
Key Takeaway:
- Pydantic (args_schema): Allows you to define structured, type-hinted inputs for your tools. The LLM then becomes responsible for parsing natural language into this structure.
- Asynchronous Tools (async def, await, ainvoke): Crucial for tools that involve I/O-bound operations (network requests, disk access). This prevents your agent from blocking and allows for more responsive applications.
- Tool-Level Error Handling: Tools should return clear error messages. The agent’s LLM can then interpret these observations and respond appropriately to the user (e.g., in Interaction 2).
- Prompt for Clarification: The agent can be prompted to ask for missing information, making it more robust (Interaction 3).
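The benefit of async tools is easiest to see when several I/O-bound calls are in flight at once. The following is a minimal sketch using only the standard library; `fake_send_email` is a hypothetical stand-in for the email tool (with a 0.2 s delay instead of 2 s), and the code illustrates the general concurrency pattern rather than anything LangChain does internally.

```python
import asyncio
import time


async def fake_send_email(recipient: str) -> str:
    """Hypothetical stand-in for an I/O-bound tool call."""
    await asyncio.sleep(0.2)  # simulated network latency
    return f"sent to {recipient}"


async def send_all(recipients: list[str]) -> tuple[list[str], float]:
    """Run all sends concurrently and report the wall-clock time."""
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_send_email(r) for r in recipients))
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(
    send_all(["alice@example.com", "bob@example.com", "carol@example.com"])
)
print(results)
print(f"elapsed: {elapsed:.2f}s")  # close to one delay, not three
```

Because the three coroutines wait at the same time, the total is roughly one delay. With a blocking `time.sleep` in a synchronous tool, the delays would add up instead.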
3.4. Hands-on: Implementing Self-Correction and Error Handling
Even the best LLMs can sometimes make mistakes in tool calls (e.g., misformatting arguments). LangChain’s AgentExecutor can help with basic self-correction.
Scenario: Demonstrate how handle_parsing_errors=True helps an agent recover from an LLM-induced parsing error.
self_correction_agent.py
from dotenv import load_dotenv
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.messages import SystemMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama

# Load environment variables
load_dotenv()


@tool
def process_data_entry(entry_id: str, value: int) -> str:
    """Processes a data entry with a specific ID and an integer value.
    Returns success message or error if value is negative."""
    print(f"\n--- Tool Call: process_data_entry({entry_id}, {value}) ---")
    if value < 0:
        return f"Error: Value for entry '{entry_id}' cannot be negative."
    return f"Successfully processed entry '{entry_id}' with value {value}."


# Return schema-validation failures (e.g., value="ten") to the agent as an
# observation instead of raising, so the LLM gets a chance to correct its call.
process_data_entry.handle_validation_error = True


def run_self_correction_agent():
    print("--- Running Self-Correction Agent ---")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # llm = ChatOllama(model="llama3", temperature=0)  # For Ollama (needs a model and integration with tool-calling support)
    tools = [process_data_entry]

    # Each sentence ends with a space so the concatenated string stays readable.
    custom_system_message = SystemMessage(
        content=(
            "You are a data processing agent. Use the 'process_data_entry' tool to process user requests. "
            "The 'value' parameter for the tool MUST be an integer. "
            "If you make a mistake in calling the tool, analyze the error and try to correct it. "
            "Always report the outcome to the user."
        )
    )
    prompt = ChatPromptTemplate.from_messages([
        custom_system_message,
        ("user", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ])

    # The prompt passes agent_scratchpad as messages and the tool takes two typed
    # arguments, so a tool-calling agent is used rather than create_react_agent.
    agent = create_tool_calling_agent(llm, tools, prompt)

    # handle_parsing_errors=True feeds unparseable LLM output back to the LLM
    # instead of aborting the run.
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True,
    )

    print("\n--- Interaction 1: Deliberate LLM Parsing Error (value as string) ---")
    user_input_1 = "Process data entry 'X123' with value 'ten'."  # 'ten' is not an integer
    print(f"User: {user_input_1}")
    response_1 = agent_executor.invoke({"input": user_input_1})
    print(f"\nAgent's Final Answer: {response_1['output']}")

    print("\n--- Interaction 2: Valid Call ---")
    user_input_2 = "Process data entry 'Y456' with value 100."
    print(f"User: {user_input_2}")
    response_2 = agent_executor.invoke({"input": user_input_2})
    print(f"\nAgent's Final Answer: {response_2['output']}")


if __name__ == "__main__":
    run_self_correction_agent()
To Run:
- Save as self_correction_agent.py.
- Execute: python self_correction_agent.py
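Expected Output (illustrative; demonstrates an invalid tool call and recovery. The exact trace and error text depend on the agent type and LangChain version, and the model may convert 'ten' to 10 on its first attempt):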
--- Running Self-Correction Agent ---
--- Interaction 1: Deliberate LLM Parsing Error (value as string) ---
User: Process data entry 'X123' with value 'ten'.
> Entering new AgentExecutor chain...
Thought: The user wants to process a data entry with an ID and a value. I should use the `process_data_entry` tool. The value needs to be an integer.
Action: process_data_entry
Action Input: {"entry_id": "X123", "value": "ten"} # LLM makes a mistake here, passing 'ten' as string
Observation: Invalid Tool Input: got `ValidationError(model='process_data_entry', errors=[{'type': 'int_parsing', 'loc': ('value',), 'msg': 'Input should be a valid integer, unable to parse string as an integer', 'input': 'ten'}])`
Thought: The tool call failed because the `value` was not a valid integer. I need to correct this and provide a valid integer for the `value` parameter. The user specified "ten", which means the integer 10.
Action: process_data_entry
Action Input: {"entry_id": "X123", "value": 10} # Agent corrects the input to an integer
--- Tool Call: process_data_entry(X123, 10) ---
Observation: Successfully processed entry 'X123' with value 10.
Thought: I have successfully corrected the input and processed the data entry.
Final Answer: Successfully processed data entry 'X123' with value 10.
Agent's Final Answer: Successfully processed data entry 'X123' with value 10.
--- Interaction 2: Valid Call ---
User: Process data entry 'Y456' with value 100.
> Entering new AgentExecutor chain...
Thought: The user wants to process a data entry 'Y456' with a value of 100. I should use the `process_data_entry` tool. The value is already an integer.
Action: process_data_entry
Action Input: {"entry_id": "Y456", "value": 100}
--- Tool Call: process_data_entry(Y456, 100) ---
Observation: Successfully processed entry 'Y456' with value 100.
Thought: I have successfully processed the data entry.
Final Answer: Successfully processed data entry 'Y456' with value 100.
Agent's Final Answer: Successfully processed data entry 'Y456' with value 100.
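Key Takeaway: Two mechanisms contribute to self-correction. handle_parsing_errors=True on the AgentExecutor covers LLM output that the executor cannot parse into an action: the parsing error is fed back to the LLM as an “Observation” instead of ending the run. A schema violation in a tool call (e.g., providing a string where an integer is expected) is a tool-input validation error rather than a parsing error; enabling handle_validation_error on the tool returns that error to the LLM as an observation in the same way. In both cases the LLM can see its mistake and generate a corrected tool call in the next iteration, which makes the agent more robust.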
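The feedback loop behind this kind of recovery can be sketched without any framework. In the sketch below, `scripted_llm` is a hypothetical stand-in for the model (its first attempt is deliberately wrong), and the loop is a simplified illustration of the idea, not the AgentExecutor's actual implementation.

```python
def process_data_entry(entry_id: str, value: int) -> str:
    """Toy tool: rejects non-integer or negative values."""
    if not isinstance(value, int):
        raise TypeError(f"'value' must be an integer, got {value!r}")
    if value < 0:
        return f"Error: Value for entry '{entry_id}' cannot be negative."
    return f"Successfully processed entry '{entry_id}' with value {value}."


def scripted_llm(observations: list[str]) -> dict:
    """Hypothetical stand-in for the LLM: wrong at first, corrected
    once it has seen an error observation."""
    if not observations:
        return {"entry_id": "X123", "value": "ten"}
    return {"entry_id": "X123", "value": 10}


def run_with_feedback(max_iterations: int = 3) -> tuple[str, list[str]]:
    observations: list[str] = []
    for _ in range(max_iterations):
        args = scripted_llm(observations)
        try:
            return process_data_entry(**args), observations
        except TypeError as exc:
            # Instead of crashing, record the error as an observation
            # so the next "LLM" call can react to it.
            observations.append(f"Invalid tool input: {exc}")
    raise RuntimeError("Gave up after repeated invalid tool calls")


result, seen_errors = run_with_feedback()
print(seen_errors)
print(result)
```

The iteration cap matters: without it, a model that keeps repeating the same mistake would loop forever.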
4. Stateful Orchestration with LangGraph: Beyond Linear Chains
LangChain’s AgentExecutor is great for agents following a generally linear or simple iterative loop. However, many real-world applications require more complex decision-making, conditional branching, explicit loops, and multi-agent coordination. This is where LangGraph becomes essential.
4.1. Why LangGraph? (Limitations of Linear Agents)
Traditional LangChain chains or even AgentExecutor (which is itself a type of chain) struggle with:
- Complex Conditional Logic: “If X happens, do A; if Y happens, do B; otherwise, do C.”
- Explicit Cycles/Loops: Revisit a previous step based on a condition (e.g., “retry this step 3 times if it fails”).
- Parallel Execution: Performing multiple tasks simultaneously and combining their results.
- Human-in-the-Loop: Pausing the automation for human review and then resuming based on feedback.
- Multi-Agent Coordination: Orchestrating complex interactions between different specialized agents.
LangGraph solves these by treating your application as a stateful, cyclic graph.
4.2. Core Concepts: Graph State, Nodes, and Edges
- Graph State: The central concept. This is a mutable object (often a TypedDict) that holds all the relevant information for your application. Each “node” in the graph receives the current state, performs its computation, and returns updates to the state. LangGraph automatically merges these updates.
- Nodes: These are the computational units of your graph. Nodes take the current state and return state updates. A node can be:
  - An LLM call.
  - A tool invocation.
  - A custom Python function (which can contain any logic).
  - Another pre-built LangChain Runnable or Agent.
- Edges: Define the transitions between nodes.
  - Normal Edges: Unconditional transitions (Node A -> Node B).
  - Conditional Edges: Dynamic transitions based on the output of a node. A special function determines the next node based on the current state.
- Entry Point & End Point: Define where the graph starts and where it can terminate.
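These concepts can be made concrete with a toy graph runner in plain Python. This is only a sketch for intuition: the `merge` reducer, the `run` loop, and the `draft`/`review` nodes are hypothetical, and LangGraph's real execution engine is considerably more capable.

```python
from typing import Callable

END = "__end__"  # sentinel, named after LangGraph's END for readability

State = dict
Node = Callable[[State], dict]


def merge(state: State, update: dict) -> State:
    """Toy reducer: lists are appended, other keys are overwritten."""
    merged = dict(state)
    for key, value in update.items():
        if isinstance(value, list):
            merged[key] = merged.get(key, []) + value
        else:
            merged[key] = value
    return merged


def draft(state: State) -> dict:
    attempt = state.get("attempts", 0) + 1
    return {"attempts": attempt, "log": [f"draft #{attempt}"]}


def review(state: State) -> dict:
    return {"log": [f"review of draft #{state['attempts']}"]}


def route_after_review(state: State) -> str:
    # Conditional edge: loop back until the third draft, then stop.
    return END if state["attempts"] >= 3 else "draft"


nodes: dict[str, Node] = {"draft": draft, "review": review}
edges: dict[str, Callable[[State], str]] = {
    "draft": lambda state: "review",  # normal edge
    "review": route_after_review,     # conditional edge
}


def run(entry_point: str, state: State) -> State:
    current = entry_point
    while current != END:
        state = merge(state, nodes[current](state))  # node returns an update
        current = edges[current](state)              # edge picks the next node
    return state


final_state = run("draft", {"log": []})
print(final_state["attempts"])
print(final_state["log"])
```

Nodes never mutate the state directly; they return partial updates, and the reducer decides how each key is merged. The cycle between `draft` and `review` is exactly the kind of loop a linear chain cannot express.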
4.3. Hands-on: Building a Conditional Search Agent with LangGraph
Let’s build a LangGraph agent that can decide whether a user’s question requires a web search, or if it can answer directly. If it needs a search, it uses the tool, then processes the result. Otherwise, it answers.
Scenario: An agent that intelligently uses a search tool only when necessary, demonstrating conditional logic.
langgraph_conditional_agent.py
from typing import Annotated, List, TypedDict

from dotenv import load_dotenv
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.messages import BaseMessage, HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langgraph.graph import StateGraph, END

# Load environment variables
load_dotenv()


# --- 1. Define the Graph State ---
# This TypedDict defines the structure of the data that flows through our graph.
# `Annotated` with a lambda function provides a way to define how lists (like chat_history) are merged.
class AgentState(TypedDict):
    chat_history: Annotated[List[BaseMessage], lambda x, y: x + y]  # Accumulates messages
    # user_input: str  # Not strictly needed if input is always in chat_history


# --- 2. Initialize LLM and Tools ---
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
# llm = ChatOllama(model="llama3", temperature=0)  # For Ollama (needs a model and integration with tool-calling support)
search_tool = DuckDuckGoSearchRun(name="web_search")
tools = [search_tool]
# bind_tools converts the tools to the provider's tool-calling schema and
# populates AIMessage.tool_calls, which the nodes and routing below rely on.
llm_with_tools = llm.bind_tools(tools)


# --- 3. Define the Nodes for the Graph ---
def call_llm(state: AgentState):
    """Node that calls the LLM for decision-making or final answer."""
    messages = state["chat_history"]
    print("\n--- LLM Node: Input Messages ---")
    for msg in messages:
        print(f" {type(msg).__name__}: {msg.content}")
    response = llm_with_tools.invoke(messages)
    return {"chat_history": [response]}  # Add LLM's response to history


def call_tool(state: AgentState):
    """Node that executes the tool calls requested by the LLM."""
    last_message = state["chat_history"][-1]
    print("\n--- Tool Node: Last LLM Message ---")
    print(f" {type(last_message).__name__}: {last_message.content}")
    if not last_message.tool_calls:
        raise ValueError("LLM did not specify a tool to call.")
    # Answer every requested call: chat models generally expect one
    # ToolMessage per tool call id before the next LLM turn.
    tool_messages = []
    for tool_call in last_message.tool_calls:
        tool_name = tool_call["name"]
        tool_args = tool_call["args"]
        if tool_name != search_tool.name:
            raise ValueError(f"Unknown tool: {tool_name}")
        print(f" Executing tool: {tool_name} with args: {tool_args}")
        tool_result = search_tool.invoke(tool_args["query"])
        # Add the tool's output back to the chat history as a ToolMessage
        tool_messages.append(ToolMessage(content=tool_result, tool_call_id=tool_call["id"]))
    return {"chat_history": tool_messages}


# --- 4. Define the Conditional Edge Logic ---
def should_continue(state: AgentState) -> str:
    """
    Decides whether to continue to a tool call or end the graph.
    Based on if the last LLM message has 'tool_calls'.
    """
    last_message = state["chat_history"][-1]
    if last_message.tool_calls:
        print("\n--- Conditional Edge: LLM suggested tool call. Routing to TOOL node. ---")
        return "continue_tool"
    print("\n--- Conditional Edge: LLM provided final answer. Routing to END. ---")
    return "end_response"


def run_langgraph_conditional_agent():
    print("--- Running LangGraph Conditional Agent ---")

    # --- 5. Build the LangGraph Workflow ---
    workflow = StateGraph(AgentState)

    # Add nodes to the workflow
    workflow.add_node("llm", call_llm)    # Node for LLM interaction
    workflow.add_node("tool", call_tool)  # Node for tool execution

    # Set the entry point for the graph
    workflow.set_entry_point("llm")

    # Add conditional edges from the 'llm' node
    # The 'should_continue' function determines the next step.
    workflow.add_conditional_edges(
        "llm",            # Source node
        should_continue,  # Function to determine which branch to take
        {
            "continue_tool": "tool",  # If LLM suggests a tool, go to 'tool' node
            "end_response": END,      # If LLM gives final answer, end the graph
        },
    )

    # After the tool is called, always go back to the LLM
    # The LLM then processes the tool's output and decides the next step (e.g., final answer or another tool)
    workflow.add_edge("tool", "llm")

    # Compile the graph
    app = workflow.compile()

    # --- 6. Invoke the Graph ---
    print("\n\n=== Interaction 1: Requires Web Search ===")
    user_query_1 = "What is the capital of Canada?"
    print(f"User: {user_query_1}")
    inputs_1 = {"chat_history": [HumanMessage(content=user_query_1)]}
    for s in app.stream(inputs_1):
        if "__end__" not in s:
            print(s)
    print("\nFinal Result of Interaction 1:")
    # Note: invoke() runs the graph again from scratch (a second round of LLM and tool calls).
    print(app.invoke(inputs_1)["chat_history"][-1].content)

    print("\n\n=== Interaction 2: Direct LLM Response ===")
    user_query_2 = "Tell me a fun fact about giraffes."
    print(f"User: {user_query_2}")
    inputs_2 = {"chat_history": [HumanMessage(content=user_query_2)]}
    for s in app.stream(inputs_2):
        if "__end__" not in s:
            print(s)
    print("\nFinal Result of Interaction 2:")
    print(app.invoke(inputs_2)["chat_history"][-1].content)


if __name__ == "__main__":
    run_langgraph_conditional_agent()
To Run:
- Save as langgraph_conditional_agent.py.
- Execute: python langgraph_conditional_agent.py
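Expected Output (illustrative; highlights nodes and conditional routing. The exact message and chunk representation depends on the LangChain and LangGraph versions, and the model may answer simple questions without searching):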
--- Running LangGraph Conditional Agent ---
=== Interaction 1: Requires Web Search ===
User: What is the capital of Canada?
--- LLM Node: Input Messages ---
HumanMessage: What is the capital of Canada?
{'llm': AIMessage(content='', additional_kwargs={'function_call': {'name': 'web_search', 'arguments': '{"query":"capital of Canada"}'}})}
--- Conditional Edge: LLM suggested tool call. Routing to TOOL node. ---
{'tool': ToolMessage(content='The capital of Canada is Ottawa.', tool_call_id='call_j2r3k4l5')}
--- LLM Node: Input Messages ---
HumanMessage: What is the capital of Canada?
AIMessage:
ToolMessage: The capital of Canada is Ottawa.
{'llm': AIMessage(content='The capital of Canada is Ottawa.')}
--- Conditional Edge: LLM provided final answer. Routing to END. ---
Final Result of Interaction 1:
The capital of Canada is Ottawa.
=== Interaction 2: Direct LLM Response ===
User: Tell me a fun fact about giraffes.
--- LLM Node: Input Messages ---
HumanMessage: Tell me a fun fact about giraffes.
{'llm': AIMessage(content='Giraffes only need 5 to 30 minutes of sleep in a 24-hour period! They often achieve this in short bursts of 1 to 2 minutes at a time.')}
--- Conditional Edge: LLM provided final answer. Routing to END. ---
Final Result of Interaction 2:
Giraffes only need 5 to 30 minutes of sleep in a 24-hour period! They often achieve this in short bursts of 1 to 2 minutes at a time.
Key Takeaway:
- StateGraph and AgentState: The core structure, defining how data flows and is updated.
- Nodes (call_llm, call_tool): Encapsulate distinct processing steps.
- Tool binding: The LLM must be made aware of the tools before it can request them. llm.bind_tools (the successor to the older llm.bind_functions) does this and populates tool_calls on the response, which the routing logic reads.
- should_continue: The conditional edge function demonstrates dynamic routing based on the LLM’s output (tool_calls in this case).
- Cycles: The workflow.add_edge("tool", "llm") call creates a loop, allowing the LLM to process tool results and decide the next step (potentially another tool or a final answer).
- app.stream(): Useful for observing the graph’s execution step by step.
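Because a conditional edge function is plain Python that only inspects the state, its decision rule can be checked without calling an LLM. The sketch below assumes a hypothetical `FakeAIMessage` class that exposes only the `content` and `tool_calls` attributes the router reads; it mirrors the rule in should_continue rather than importing the real graph.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAIMessage:
    """Hypothetical stand-in exposing only the attributes the router reads."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


def should_continue(state: dict) -> str:
    """Same decision rule as the graph's conditional edge, minus the prints."""
    last_message = state["chat_history"][-1]
    return "continue_tool" if last_message.tool_calls else "end_response"


wants_tool = {"chat_history": [FakeAIMessage(
    tool_calls=[{"name": "web_search", "args": {"query": "capital of Canada"}, "id": "call_1"}]
)]}
final_answer = {"chat_history": [FakeAIMessage(content="Ottawa")]}

print(should_continue(wants_tool))    # routes to the tool node
print(should_continue(final_answer))  # routes to END
```

Testing routers in isolation like this catches wiring mistakes (for example, a key that is missing from the edge mapping) before any tokens are spent.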
4.4. Hands-on: Implementing Human-in-the-Loop with LangGraph
LangGraph’s explicit graph structure makes it ideal for integrating human intervention at specific points in a workflow.
Scenario: An agent that proposes a task, but requires human approval before executing a potentially impactful tool (e.g., updating a database). If approved, it proceeds; otherwise, it asks for revised instructions.
langgraph_human_in_loop.py
from typing import Annotated, List, TypedDict

from dotenv import load_dotenv
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.messages import BaseMessage, HumanMessage, ToolMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langgraph.graph import StateGraph, END

# Load environment variables
load_dotenv()


class HumanInLoopState(TypedDict):
    chat_history: Annotated[List[BaseMessage], lambda x, y: x + y]
    proposed_action: str  # Stores the action the LLM wants to take
    human_approved: Annotated[bool | None, lambda x, y: y]  # Flag for human approval, always take latest


# Initialize LLM and tools
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
# llm = ChatOllama(model="llama3", temperature=0)  # For Ollama (needs a model and integration with tool-calling support)


# Mock a "critical" tool
@tool
def update_customer_record(customer_id: str, new_status: str) -> str:
    """Updates the status of a customer record in the CRM."""
    print(f"\n[CRITICAL TOOL EXECUTION]: Updating customer {customer_id} to status: {new_status}")
    return f"Customer {customer_id} status updated to '{new_status}'."


tools = [update_customer_record, DuckDuckGoSearchRun(name="web_search")]
tools_by_name = {t.name: t for t in tools}
# bind_tools populates AIMessage.tool_calls, which the nodes and routers read.
llm_with_tools = llm.bind_tools(tools)


# --- Nodes ---
def call_llm_for_decision(state: HumanInLoopState):
    messages = state["chat_history"]
    response = llm_with_tools.invoke(messages)
    return {"chat_history": [response]}


def execute_tool(state: HumanInLoopState):
    """Runs the requested tool (approved critical tool or non-critical tool).
    This sketch assumes a single tool call per LLM turn."""
    tool_call = state["chat_history"][-1].tool_calls[0]
    selected_tool = tools_by_name.get(tool_call["name"])
    if selected_tool is None:
        result = f"Error: Unknown tool '{tool_call['name']}'."
    else:
        result = selected_tool.invoke(tool_call["args"])
    return {
        "chat_history": [ToolMessage(content=str(result), tool_call_id=tool_call["id"])],
        "human_approved": None,  # A later critical call needs a fresh decision
    }


def propose_action_for_human(state: HumanInLoopState):
    """LLM's last message is assumed to contain the proposed action (tool call)."""
    last_llm_message = state["chat_history"][-1]
    if last_llm_message.tool_calls:
        proposed_action_str = f"Tool: {last_llm_message.tool_calls[0]['name']}\nArgs: {last_llm_message.tool_calls[0]['args']}"
        print("\n--- HUMAN INTERVENTION REQUIRED ---")
        print(f"Agent proposes: {proposed_action_str}")
        print("Please approve or reject this action.")
        return {"proposed_action": proposed_action_str}
    return {"proposed_action": "No specific tool action proposed."}  # Should not happen if routed correctly


def record_rejection(state: HumanInLoopState):
    """Answers the rejected tool call so the LLM can see the rejection and respond to it."""
    tool_call = state["chat_history"][-1].tool_calls[0]
    rejection = ToolMessage(
        content="Action rejected by the human reviewer. Do not retry it; ask the user for revised instructions.",
        tool_call_id=tool_call["id"],
    )
    return {"chat_history": [rejection], "human_approved": None}


# --- Conditional Edge Logic ---
def route_decision_or_human_review(state: HumanInLoopState) -> str:
    last_llm_message = state["chat_history"][-1]
    if last_llm_message.tool_calls:
        # If the LLM wants to call the 'update_customer_record' tool, go to human review
        if last_llm_message.tool_calls[0]["name"] == update_customer_record.name:
            print("\n--- Routing: Critical tool detected. -> HUMAN_REVIEW node ---")
            return "human_review"
        # Other tools can be executed directly (e.g., web_search)
        print("\n--- Routing: Non-critical tool detected. -> TOOL_EXECUTION node ---")
        return "tool_execution"
    print("\n--- Routing: LLM provided final answer. -> END ---")
    return "end_response"


def route_after_human_review(state: HumanInLoopState) -> str:
    if state.get("human_approved"):
        print("\n--- Routing: Human Approved. -> TOOL_EXECUTION node ---")
        return "execute_approved_action"
    print("\n--- Routing: Human Rejected. -> REVISION node ---")
    return "revision_required"


def run_interaction(app, label: str, title: str, user_query: str, simulated_approval: bool | None = None):
    """Streams one interaction and prints the final message from the same run."""
    print(f"\n\n=== {label}: {title} ===")
    print(f"User: {user_query}")
    inputs = {"chat_history": [HumanMessage(content=user_query)]}
    if simulated_approval is not None:
        # Simulated reviewer decision, pre-loaded into the graph state. Editing the
        # chunks yielded by stream() would not reach the graph. A real application
        # would pause at the review step and write the decision into the state.
        print(f"\n(Simulating human decision: {'approve' if simulated_approval else 'reject'})")
        inputs["human_approved"] = simulated_approval
    final_message = None
    for step in app.stream(inputs):
        print(step)
        for update in step.values():
            if update and update.get("chat_history"):
                final_message = update["chat_history"][-1]
    print(f"\nFinal Result of {label}:")
    print(final_message.content)


def run_langgraph_human_in_loop():
    print("--- Running LangGraph Human-in-the-Loop Agent ---")
    workflow = StateGraph(HumanInLoopState)
    workflow.add_node("llm_decision", call_llm_for_decision)
    workflow.add_node("human_review", propose_action_for_human)
    workflow.add_node("tool_execution", execute_tool)  # Handles both direct and approved tool execution
    workflow.add_node("revision_node", record_rejection)  # Records the rejection for the LLM
    workflow.set_entry_point("llm_decision")

    # After LLM decision, route to tool execution, human review, or end
    workflow.add_conditional_edges(
        "llm_decision",
        route_decision_or_human_review,
        {
            "tool_execution": "tool_execution",  # For non-critical tools (e.g., web search)
            "human_review": "human_review",      # For critical tools (update_customer_record)
            "end_response": END,
        },
    )
    # After human review, route based on approval
    workflow.add_conditional_edges(
        "human_review",
        route_after_human_review,
        {
            "execute_approved_action": "tool_execution",  # If approved, execute the tool
            "revision_required": "revision_node",         # If rejected, record it and go back to the LLM
        },
    )
    # After tool execution, go back to LLM to process tool result
    workflow.add_edge("tool_execution", "llm_decision")
    # After a rejection is recorded, the LLM sees it and responds
    workflow.add_edge("revision_node", "llm_decision")
    app = workflow.compile()

    run_interaction(app, "Interaction 1", "Update Customer (Approved)",
                    "Change customer CUST001's status to 'Premium'.", simulated_approval=True)
    run_interaction(app, "Interaction 2", "Update Customer (Rejected)",
                    "Change customer CUST002's status to 'Deactivated'.", simulated_approval=False)
    run_interaction(app, "Interaction 3", "Non-critical Search",
                    "What is the capital of France?")


if __name__ == "__main__":
    run_langgraph_human_in_loop()
To Run:
- Save as langgraph_human_in_loop.py.
- Execute: python langgraph_human_in_loop.py
Note: For the human_review step, the script simulates input. In a real application, you’d integrate this with a UI or an external system that gathers actual human approval.
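Expected Output (illustrative; highlights the human review step and conditional branching. The exact message and chunk representation depends on the LangChain and LangGraph versions):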
--- Running LangGraph Human-in-the-Loop Agent ---
=== Interaction 1: Update Customer (Approved) ===
User: Change customer CUST001's status to 'Premium'.
{'llm_decision': AIMessage(content='', additional_kwargs={'function_call': {'name': 'update_customer_record', 'arguments': '{"customer_id":"CUST001","new_status":"Premium"}'}})}
--- Routing: Critical tool detected. -> HUMAN_REVIEW node ---
{'human_review': {'proposed_action': 'Tool: update_customer_record\nArgs: {\'customer_id\': \'CUST001\', \'new_status\': \'Premium\'}'}}
--- HUMAN INTERVENTION REQUIRED ---
Agent proposes: Tool: update_customer_record
Args: {'customer_id': 'CUST001', 'new_status': 'Premium'}
Please approve or reject this action.
(Simulating human approval... typing 'y')
--- Routing: Human Approved. -> TOOL_EXECUTION node ---
[CRITICAL TOOL EXECUTION]: Updating customer CUST001 to status: Premium
{'tool_execution': ToolMessage(content='Customer CUST001 status updated to \'Premium\'.', tool_call_id='call_w5x9y0z1')}
{'llm_decision': AIMessage(content='Customer CUST001 status has been successfully updated to Premium.')}
{'__end__': {'llm_decision': AIMessage(content='Customer CUST001 status has been successfully updated to Premium.')}}
Final Result of Interaction 1:
Customer CUST001 status has been successfully updated to Premium.
=== Interaction 2: Update Customer (Rejected) ===
User: Change customer CUST002's status to 'Deactivated'.
{'llm_decision': AIMessage(content='', additional_kwargs={'function_call': {'name': 'update_customer_record', 'arguments': '{"customer_id":"CUST002","new_status":"Deactivated"}'}})}
--- Routing: Critical tool detected. -> HUMAN_REVIEW node ---
{'human_review': {'proposed_action': 'Tool: update_customer_record\nArgs: {\'customer_id\': \'CUST002\', \'new_status\': \'Deactivated\'}'}}
--- HUMAN INTERVENTION REQUIRED ---
Agent proposes: Tool: update_customer_record
Args: {'customer_id': 'CUST002', 'new_status': 'Deactivated'}
Please approve or reject this action.
(Simulating human rejection... typing 'n')
--- Routing: Human Rejected. -> REVISION node ---
{'revision_node': AIMessage(content='', additional_kwargs={'tool_calls': []})} # LLM's response to the rejection in its context, likely "I cannot perform this action."
{'llm_decision': AIMessage(content='The proposed action to change customer CUST002\'s status to \'Deactivated\' was rejected. Please provide revised instructions.')}
{'__end__': {'llm_decision': AIMessage(content='The proposed action to change customer CUST002\'s status to \'Deactivated\' was rejected. Please provide revised instructions.')}}
Final Result of Interaction 2:
The proposed action to change customer CUST002's status to 'Deactivated' was rejected. Please provide revised instructions.
=== Interaction 3: Non-critical Search ===
User: What is the capital of France?
{'llm_decision': AIMessage(content='The capital of France is Paris.')}
--- Routing: LLM provided final answer. -> END ---
Final Result of Interaction 3:
The capital of France is Paris.
Key Takeaway:
- Custom State: HumanInLoopState tracks not only chat history but also the proposed_action and human_approved flag.
- Conditional Edges for Control: route_decision_or_human_review and route_after_human_review are crucial for dynamically changing the workflow based on the LLM’s intent and human feedback.
- Separate Nodes for Critical Steps: A dedicated human_review node clearly delineates where human input is required.
- Real-world Integration: In a production setting, the human_review node would typically involve sending a notification to a human, waiting for input via a UI or API, and then updating the graph state with the decision. The stream() method helps simulate this.
- Revision Loop: If rejected, the agent can loop back to the LLM (revision_node -> llm_decision) to process the rejection feedback and potentially plan an alternative.
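The routing decisions described in these takeaways can be reduced to two small pure functions. The following is a simplified sketch, not the section's actual implementation: ReviewState, CRITICAL_TOOLS, route_after_llm and route_after_review are hypothetical names, and the returned strings are assumed to match the node names seen in the logs above.

```python
from typing import Optional, TypedDict

# Assumption: the set of tools that must be signed off by a human.
CRITICAL_TOOLS = {"update_customer_record"}


class ReviewState(TypedDict):
    requested_tool: Optional[str]   # tool the LLM wants to call, None for a final answer
    human_approved: Optional[bool]  # None until a human has decided


def route_after_llm(state: ReviewState) -> str:
    """Pick the next node after the LLM has responded."""
    tool = state["requested_tool"]
    if tool is None:
        return "end"              # the LLM produced a final answer
    if tool in CRITICAL_TOOLS:
        return "human_review"     # critical action: pause for approval
    return "tool_execution"       # non-critical tools run directly


def route_after_review(state: ReviewState) -> str:
    """Pick the next node once the human has approved or rejected."""
    return "tool_execution" if state["human_approved"] else "revision_node"
```

Keeping the routers free of side effects makes them easy to unit-test independently of the graph and the LLM.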
4.5. Hands-on: Persisting Agent State for Long-Running Workflows
For complex, multi-step tasks or conversational agents that need to remember context across sessions, persisting the LangGraph state is crucial. LangGraph offers checkpointers for this.
Scenario: An agent that guides a user through a multi-step task (e.g., collecting information for a loan application). The user might leave and come back, and the agent should resume exactly where they left off.
langgraph_persistence.py
import os
from dotenv import load_dotenv
from typing import TypedDict, Annotated, List
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage, ToolMessage
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langgraph.graph import StateGraph, END
from langchain_core.prompts import ChatPromptTemplate
from langgraph.checkpoint.sqlite import SqliteSaver # For persisting state

# Load environment variables
load_dotenv()

class PersistentAgentState(TypedDict):
    # The reducer (x + y) appends each node's returned messages to the stored history.
    chat_history: Annotated[List[BaseMessage], lambda x, y: x + y]
    # We could add more state variables relevant to the multi-step task, e.g.,
    # loan_application_data: dict # To store collected info

# Initialize LLM (no tools needed for this simple memory demo)
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
# llm = ChatOllama(model="llama3", temperature=0) # For Ollama
llm_no_tools = llm # No tools bound for this specific example

def call_llm_for_conversation(state: PersistentAgentState):
    """Simple LLM node for conversational responses."""
    messages = state["chat_history"]
    response = llm_no_tools.invoke(messages)
    return {"chat_history": [response]}

def run_langgraph_persistence():
    print("--- Running LangGraph Persistent Agent ---")
    # 1. Initialize the SqliteSaver checkpointer
    # This will create 'langgraph_checkpoint.sqlite' file to store states.
    # For a real app, you'd configure a proper database.
    # Note: in recent releases (the separate langgraph-checkpoint-sqlite package),
    # from_conn_string is a context manager, so use:
    #     with SqliteSaver.from_conn_string("langgraph_checkpoint.sqlite") as memory:
    memory = SqliteSaver.from_conn_string("langgraph_checkpoint.sqlite")
    print("Checkpoints will be saved/loaded from: langgraph_checkpoint.sqlite")

    # 2. Build a simple conversational workflow
    workflow = StateGraph(PersistentAgentState)
    workflow.add_node("chatbot", call_llm_for_conversation)
    workflow.set_entry_point("chatbot")
    workflow.add_edge("chatbot", END) # Simple direct response and end

    # 3. Compile the graph with the checkpointer
    # `checkpointer` is the key for persistence.
    app = workflow.compile(checkpointer=memory)

    session_id_1 = "user_loan_session_123"
    session_id_2 = "user_loan_session_456"

    # --- Interaction 1: First Session ---
    print(f"\n=== Session: {session_id_1} - Starting New Conversation ===")
    user_query_1_1 = "Hi, I'm starting a loan application. What information do you need first?"
    print(f"User [{session_id_1}]: {user_query_1_1}")
    inputs_1_1 = {"chat_history": [HumanMessage(content=user_query_1_1)]}
    response_1_1 = app.invoke(inputs_1_1, config={"configurable": {"thread_id": session_id_1}})
    print(f"Agent [{session_id_1}]: {response_1_1['chat_history'][-1].content}")

    user_query_1_2 = "My annual income is $75,000."
    print(f"User [{session_id_1}]: {user_query_1_2}")
    inputs_1_2 = {"chat_history": [HumanMessage(content=user_query_1_2)]}
    response_1_2 = app.invoke(inputs_1_2, config={"configurable": {"thread_id": session_id_1}})
    print(f"Agent [{session_id_1}]: {response_1_2['chat_history'][-1].content}")

    print(f"\n--- User for {session_id_1} leaves and comes back later ---")

    # --- Interaction 1 (continued): Resume Session ---
    user_query_1_3 = "What was my income again?"
    print(f"User [{session_id_1}]: {user_query_1_3}")
    # LangGraph will automatically load the state for session_id_1
    inputs_1_3 = {"chat_history": [HumanMessage(content=user_query_1_3)]}
    response_1_3 = app.invoke(inputs_1_3, config={"configurable": {"thread_id": session_id_1}})
    print(f"Agent [{session_id_1}]: {response_1_3['chat_history'][-1].content}")

    # --- Interaction 2: A Completely Separate Session ---
    print(f"\n=== Session: {session_id_2} - Starting New Conversation ===")
    user_query_2_1 = "I need information on starting a new business. What are the first steps?"
    print(f"User [{session_id_2}]: {user_query_2_1}")
    inputs_2_1 = {"chat_history": [HumanMessage(content=user_query_2_1)]}
    response_2_1 = app.invoke(inputs_2_1, config={"configurable": {"thread_id": session_id_2}})
    print(f"Agent [{session_id_2}]: {response_2_1['chat_history'][-1].content}")

    print("\n--- You can now check the 'langgraph_checkpoint.sqlite' file for saved states. ---")
    print("You can rerun this script, and session_id_1 will still remember previous turns.")

if __name__ == "__main__":
    # Clean up old checkpoint file for a fresh start each time (optional).
    # Comment this block out if you want state to survive across runs.
    if os.path.exists("langgraph_checkpoint.sqlite"):
        os.remove("langgraph_checkpoint.sqlite")
        print("Removed existing langgraph_checkpoint.sqlite for a clean run.")
    run_langgraph_persistence()
To Run:
- Save as langgraph_persistence.py.
- Execute: python langgraph_persistence.py
- Observe the langgraph_checkpoint.sqlite file created. You can use a SQLite browser to inspect its content (though it’s not directly human-readable, you’ll see thread IDs and states).
- Run the script again. The cleanup block under if __name__ == "__main__" deletes the checkpoint file at startup, so comment it out first; session_id_1 will then remember the income from the previous run.
Expected Output (highlighting memory across invocations):
Removed existing langgraph_checkpoint.sqlite for a clean run.
--- Running LangGraph Persistent Agent ---
Checkpoints will be saved/loaded from: langgraph_checkpoint.sqlite
=== Session: user_loan_session_123 - Starting New Conversation ===
User [user_loan_session_123]: Hi, I'm starting a loan application. What information do you need first?
Agent [user_loan_session_123]: To start your loan application, I'll need some basic information. Could you please tell me your full name and your current annual income?
User [user_loan_session_123]: My annual income is $75,000.
Agent [user_loan_session_123]: Thank you for providing your annual income of $75,000. Next, could you please tell me your full name?
--- User for user_loan_session_123 leaves and comes back later ---
User [user_loan_session_123]: What was my income again?
Agent [user_loan_session_123]: You previously mentioned your annual income is $75,000.
=== Session: user_loan_session_456 - Starting New Conversation ===
User [user_loan_session_456]: I need information on starting a new business. What are the first steps?
Agent [user_loan_session_456]: Starting a new business involves several key steps. First, it's essential to define your business idea and create a detailed business plan. This includes market research, identifying your target audience, and outlining your financial projections. Would you like me to elaborate on any of these aspects?
--- You can now check the 'langgraph_checkpoint.sqlite' file for saved states. ---
You can rerun this script, and session_id_1 will still remember previous turns.
Key Takeaway:
- SqliteSaver: A simple way to persist graph state. You pass it to workflow.compile(checkpointer=memory).
- config={"configurable": {"thread_id": session_id}}: This config parameter is crucial. It tells LangGraph which specific conversation or workflow instance to load/save the state for. Each unique thread_id gets its own persistent state.
- Long-Running Sessions: This allows you to build agents for multi-day tasks, customer support interactions, or complex data collection processes where users might not complete the task in a single sitting.
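To make the role of thread_id concrete, here is a toy store that keeps one state row per thread in SQLite using only the standard library. It is a conceptual sketch of "load state by key, merge the new input, save it back" and does not reflect how LangGraph's checkpointers are implemented or what schema they use; TinyCheckpointStore and run_turn are hypothetical names.

```python
import json
import sqlite3


class TinyCheckpointStore:
    """Toy per-thread state store; illustrates the idea, not LangGraph's internals."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            "thread_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )

    def load(self, thread_id: str) -> dict:
        """Return the saved state for this thread, or an empty history."""
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {"chat_history": []}

    def save(self, thread_id: str, state: dict) -> None:
        """Overwrite the saved state for this thread."""
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()


def run_turn(store: TinyCheckpointStore, thread_id: str, user_text: str) -> dict:
    """Load the thread's state, append the new message, and persist the result."""
    state = store.load(thread_id)
    state["chat_history"] = state["chat_history"] + [user_text]  # same x + y merge as the reducer
    store.save(thread_id, state)
    return state
```

Calling run_turn twice with the same thread_id accumulates history, while a different thread_id starts from an empty history, which mirrors the behaviour of the two sessions in the script above.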
5. Collaborative Multi-Agent Systems with CrewAI
For highly complex problems that require diverse expertise, multiple specialized agents can collaborate. CrewAI is a framework designed to orchestrate such multi-agent teams, enabling them to work together to achieve a common goal.
5.1. Why Multi-Agent Systems?
Just as complex projects in the real world require teams of experts (e.g., a marketing team has researchers, copywriters, designers), complex AI tasks benefit from agents specializing in different areas.
- Specialization: Each agent can be fine-tuned for a specific role (e.g., ‘Researcher’, ‘Analyst’, ‘Writer’).
- Modularity: Break down a large problem into smaller, manageable tasks for individual agents.
- Enhanced Reasoning: Collective intelligence can often surpass individual capabilities by combining perspectives and skills.
- Robustness: If one agent struggles, another might offer a different approach or correct its output.
5.2. Core Concepts: Agents, Tasks, and Crews
CrewAI simplifies building multi-agent systems with three primary abstractions:
- Agent: Represents an individual intelligent worker.
  - role: The professional role of the agent (e.g., “Senior Researcher”).
  - goal: What this agent strives to achieve in its role (e.g., “Find the latest trends”).
  - backstory: Provides context and personality, helping the LLM adopt the persona (e.g., “An experienced analyst…”).
  - tools: A list of LangChain tools this agent has access to.
  - llm: The specific LLM instance this agent uses (can be different for each agent).
  - verbose: Show agent’s internal thoughts.
  - allow_delegation: Can this agent delegate its sub-tasks to other agents?
- Task: A specific unit of work within the workflow.
  - description: What needs to be done.
  - expected_output: A clear description of the desired output format and content.
  - agent: The agent responsible for this task.
  - context: Input from previous tasks.
- Crew: The orchestrator that manages the agents and tasks.
  - agents: A list of all agents participating in the crew.
  - tasks: A list of tasks to be completed.
  - process: How agents collaborate: Process.sequential or Process.hierarchical.
  - manager_llm: (For hierarchical process) The LLM used by the “manager” to delegate and coordinate.
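How the three abstractions fit together can be illustrated with plain dataclasses. This is a rough sketch of the sequential idea only and is not CrewAI's API: MiniAgent, MiniTask and MiniCrew are hypothetical names, and the work callable stands in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MiniAgent:
    role: str
    # Stand-in for an LLM call: (task description, context text) -> output text.
    work: Callable[[str, str], str]


@dataclass
class MiniTask:
    description: str
    agent: MiniAgent
    context: List["MiniTask"] = field(default_factory=list)


@dataclass
class MiniCrew:
    tasks: List[MiniTask]

    def kickoff(self) -> str:
        """Run tasks in list order, feeding each one the outputs of its context tasks."""
        outputs: Dict[int, str] = {}  # keyed by id(task), since dataclasses are unhashable here
        result = ""
        for task in self.tasks:
            # A context task must appear earlier in the list, otherwise this raises KeyError.
            context_text = "\n".join(outputs[id(dep)] for dep in task.context)
            result = task.agent.work(task.description, context_text)
            outputs[id(task)] = result
        return result


researcher = MiniAgent("Researcher", lambda desc, ctx: f"notes on {desc}")
writer = MiniAgent("Writer", lambda desc, ctx: f"post based on [{ctx}]")
research = MiniTask("agent frameworks", researcher)
write = MiniTask("write a blog post", writer, context=[research])
print(MiniCrew([research, write]).kickoff())  # post based on [notes on agent frameworks]
```

The final task's output is what kickoff returns, which matches the way the hands-on examples below print a single result.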
5.3. Hands-on: Building a Research and Writing Crew (Sequential Process)
Let’s create a team of two agents: a Researcher and a Writer. The Researcher will find information, and the Writer will use that information to create a blog post. This demonstrates a Process.sequential workflow.
Scenario: Generate a blog post about a specific AI topic by having a research agent gather information and a writing agent compose the post.
crewai_sequential_blog_post.py
import os
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun

# Load environment variables
load_dotenv()

def run_sequential_blog_post_crew():
    print("--- Running Sequential Blog Post Crew ---")
    # 1. Initialize the LLM (for all agents in this example)
    llm = ChatOpenAI(model="gpt-4o", temperature=0.7) # GPT-4o often better for complex reasoning/writing
    # llm = ChatOllama(model="llama3", temperature=0.7) # For Ollama (ensure it's capable for writing)

    # 2. Define Tools
    search_tool = DuckDuckGoSearchRun()
    available_tools = [search_tool]

    # 3. Create Agents
    # Each agent has a distinct role, goal, backstory, and assigned tools/LLM.
    researcher = Agent(
        role='Senior Research Analyst',
        goal='Uncover the latest groundbreaking trends and advancements in AI',
        backstory='''You are a highly skilled and diligent research analyst with a keen eye for
        identifying emerging patterns and disruptive technologies in the AI landscape.
        Your reports are thorough, insightful, and always up-to-date.''',
        verbose=True,
        allow_delegation=False, # This agent focuses solely on research, doesn't delegate.
        tools=available_tools, # Only the researcher needs the search tool.
        llm=llm
    )
    writer = Agent(
        role='Professional Content Writer',
        goal='Craft compelling and informative blog posts about AI trends',
        backstory='''You are a prolific content writer with a talent for transforming complex
        technical information into engaging and accessible blog posts. Your writing is
        clear, concise, and captivating.''',
        verbose=True,
        allow_delegation=False, # This agent focuses solely on writing.
        llm=llm
    )

    # 4. Define Tasks
    # Tasks are assigned to specific agents and define what needs to be done.
    # The output of the research_task will implicitly be passed to the write_blog_post_task.
    research_task = Task(
        description='''Conduct a comprehensive analysis of the most recent advancements
        in Agentic AI frameworks, focusing on LangChain, LangGraph, and CrewAI.
        Identify key features, use cases, and notable examples.
        Your final answer MUST be a detailed, well-structured research report, at least 500 words,
        summarizing your findings, including sources (if found).''',
        expected_output='A detailed research report, markdown formatted, covering Agentic AI frameworks with key features, use cases, examples, and sources.',
        agent=researcher
    )
    write_blog_post_task = Task(
        description='''Write an engaging and informative blog post (minimum 800 words)
        based on the research report provided by the Senior Research Analyst.
        The blog post should be structured with an introduction, several body paragraphs
        (each focusing on a specific framework or aspect), and a conclusion.
        The tone should be informative yet accessible to a general tech audience.
        Ensure to cite sources from the research report where appropriate.
        Your final answer MUST be the complete blog post in markdown format.''',
        expected_output='A complete blog post, at least 800 words, in markdown format, based on the provided research.',
        agent=writer,
        context=[research_task] # Explicitly pass the output of research_task as context
    )

    # 5. Instantiate the Crew
    # `Process.sequential` means tasks are executed in the order they are defined.
    project_crew = Crew(
        agents=[researcher, writer],
        tasks=[research_task, write_blog_post_task],
        process=Process.sequential,
        verbose=True, # Show all agents' internal reasoning
        llm=llm # Default LLM for tasks if not explicitly set on agent, or for overall crew management
    )

    # 6. Kickoff the Crew's work
    print("\n\n### CREW KICKOFF: Starting Research and Blog Post Generation ###")
    result = project_crew.kickoff()
    print("\n### CREW FINISHED ###")
    print("\n## FINAL BLOG POST ##")
    print(result)

if __name__ == "__main__":
    run_sequential_blog_post_crew()
To Run:
- Save as crewai_sequential_blog_post.py.
- Ensure crewai, langchain-openai, duckduckgo-search are installed.
- Make sure your OpenAI API key is in .env (or use Ollama).
- Execute: python crewai_sequential_blog_post.py
Expected Output (will be lengthy due to verbose logging):
You will see detailed logs from both the researcher and writer agents, including their thoughts, tool calls (for the researcher), and then the writer processing the researcher’s output to generate the final blog post. The final output will be the markdown-formatted blog post.
### CREW KICKOFF: Starting Research and Blog Post Generation ###
[...]
# Researcher's detailed verbose output (Thoughts, Actions, Observations from web search)
[...]
# Writer's detailed verbose output (Thoughts, processing research, generating blog post)
[...]
### CREW FINISHED ###
## FINAL BLOG POST ##
# Title of the Blog Post
## Introduction
[...]
## Understanding LangChain
[...]
## The Power of LangGraph
[...]
## Collaborative AI with CrewAI
[...]
## Conclusion
[...]
Key Takeaway:
- Agent Definition: Clear role, goal, backstory guide the LLM’s persona.
- Task Definition: Specific instructions and expected_output ensure agents produce usable results. context=[research_task] explicitly tells write_blog_post_task to use the output of research_task.
- Crew Orchestration (Process.sequential): Tasks are run in order, with the output of one feeding into the next, enabling a structured workflow.
- verbose=True: Essential for debugging and understanding multi-agent interactions.
5.4. Hands-on: Advanced Multi-Agent Workflow (Hierarchical Process for Marketing Plan)
For more dynamic and adaptive workflows, CrewAI’s Process.hierarchical mode allows a “manager” agent to delegate tasks, review results, and guide the overall project.
Scenario: Develop a marketing plan for a new product, involving a Marketing Strategist (manager), a Copywriter, and a Social Media Expert. The manager will delegate and review.
crewai_hierarchical_marketing_plan.py
import os
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun

# Load environment variables
load_dotenv()

def run_hierarchical_marketing_plan_crew():
    print("--- Running Hierarchical Marketing Plan Crew ---")
    # 1. Initialize LLMs
    # Manager needs a powerful LLM for delegation and review.
    # Other agents can use slightly less powerful ones if desired, or the same.
    manager_llm = ChatOpenAI(model="gpt-4o", temperature=0.7)
    agent_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.5)
    # manager_llm = ChatOllama(model="llama3", temperature=0.7) # For Ollama
    # agent_llm = ChatOllama(model="llama3", temperature=0.5) # For Ollama

    # 2. Define Tools (all agents have access to web search for general info)
    search_tool = DuckDuckGoSearchRun()
    shared_tools = [search_tool]

    # 3. Create Agents
    # The 'manager' agent needs `allow_delegation=True`.
    marketing_strategist = Agent(
        role='Senior Marketing Strategist',
        goal='Create a comprehensive marketing plan for a new product launch',
        backstory='''You are an experienced marketing strategist responsible for overseeing campaigns,
        delegating tasks, and ensuring all marketing efforts align with the product's goals.
        You are an expert in market analysis and campaign design.''',
        verbose=True,
        allow_delegation=True, # IMPORTANT: This agent can delegate to others.
        llm=manager_llm # Use the more powerful LLM for the manager
    )
    copywriter = Agent(
        role='Creative Copywriter',
        goal='Develop compelling and persuasive marketing copy for various channels',
        backstory='''You are a talented copywriter known for crafting engaging and effective
        messages that resonate with target audiences and drive action.''',
        verbose=True,
        allow_delegation=False,
        tools=shared_tools,
        llm=agent_llm
    )
    social_media_expert = Agent(
        role='Social Media Expert',
        goal='Design an engaging social media strategy and content plan',
        backstory='''You are a specialist in social media, adept at identifying trends,
        crafting viral content, and optimizing campaigns for maximum reach and engagement.''',
        verbose=True,
        allow_delegation=False,
        tools=shared_tools,
        llm=agent_llm
    )

    # 4. Define Tasks (initial high-level tasks, manager will break them down)
    product_name = "EcoSmart Home Hub"
    product_description = "A smart home device that optimizes energy usage and promotes sustainable living, managed via an intuitive app."
    market_research_task = Task(
        description=f'''Conduct market research for a new product called "{product_name}".
        Identify the target audience, analyze competitors in the sustainable smart home market,
        and find unique selling propositions (USPs).''',
        expected_output='A detailed market research report including target audience, competitor analysis, and 3-5 key USPs for the product.',
        agent=marketing_strategist, # Manager can start, but will likely delegate parts.
        async_execution=True # For hierarchical, tasks can run asynchronously
    )
    copy_creation_task = Task(
        description=f'''Develop core marketing copy for "{product_name}". This includes a compelling tagline,
        short descriptive text (2-3 sentences), and a longer product description (1-2 paragraphs).
        Focus on sustainability, ease of use, and innovation.''',
        expected_output='A set of marketing copy: tagline, short description, long description, all emphasizing product benefits.',
        agent=copywriter, # Assigned to copywriter
        async_execution=True
    )
    social_media_task = Task(
        description=f'''Create a social media content strategy for the launch of "{product_name}".
        Suggest 3-5 post ideas for Instagram and X (formerly Twitter), including hashtags and calls to action.
        Focus on driving engagement and explaining the eco-benefits.''',
        expected_output='A social media strategy document with 3-5 post ideas for Instagram and X, including content, hashtags, and CTAs.',
        agent=social_media_expert, # Assigned to social media expert
        async_execution=True
    )
    final_plan_task = Task(
        description=f'''Compile all gathered research, copy, and social media strategy into a cohesive
        "EcoSmart Home Hub" Marketing Launch Plan. Provide an executive summary and outline next steps.
        Your final answer MUST be the complete marketing launch plan in markdown format.''',
        expected_output='A full marketing launch plan in markdown format, summarizing all findings and strategies.',
        agent=marketing_strategist, # Manager compiles the final plan
        context=[market_research_task, copy_creation_task, social_media_task] # Manager will use outputs of other tasks
    )

    # 5. Instantiate the Crew with Hierarchical Process
    # `Process.hierarchical` enables delegation and a "manager_llm" to oversee.
    marketing_crew = Crew(
        agents=[marketing_strategist, copywriter, social_media_expert],
        tasks=[market_research_task, copy_creation_task, social_media_task, final_plan_task],
        process=Process.hierarchical, # IMPORTANT: Hierarchical process
        verbose=True,
        manager_llm=manager_llm # The LLM used by the manager agent for delegation/review
    )

    # 6. Kickoff the Crew
    print("\n\n### CREW KICKOFF: Starting Hierarchical Marketing Plan Generation ###")
    result = marketing_crew.kickoff()
    print("\n### CREW FINISHED ###")
    print("\n## FINAL MARKETING PLAN ##")
    print(result)

# Fixed: the guard must compare against "__main__" (two trailing underscores), or the crew never runs.
if __name__ == "__main__":
    run_hierarchical_marketing_plan_crew()
To Run:
- Save as crewai_hierarchical_marketing_plan.py.
- Execute: python crewai_hierarchical_marketing_plan.py
Expected Output (even more verbose, showing manager delegation and agent execution):
You’ll first see the Marketing Strategist (the manager) analyzing the overall goal and delegating sub-tasks to the Copywriter and Social Media Expert. Each sub-agent will then execute its task (and log its internal thoughts). Finally, the Marketing Strategist will synthesize all their outputs into the comprehensive marketing plan.
Key Takeaway:
- Process.hierarchical: This is the core difference. A manager agent (using manager_llm) dynamically orchestrates the workflow.
- allow_delegation=True: Essential for the manager agent.
- Specialized Agents & Tasks: Each agent is focused, and tasks are more granular.
- Context for Manager: The final_plan_task explicitly uses the outputs of the other tasks as context, allowing the manager to synthesize the final plan.
- async_execution=True: In a hierarchical crew, tasks can be executed asynchronously, potentially speeding up overall completion.
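The benefit of asynchronous tasks feeding one final task can be illustrated with plain asyncio. This is a sketch of the general fan-out/fan-in pattern and makes no claim about how CrewAI schedules work internally; run_task and build_plan are hypothetical names, and the sleeps stand in for LLM latency.

```python
import asyncio
from typing import List


async def run_task(name: str, delay: float) -> str:
    """Stand-in for an agent working on a task; the sleep simulates LLM latency."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def build_plan() -> str:
    # The three independent tasks start together instead of one after another,
    # so total time is roughly the slowest task rather than the sum of all three.
    parts: List[str] = await asyncio.gather(
        run_task("market research", 0.03),
        run_task("copy", 0.02),
        run_task("social media", 0.01),
    )
    # The final step waits for every input, much like context=[...] on final_plan_task.
    # gather returns results in argument order, regardless of which task finished first.
    return "PLAN: " + "; ".join(parts)


if __name__ == "__main__":
    print(asyncio.run(build_plan()))
```

The final step cannot begin until all three inputs exist, which is why only the independent upstream tasks are candidates for asynchronous execution.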
6. Practical Applications: Real-World Smart Agents
Now, let’s bring everything together with more comprehensive examples that demonstrate real-world use cases, including local LLMs, UI automation, and API integration.
6.1. Hands-on: Full Agent Deployment with Local Ollama Models
Deploying agents with local LLMs (via Ollama) is a practical way to achieve privacy, cost savings, and offline capability.
Scenario: Recreate our conversational web search agent, but entirely powered by a local Ollama model.
Pre-requisites:
- Install Ollama: https://ollama.com/
- Pull a capable model, e.g., llama3: ollama pull llama3
- Run Ollama server in a separate terminal: ollama run llama3 (keep this terminal open while the script runs)
- Install Python packages: pip install langchain-community duckduckgo-search
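Before starting the agent, it can help to confirm that the local server is reachable. The sketch below assumes Ollama's default address of http://localhost:11434 and probes its /api/tags endpoint (which lists locally pulled models); ollama_is_running is a hypothetical helper name, and the address will differ if your installation is configured otherwise.

```python
import http.client
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server at base_url answers the /api/tags request with 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and timeouts are OSError subclasses; ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    if not ollama_is_running():
        print("Ollama does not appear to be running; start it before launching the agent.")
```

A check like this gives a clear message up front instead of a connection error in the middle of the first agent turn.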
ollama_agent_deployment.py
import os
from dotenv import load_dotenv
from langchain_community.chat_models import ChatOllama # Use ChatOllama for local LLM
from langchain_community.tools import DuckDuckGoSearchRun
from langchain.agents import AgentExecutor, create_react_agent
from langchain import hub
from langchain.memory import ConversationBufferMemory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import AIMessage, HumanMessage

# Load environment variables (might not be strictly needed for Ollama, but good practice)
load_dotenv()

def run_ollama_agent_deployment():
    print("--- Running Ollama-Powered Conversational Agent ---")
    print("!!! ENSURE 'ollama run llama3' IS RUNNING IN A SEPARATE TERMINAL !!!")
    # 1. Initialize the LOCAL LLM using ChatOllama
    # Connects to your local Ollama server running the 'llama3' model.
    # Adjust `model` if you pulled a different one (e.g., "mistral").
    ollama_llm = ChatOllama(model="llama3", temperature=0)

    # 2. Define tools (web search for current info)
    search_tool = DuckDuckGoSearchRun(name="duckduckgo_search")
    tools = [search_tool]

    # 3. Get a conversational ReAct agent prompt from LangChain Hub
    # The text-format "react-chat" prompt pairs with create_react_agent;
    # "react-chat-json" is intended for create_json_chat_agent.
    prompt = hub.pull("hwchase17/react-chat")

    # 4. Create the agent with the Ollama LLM
    agent = create_react_agent(ollama_llm, tools, prompt)

    # 5. Create the Agent Executor
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True
    )

    # 6. Initialize ConversationBufferMemory
    memory = ConversationBufferMemory(
        memory_key="chat_history",
        return_messages=True
    )

    # 7. Wrap the agent executor with RunnableWithMessageHistory for session memory
    conversational_agent = RunnableWithMessageHistory(
        agent_executor,
        # The factory must return a chat message history object, so pass the
        # memory's underlying chat_memory rather than the memory wrapper itself.
        lambda session_id: memory.chat_memory,
        input_messages_key="input",
        history_messages_key="chat_history"
    )

    # 8. Engage in a conversation
    session_id = "ollama_user_session_001"
    print("\n--- Ollama Agent Turn 1 ---")
    user_input_1 = "Hi. My name is Sam. What is the tallest mountain in North America?"
    print(f"User [{session_id}]: {user_input_1}")
    response_1 = conversational_agent.invoke(
        {"input": user_input_1},
        config={"configurable": {"session_id": session_id}}
    )
    print(f"\nAgent [{session_id}]: {response_1['output']}")

    print("\n--- Ollama Agent Turn 2 ---")
    user_input_2 = "And what was my name again? Also, what's the population of your closest major city?"
    print(f"User [{session_id}]: {user_input_2}")
    response_2 = conversational_agent.invoke(
        {"input": user_input_2},
        config={"configurable": {"session_id": session_id}}
    )
    print(f"\nAgent [{session_id}]: {response_2['output']}")

    print("\n--- Ollama Agent Turn 3 ---")
    user_input_3 = "Thanks! What is Denali's elevation?"
    print(f"User [{session_id}]: {user_input_3}")
    response_3 = conversational_agent.invoke(
        {"input": user_input_3},
        config={"configurable": {"session_id": session_id}}
    )
    print(f"\nAgent [{session_id}]: {response_3['output']}")

if __name__ == "__main__":
    run_ollama_agent_deployment()
To Run:
- Save as
ollama_agent_deployment.py. - Crucially: Open a separate terminal and run
ollama run llama3(or your chosen model). Keep this terminal open. - Execute the Python script:
python ollama_agent_deployment.py
Expected Output (similar to previous conversational agent, but powered by local LLM):
--- Running Ollama-Powered Conversational Agent ---
!!! ENSURE 'ollama run llama3' IS RUNNING IN A SEPARATE TERMINAL !!!
--- Ollama Agent Turn 1 ---
User [ollama_user_session_001]: Hi. My name is Sam. What is the tallest mountain in North America?
> Entering new AgentExecutor chain...
Thought: I need to find the tallest mountain in North America. This is a factual question that requires up-to-date information, so I should use a search tool.
Action: duckduckgo_search
Action Input: tallest mountain in North America
Observation: The tallest mountain in North America is Denali (formerly Mount McKinley), located in Alaska, United States. Its peak is 20,310 feet (6,190 meters) above sea level.
Thought: I have found the tallest mountain in North America and its location and elevation. I now know the final answer.
Final Answer: The tallest mountain in North America is Denali (formerly Mount McKinley), located in Alaska, United States.
Agent [ollama_user_session_001]: The tallest mountain in North America is Denali (formerly Mount McKinley), located in Alaska, United States.
--- Ollama Agent Turn 2 ---
User [ollama_user_session_001]: And what was my name again? Also, what's the population of your closest major city?
> Entering new AgentExecutor chain...
Thought: The user is asking two questions. First, they are asking for their name, which I should recall from memory. Second, they are asking for the population of my closest major city. As an AI, I don't have a physical location or a "closest major city". I need to respond to both parts of the question. I will address the name from memory and explain that I don't have a physical location for the second part.
Final Answer: Your name is Sam. As an AI, I do not have a physical location or a "closest major city" to provide its population.
Agent [ollama_user_session_001]: Your name is Sam. As an AI, I do not have a physical location or a "closest major city" to provide its population.
--- Ollama Agent Turn 3 ---
User [ollama_user_session_001]: Thanks! What is Denali's elevation?
> Entering new AgentExecutor chain...
Thought: The user is asking for Denali's elevation. I remember from our previous conversation that Denali's elevation was provided when I searched for the tallest mountain in North America. I should retrieve that from our chat history.
Final Answer: Denali's elevation is 20,310 feet (6,190 meters) above sea level.
Agent [ollama_user_session_001]: Denali's elevation is 20,310 feet (6,190 meters) above sea level.
Key Takeaway: The seamless swap from a hosted LLM to ChatOllama demonstrates the power of LangChain’s modular design. The core agent logic remains identical, allowing you to choose your LLM provider based on your specific requirements without re-architecting your agent. The performance will depend on your local hardware.
6.2. Hands-on: UI Automation with Agents (Web Scraper Example)
Agents can interact with web pages to automate tasks like data extraction or form filling. We’ll use Playwright for robust browser automation.
Scenario: An agent needs to visit a simple, local HTML page and extract specific information (e.g., all paragraph texts).
Pre-requisites:
- Install Playwright: pip install playwright
- Install Playwright browsers: playwright install
local_test_page.html (Create this file in the same directory as your Python script)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>My Simple Test Page</title>
</head>
<body>
<h1>Welcome to the Agent Test Ground</h1>
<p id="first-paragraph">This is the first paragraph of text. It contains some important information for the agent to find.</p>
<p class="data-item">Here is some data point 1.</p>
<p class="data-item">And here is another data point 2.</p>
<div>
<span class="info-label">Author:</span> <span class="info-value">AI Expert</span>
</div>
<a href="https://example.com/next-page">Go to Next Page</a>
<p>This is the final paragraph on the page.</p>
</body>
</html>
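Two small helpers are useful alongside this test page: one builds the file:// URL the agent should be given, and one extracts paragraph text with the standard library so you have a reference result to compare the agent's answer against. This is a sketch under the assumption that the page sits next to your script; local_file_url, ParagraphCollector and paragraph_texts are hypothetical names, and the parser handles only simple, non-nested paragraphs like the ones above.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import List


def local_file_url(path: str) -> str:
    """Turn a relative or absolute file path into a file:// URL."""
    return Path(path).resolve().as_uri()


class ParagraphCollector(HTMLParser):
    """Collects the text content of every <p> element."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: List[str] = []
        self._inside_p = False
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._inside_p = True
            self._chunks = []

    def handle_data(self, data):
        if self._inside_p:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._inside_p:
            self.paragraphs.append("".join(self._chunks).strip())
            self._inside_p = False


def paragraph_texts(html: str) -> List[str]:
    """Return the text of each <p> element in document order."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

With the page saved, paragraph_texts(Path("local_test_page.html").read_text(encoding="utf-8")) should return its four paragraph strings, and local_file_url("local_test_page.html") gives the URL to pass to the navigation tool.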
ui_automation_agent.py
import os
from typing import List
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain.tools import tool
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.messages import SystemMessage  # Message classes live in langchain_core.messages
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from pydantic import BaseModel, Field
from playwright.sync_api import sync_playwright  # For browser automation
from urllib.parse import urlparse, urlunparse  # For handling local file URLs
# Load environment variables
load_dotenv()
# Global variable to hold Playwright browser and page instance
# In a more robust system, this would be managed by LangGraph state or a singleton pattern.
_browser_context = None
_current_page = None
# --- Custom Tools for Browser Interaction ---
class NavigateToURLInput(BaseModel):
    url: str = Field(description="The full URL to navigate to, e.g., 'https://example.com' or 'file:///path/to/local_test_page.html'.")


@tool("navigate_to_url", args_schema=NavigateToURLInput)
def navigate_to_url_tool(url: str) -> str:
    """
    Navigates the browser to the specified URL.
    Returns the page title on success.
    """
    global _browser_context, _current_page
    print(f"\n--- Tool Call: navigate_to_url({url}) ---")
    try:
        if _current_page:
            _current_page.goto(url)
        else:  # First time setup
            pw = sync_playwright().start()
            _browser_context = pw.chromium.launch()  # You can choose 'firefox' or 'webkit' too
            _current_page = _browser_context.new_page()
            _current_page.goto(url)
        return f"Successfully navigated to '{url}'. Page Title: '{_current_page.title()}'"
    except Exception as e:
        return f"Error navigating to '{url}': {e}"
class GetElementTextInput(BaseModel):
    selector: str = Field(description="A CSS selector to identify the element (e.g., 'p', '#id', '.class').")


@tool("get_element_text", args_schema=GetElementTextInput)
def get_element_text_tool(selector: str) -> str:
    """
    Retrieves the text content of the first element matching the CSS selector.
    Returns "No element found" if no match.
    """
    global _current_page
    print(f"\n--- Tool Call: get_element_text('{selector}') ---")
    if not _current_page:
        return "Error: Browser not open. Navigate to a URL first."
    try:
        element = _current_page.query_selector(selector)
        if element:
            return element.inner_text()
        else:
            return "No element found with that selector."
    except Exception as e:
        return f"Error getting text for selector '{selector}': {e}"
class GetElementTextsInput(BaseModel):
    selector: str = Field(description="A CSS selector to identify multiple elements (e.g., 'p', '.data-item').")


@tool("get_element_texts", args_schema=GetElementTextsInput)
def get_element_texts_tool(selector: str) -> list[str]:
    """
    Retrieves the text content of ALL elements matching the CSS selector.
    Returns a list of strings, or an empty list if no matches.
    Errors are reported as a single-item list so the return type stays consistent.
    """
    global _current_page
    print(f"\n--- Tool Call: get_element_texts('{selector}') ---")
    if not _current_page:
        return ["Error: Browser not open. Navigate to a URL first."]
    try:
        elements = _current_page.query_selector_all(selector)
        # An empty list signals "no matches", as the docstring promises
        return [element.inner_text() for element in elements]
    except Exception as e:
        return [f"Error getting texts for selector '{selector}': {e}"]
def run_ui_automation_agent():
    # Declared global because the cleanup in `finally` rebinds these names;
    # without this, the assignments there would make them local and raise UnboundLocalError.
    global _browser_context, _current_page
    # create_react_agent requires a text prompt containing {tools} and {tool_names}.
    # The chat prompt below uses a MessagesPlaceholder scratchpad instead,
    # which is the shape create_tool_calling_agent expects.
    from langchain.agents import create_tool_calling_agent

    print("--- Running UI Automation Agent (Web Scraper) ---")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # llm = ChatOllama(model="llama3", temperature=0)  # For Ollama (the model must support tool calling)
    tools = [navigate_to_url_tool, get_element_text_tool, get_element_texts_tool]

    # Custom prompt to guide the agent on UI automation.
    # Each fragment ends with a space so the concatenated sentences do not run together.
    custom_system_message = SystemMessage(
        content=(
            "You are a web scraping agent. Your goal is to extract information from web pages. "
            "You have tools to navigate to URLs, get text from a single element, or get text from multiple elements. "
            "Always start by navigating to the specified URL. "
            "When extracting text, use appropriate CSS selectors. "
            "Provide the extracted information clearly in your final answer."
        )
    )
    prompt = ChatPromptTemplate.from_messages([
        custom_system_message,
        ("user", "{input}"),
        MessagesPlaceholder("agent_scratchpad")
    ])
    agent = create_tool_calling_agent(llm, tools, prompt)
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True
    )
    try:
        # Get the absolute path to your local_test_page.html file
        current_dir = os.path.dirname(os.path.abspath(__file__))
        html_file_path = os.path.join(current_dir, "local_test_page.html")
        # Ensure it's a file:// URL for Playwright
        local_file_url = urlunparse(('file', '', html_file_path, '', '', ''))
        print("\n--- Scraping Local Test Page ---")
        user_input = f"Visit the page at '{local_file_url}' and extract all paragraph texts and the author's name."
        print(f"User: {user_input}")
        response = agent_executor.invoke({"input": user_input})
        print(f"\nAgent's Final Answer:\n{response['output']}")
    finally:
        # Ensure the browser is closed even if an error occurs
        if _browser_context:
            print("\n--- Closing Playwright browser ---")
            _browser_context.close()
            _browser_context = None
            _current_page = None


if __name__ == "__main__":
    run_ui_automation_agent()
To Run:
- Create local_test_page.html in the same directory.
- Save the Python code as ui_automation_agent.py.
- Execute: python ui_automation_agent.py
Expected Output (illustrative; the agent navigates, then extracts data, and the exact trace format depends on the agent type and library version):
--- Running UI Automation Agent (Web Scraper) ---
--- Scraping Local Test Page ---
User: Visit the page at 'file:///your/path/to/local_test_page.html' and extract all paragraph texts and the author's name.
> Entering new AgentExecutor chain...
Thought: The user wants me to visit a local HTML page and extract specific information: all paragraph texts and the author's name.
First, I need to navigate to the specified URL using the `navigate_to_url` tool.
Action: navigate_to_url
Action Input: {"url": "file:///path/to/local_test_page.html"} # Your actual path here
--- Tool Call: navigate_to_url(file:///path/to/local_test_page.html) ---
Observation: Successfully navigated to 'file:///path/to/local_test_page.html'. Page Title: 'My Simple Test Page'
Thought: I have successfully navigated to the page. Now I need to extract all paragraph texts. I can use the `get_element_texts` tool with the 'p' selector.
Action: get_element_texts
Action Input: {"selector": "p"}
--- Tool Call: get_element_texts('p') ---
Observation: ["This is the first paragraph of text. It contains some important information for the agent to find.", "Here is some data point 1.", "And here is another data point 2.", "This is the final paragraph on the page."]
Thought: I have extracted all paragraph texts. Next, I need to extract the author's name. Looking at the HTML, the author's name seems to be within a span with class "info-value". I will use `get_element_text` with this selector.
Action: get_element_text
Action Input: {"selector": ".info-value"}
--- Tool Call: get_element_text('.info-value') ---
Observation: AI Expert
Thought: I have successfully extracted all required information: paragraph texts and the author's name. I can now provide the final answer.
Final Answer:
Extracted Paragraphs:
- This is the first paragraph of text. It contains some important information for the agent to find.
- Here is some data point 1.
- And here is another data point 2.
- This is the final paragraph on the page.
Author: AI Expert
Agent's Final Answer:
Extracted Paragraphs:
- This is the first paragraph of text. It contains some important information for the agent to find.
- Here is some data point 1.
- And here is another data point 2.
- This is the final paragraph on the page.
Author: AI Expert
--- Closing Playwright browser ---
Key Takeaway:
- Browser Automation Libraries: Playwright (or Selenium) is essential for simulating user interaction.
- Custom Tools: Each atomic browser action (navigate, get text, click) is wrapped in a tool.
- State Management (Global Vars): For simple cases, global variables can hold the browser instance. For more robust or multi-step UIs, integrating with LangGraph’s state (passing browser context as part of the state) is highly recommended.
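- CSS Selectors: The agent needs to choose CSS selectors that identify elements on the page. In this example it has no tool for inspecting the page's HTML, so it must infer selectors such as .info-value from the task description; a tool that returns the page source would make this step more reliable.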
- Error Handling: UI automation is brittle; robust error handling in tools is critical.
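The error-handling point above can be sketched as a small decorator that converts exceptions into readable strings, so the agent always receives an observation it can react to. This is a minimal sketch in plain Python; safe_tool and flaky_get_text are hypothetical names introduced for illustration, not part of LangChain or Playwright.

```python
import functools
from typing import Callable


def safe_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Wrap a tool function so failures come back as text instead of exceptions."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> str:
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # Broad on purpose: the agent should always get an observation
            return f"Error in {func.__name__}: {type(exc).__name__}: {exc}"

    return wrapper


@safe_tool
def flaky_get_text(selector: str) -> str:
    """Hypothetical stand-in for a browser call that can fail."""
    if not selector:
        raise ValueError("selector must not be empty")
    return f"text for {selector}"


print(flaky_get_text("p"))
print(flaky_get_text(""))
```

Centralizing the try/except this way keeps each tool body focused on the browser call itself. Whether such a wrapper composes cleanly with the tool decorator depends on the library version, so treat that combination as something to verify rather than assume.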
6.3. Hands-on: Backend Automation & API Integration (Mock CRM Update)
Agents excel at interacting with backend systems via APIs. We’ll simulate a CRM API to update customer statuses.
Scenario: An agent needs to process natural language requests to update customer statuses in a (mock) CRM system via an API tool.
api_integration_agent.py
import os
# import requests  # Only needed once the mock below is replaced by real HTTP calls
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain.tools import tool
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.messages import SystemMessage  # Message classes live in langchain_core.messages
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from pydantic import BaseModel, Field
# Load environment variables
load_dotenv()
# --- Mock CRM API ---
# In a real scenario, this would be an actual external API endpoint.
# Here, we'll use a simple dictionary to simulate a database.
_mock_crm_db = {
"CUST001": {"name": "Alice Smith", "status": "Active", "email": "alice@example.com"},
"CUST002": {"name": "Bob Johnson", "status": "Lead", "email": "bob@example.com"},
"CUST003": {"name": "Charlie Brown", "status": "Inactive", "email": "charlie@example.com"},
}
def _make_mock_api_call(endpoint: str, method: str = "GET", data: dict = None) -> dict:
    """Simulates an API call to a CRM backend."""
    print(f"\n--- MOCK CRM API Call: {method} {endpoint} ---")
    if method == "GET" and endpoint.startswith("/customers/"):
        customer_id = endpoint.split("/")[-1]
        customer_info = _mock_crm_db.get(customer_id)
        if customer_info:
            return {"status": "success", "data": customer_info}
        else:
            return {"status": "error", "reason": "Customer not found"}
    elif method == "PUT" and endpoint.startswith("/customers/"):
        customer_id = endpoint.split("/")[-1]
        if customer_id in _mock_crm_db and data and "status" in data:
            _mock_crm_db[customer_id]["status"] = data["status"]
            return {"status": "success", "message": f"Customer {customer_id} status updated to {data['status']}"}
        else:
            return {"status": "error", "reason": "Invalid customer ID or data"}
    return {"status": "error", "reason": "Invalid mock API endpoint or method"}
# --- Custom Tools for CRM Interaction ---
class GetCustomerInfoInput(BaseModel):
    customer_id: str = Field(description="The unique identifier for the customer (e.g., 'CUST001').")


@tool("get_customer_info", args_schema=GetCustomerInfoInput)
def get_customer_info_tool(customer_id: str) -> str:
    """
    Retrieves detailed information for a specific customer from the CRM.
    Returns the customer record as a string (the repr of a Python dict) or an error message.
    """
    response = _make_mock_api_call(f"/customers/{customer_id}", method="GET")
    if response["status"] == "success":
        return str(response["data"])
    else:
        return f"Error: {response['reason']}"
class UpdateCustomerStatusInput(BaseModel):
    customer_id: str = Field(description="The unique identifier for the customer (e.g., 'CUST001').")
    new_status: str = Field(description="The new status to set for the customer (e.g., 'Active', 'Inactive', 'Premium').")


@tool("update_customer_status", args_schema=UpdateCustomerStatusInput)
def update_customer_status_tool(customer_id: str, new_status: str) -> str:
    """
    Updates the status of a specific customer in the CRM system.
    Returns a success message or an error message.
    """
    response = _make_mock_api_call(f"/customers/{customer_id}", method="PUT", data={"status": new_status})
    if response["status"] == "success":
        return response["message"]
    else:
        return f"Error: {response['reason']}"
def run_api_integration_agent():
    # create_react_agent requires a text prompt containing {tools} and {tool_names}
    # and passes a single string to each tool. update_customer_status takes two
    # arguments and the prompt below uses a MessagesPlaceholder scratchpad,
    # so the tool-calling agent is the matching constructor.
    from langchain.agents import create_tool_calling_agent

    print("--- Running API Integration Agent (Mock CRM) ---")
    print(f"Initial CRM State: {_mock_crm_db}")
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    # llm = ChatOllama(model="llama3", temperature=0)  # For Ollama (the model must support tool calling)
    tools = [get_customer_info_tool, update_customer_status_tool]

    # Each fragment ends with a space so the concatenated sentences do not run together.
    custom_system_message = SystemMessage(
        content=(
            "You are a CRM automation agent. Your task is to manage customer records using the provided tools. "
            "You can 'get_customer_info' to retrieve details or 'update_customer_status' to change a customer's status. "
            "Always confirm the action taken or report any errors clearly. "
            "If asked for customer status, use the get_customer_info tool. "
            "If asked to change a customer's status, use the update_customer_status tool. "
            "Provide concise and direct answers."
        )
    )
    prompt = ChatPromptTemplate.from_messages([
        custom_system_message,
        ("user", "{input}"),
        MessagesPlaceholder("agent_scratchpad")
    ])
    agent = create_tool_calling_agent(llm, tools, prompt)
    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True
    )

    print("\n--- Interaction 1: Get Customer Info ---")
    user_input_1 = "What is the status of customer CUST002?"
    print(f"User: {user_input_1}")
    response_1 = agent_executor.invoke({"input": user_input_1})
    print(f"\nAgent's Final Answer: {response_1['output']}")

    print("\n--- Interaction 2: Update Customer Status ---")
    user_input_2 = "Please change CUST001's status to 'Premium'."
    print(f"User: {user_input_2}")
    response_2 = agent_executor.invoke({"input": user_input_2})
    print(f"\nAgent's Final Answer: {response_2['output']}")
    print(f"\nUpdated CRM State: {_mock_crm_db}")  # Check the mock DB

    print("\n--- Interaction 3: Attempt to Update Non-existent Customer ---")
    user_input_3 = "Set customer CUST999 to 'On Hold'."
    print(f"User: {user_input_3}")
    response_3 = agent_executor.invoke({"input": user_input_3})
    print(f"\nAgent's Final Answer: {response_3['output']}")

    print("\n--- Interaction 4: Verify Updated Status ---")
    user_input_4 = "What is CUST001's current status?"
    print(f"User: {user_input_4}")
    response_4 = agent_executor.invoke({"input": user_input_4})
    print(f"\nAgent's Final Answer: {response_4['output']}")


if __name__ == "__main__":
    run_api_integration_agent()
To Run:
- Save as api_integration_agent.py.
- Execute: python api_integration_agent.py
Expected Output (illustrative, demonstrating API tool calls and state changes; the exact trace format depends on the agent type and library version):
--- Running API Integration Agent (Mock CRM) ---
Initial CRM State: {'CUST001': {'name': 'Alice Smith', 'status': 'Active', 'email': 'alice@example.com'}, 'CUST002': {'name': 'Bob Johnson', 'status': 'Lead', 'email': 'bob@example.com'}, 'CUST003': {'name': 'Charlie Brown', 'status': 'Inactive', 'email': 'charlie@example.com'}}
--- Interaction 1: Get Customer Info ---
User: What is the status of customer CUST002?
> Entering new AgentExecutor chain...
Thought: The user is asking for the status of a specific customer. I should use the `get_customer_info` tool to retrieve this information.
Action: get_customer_info
Action Input: {"customer_id": "CUST002"}
--- MOCK CRM API Call: GET /customers/CUST002 ---
Observation: {'name': 'Bob Johnson', 'status': 'Lead', 'email': 'bob@example.com'}
Thought: I have retrieved the customer information. The status of CUST002 is 'Lead'.
Final Answer: The status of customer CUST002 is 'Lead'.
Agent's Final Answer: The status of customer CUST002 is 'Lead'.
--- Interaction 2: Update Customer Status ---
User: Please change CUST001's status to 'Premium'.
> Entering new AgentExecutor chain...
Thought: The user wants to change the status of customer CUST001 to 'Premium'. I should use the `update_customer_status` tool for this.
Action: update_customer_status
Action Input: {"customer_id": "CUST001", "new_status": "Premium"}
--- MOCK CRM API Call: PUT /customers/CUST001 ---
Observation: Customer CUST001 status updated to Premium
Thought: I have successfully updated the status of customer CUST001.
Final Answer: Customer CUST001's status has been updated to 'Premium'.
Agent's Final Answer: Customer CUST001's status has been updated to 'Premium'.
Updated CRM State: {'CUST001': {'name': 'Alice Smith', 'status': 'Premium', 'email': 'alice@example.com'}, 'CUST002': {'name': 'Bob Johnson', 'status': 'Lead', 'email': 'bob@example.com'}, 'CUST003': {'name': 'Charlie Brown', 'status': 'Inactive', 'email': 'charlie@example.com'}}
--- Interaction 3: Attempt to Update Non-existent Customer ---
User: Set customer CUST999 to 'On Hold'.
> Entering new AgentExecutor chain...
Thought: The user wants to update the status of customer CUST999 to 'On Hold'. I should use the `update_customer_status` tool.
Action: update_customer_status
Action Input: {"customer_id": "CUST999", "new_status": "On Hold"}
--- MOCK CRM API Call: PUT /customers/CUST999 ---
Observation: Error: Invalid customer ID or data
Thought: The tool reported an error that the customer ID is invalid. I need to inform the user that CUST999 does not exist or the data provided was invalid.
Final Answer: I could not update the status for CUST999 because the customer ID is invalid or the data provided was not acceptable. Please ensure the customer ID is correct.
Agent's Final Answer: I could not update the status for CUST999 because the customer ID is invalid or the data provided was not acceptable. Please ensure the customer ID is correct.
--- Interaction 4: Verify Updated Status ---
User: What is CUST001's current status?
> Entering new AgentExecutor chain...
Thought: The user is asking for the current status of customer CUST001. I should use the `get_customer_info` tool to retrieve this information.
Action: get_customer_info
Action Input: {"customer_id": "CUST001"}
--- MOCK CRM API Call: GET /customers/CUST001 ---
Observation: {'name': 'Alice Smith', 'status': 'Premium', 'email': 'alice@example.com'}
Thought: I have retrieved the customer information. The status of CUST001 is 'Premium'.
Final Answer: The current status of customer CUST001 is 'Premium'.
Agent's Final Answer: The current status of customer CUST001 is 'Premium'.
Key Takeaway:
- API Abstraction: Each relevant API operation (get info, update status) is cleanly encapsulated as a LangChain tool.
- Pydantic for Inputs: Crucial for robustly parsing natural language requests into structured API call parameters.
- Error Handling: The mock API returns meaningful errors, which the agent’s LLM interprets and reports back to the user, making the system more user-friendly.
- Stateful Backend: The agent successfully modifies the backend state (our _mock_crm_db dictionary) through its actions, demonstrating real-world impact.
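One gap in the mock is that any string is accepted as a status, and an unknown customer and bad data share one error message. The sketch below shows one way to validate and normalize the status before it reaches the backend, using only the standard library. The allowed status set and the helpers normalize_status and update_status are assumptions made for illustration; they are not part of the script above.

```python
from typing import Optional

# Assumed set of valid statuses; a real CRM would define its own.
ALLOWED_STATUSES = {"active", "inactive", "lead", "premium", "on hold"}


def normalize_status(raw: str) -> Optional[str]:
    """Return the status in title case if it is allowed, otherwise None."""
    # Trim whitespace and stray quotes, collapse inner spaces, compare case-insensitively
    cleaned = " ".join(raw.strip().strip("'\"").split()).lower()
    if cleaned in ALLOWED_STATUSES:
        return cleaned.title()
    return None


def update_status(db: dict, customer_id: str, new_status: str) -> str:
    """Validate both arguments, then update the in-memory record."""
    if customer_id not in db:
        return f"Error: customer {customer_id} not found"
    status = normalize_status(new_status)
    if status is None:
        allowed = ", ".join(sorted(s.title() for s in ALLOWED_STATUSES))
        return f"Error: '{new_status}' is not a valid status. Allowed: {allowed}"
    db[customer_id]["status"] = status
    return f"Customer {customer_id} status updated to {status}"


db = {"CUST001": {"name": "Alice Smith", "status": "Active"}}
print(update_status(db, "CUST001", " 'premium' "))
print(update_status(db, "CUST001", "VIP"))
print(update_status(db, "CUST999", "Active"))
```

Separate messages for a missing customer and an invalid status give the LLM something specific to relay, instead of the combined "Invalid customer ID or data" reason.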
6.4. Hands-on: Building a Comprehensive Research and Report Agent (LangGraph + Tools)
This example combines LangGraph’s orchestration power with multiple tools to build a more sophisticated agent that performs research, summarizes, and formats a report.
Scenario: Create an agent that takes a research topic, performs web searches, extracts content, summarizes it, and then presents a structured report with sources.
langgraph_research_agent.py
import os
from dotenv import load_dotenv
from typing import TypedDict, Annotated, List
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage, ToolMessage
from langchain_openai import ChatOpenAI
# from langchain_community.chat_models import ChatOllama
from langchain_community.tools import DuckDuckGoSearchRun
from langchain.tools import tool
from langgraph.graph import StateGraph, END
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.utils.function_calling import format_tool_to_openai_function
from playwright.sync_api import sync_playwright # For web scraping
# Load environment variables
load_dotenv()
# --- 1. Define the Graph State ---
class ResearchAgentState(TypedDict):
    chat_history: Annotated[List[BaseMessage], lambda x, y: x + y]
    research_query: str
    raw_research_data: Annotated[List[str], lambda x, y: x + y]  # Accumulate raw text from scraping
    final_report: str
    sources: Annotated[List[str], lambda x, y: x + y]  # Track URLs used
# --- 2. Initialize LLM and Tools ---
llm = ChatOpenAI(model="gpt-4o", temperature=0.5) # Use a stronger LLM for research/summarization
# llm = ChatOllama(model="llama3", temperature=0.5) # For Ollama
# Initialize Playwright browser (managed globally for simplicity in this demo, but could be in state)
_pw = None  # Playwright driver handle returned by sync_playwright().start()
_pw_context = None
_pw_page = None


def _initialize_playwright():
    global _pw, _pw_context, _pw_page
    if not _pw_context:
        _pw = sync_playwright().start()
        _pw_context = _pw.chromium.launch()
        _pw_page = _pw_context.new_page()


def _close_playwright():
    global _pw, _pw_context, _pw_page
    if _pw_context:
        _pw_context.close()
        _pw_context = None
        _pw_page = None
    if _pw:
        # stop() belongs to the object returned by start(), not to a fresh sync_playwright() call
        _pw.stop()
        _pw = None
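# Define Tools
# DuckDuckGoSearchRun returns text snippets only; the planner node below needs URLs,
# so this uses DuckDuckGoSearchResults, whose output includes a link for each hit.
from langchain_community.tools import DuckDuckGoSearchResults
search_tool = DuckDuckGoSearchResults(name="web_search")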
@tool
def get_webpage_content(url: str) -> str:
    """Fetches the main text content from a given URL."""
    global _pw_page
    print(f"\n--- Tool Call: get_webpage_content({url}) ---")
    _initialize_playwright()  # Ensure browser is open
    try:
        _pw_page.goto(url, wait_until="domcontentloaded")
        # Take the visible text of <body>; no boilerplate filtering is applied here
        text_content = _pw_page.locator("body").inner_text()
        # Truncate long pages to keep the summarization prompt small
        return text_content[:4000] + "..." if len(text_content) > 4000 else text_content
    except Exception as e:
        return f"Error fetching content from {url}: {e}"


tools = [search_tool, get_webpage_content]
# bind_tools is the current way to attach tool schemas to a chat model
# (bind_functions and format_tool_to_openai_function are deprecated).
# The nodes below call the tools directly, so this binding is only needed
# if you let the LLM choose tools itself.
llm_with_tools = llm.bind_tools(tools)
# --- 3. Define the Nodes for the Graph ---
def research_planner_node(state: ResearchAgentState):
    """LLM node to plan research, execute search, and extract links.

    Returns only the NEW items for the accumulating fields. The reducers declared on
    ResearchAgentState append them, so returning the whole mutated state would
    duplicate earlier entries.
    """
    # Research already gathered on an earlier pass: nothing to add.
    if state.get("raw_research_data"):
        return {}

    new_messages, new_data, new_sources = [], [], []

    # Use the query supplied in the input, or extract it from the latest message
    research_query = state.get("research_query")
    if not research_query:
        query_extraction_prompt = ChatPromptTemplate.from_messages([
            ("system", "You are an expert research planner. Extract the core research query from the user's input."),
            ("user", "{input}")
        ])
        user_text = state["chat_history"][-1].content
        research_query = llm.invoke(query_extraction_prompt.format_messages(input=user_text)).content
        print(f"\n--- Research Planner Node: Extracted Query: {research_query} ---")

    # Ask LLM to plan search
    search_plan_prompt = ChatPromptTemplate.from_messages([
        ("system", "You are an expert research planner. Based on the query: '{query}', generate a concise web search query to find relevant information. Prioritize getting URLs."),
        ("user", "Generate a web search query for: {query}")
    ])
    search_query = llm.invoke(search_plan_prompt.format_messages(query=research_query)).content.strip().strip('"')

    # Execute search
    search_results = search_tool.invoke({"query": search_query})
    new_messages.append(AIMessage(content=f"Search for '{search_query}' results:\n{search_results}"))
    new_sources.append(f"Search query: {search_query}")
    print("\n--- Research Planner Node: Performed Search ---")

    # LLM to extract URLs
    url_extraction_prompt = ChatPromptTemplate.from_messages([
        ("system", "From the following search results, identify and list up to 3 distinct relevant URLs that seem to contain detailed information. Output ONLY the URLs, one per line."),
        ("user", "Search Results:\n{results}")
    ])
    urls_str = llm.invoke(url_extraction_prompt.format_messages(results=search_results)).content
    urls = [url.strip() for url in urls_str.split('\n') if url.strip().startswith('http')]
    new_messages.append(AIMessage(content=f"Identified URLs: {urls}"))
    print(f" Identified URLs: {urls}")

    # Fetch content for each URL
    for url in urls:
        content = get_webpage_content.invoke({"url": url})
        new_data.append(f"--- CONTENT FROM {url} ---\n{content}\n--- END CONTENT ---\n")
        new_sources.append(url)
        new_messages.append(AIMessage(content=f"Fetched content from {url}"))
        print(f" Fetched content from {url}")

    return {
        "research_query": research_query,
        "chat_history": new_messages,
        "raw_research_data": new_data,
        "sources": new_sources,
    }
def summarizer_node(state: ResearchAgentState):
    """LLM node to summarize the raw research data."""
    raw_data = "\n\n".join(state["raw_research_data"])
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are an expert summarizer. Condense the following research into a comprehensive and objective summary. Highlight key findings and important details. The summary should be at least 300 words."),
        ("user", "Summarize this research:\n{raw_data}")
    ])
    summary = llm.invoke(prompt.format_messages(raw_data=raw_data)).content
    print("\n--- Summarizer Node: Generated Summary ---")
    # Return only the changed fields; the chat_history reducer appends the new message
    return {
        "final_report": summary,
        "chat_history": [AIMessage(content="Generated Summary.")],
    }
def reporter_node(state: ResearchAgentState):
    """LLM node to format the final report with sources."""
    report_prompt = ChatPromptTemplate.from_messages([
        ("system", (
            "You are an expert technical writer. Create a well-structured final report in Markdown format. "
            "Include a clear title, an introduction, the main summary, and a 'Sources' section with linked URLs. "
            "The report should be professional and easy to read. "
            "Use the provided summary and sources."
        )),
        ("user", "Research topic: {topic}\nSummary:\n{summary}\n\nSources:\n{sources}")
    ])
    # dict.fromkeys removes duplicates while keeping the original order (a set would shuffle it)
    unique_sources = list(dict.fromkeys(state["sources"]))
    formatted_sources = "\n".join([f"- [{s}]({s})" if s.startswith('http') else f"- {s}" for s in unique_sources])
    final_report = llm.invoke(report_prompt.format_messages(
        topic=state["research_query"],
        summary=state["final_report"],
        sources=formatted_sources
    )).content
    print("\n--- Reporter Node: Generated Final Report ---")
    # Return only the changed fields; the chat_history reducer appends the new message
    return {
        "final_report": final_report,
        "chat_history": [AIMessage(content="Generated Final Report.")],
    }
# --- 4. Define Conditional Edge Logic (simple for this flow) ---
# Note: the graph below uses fixed edges, so these two functions are not wired in.
# They are hooks for conditional edges if you later add research or refinement loops.
def should_continue_research(state: ResearchAgentState) -> str:
    """A placeholder for more complex research loops."""
    # For this example, we just go research -> summarize -> report
    return "summarize"  # Always move to summarize after initial research


def should_continue_summarize(state: ResearchAgentState) -> str:
    """A placeholder for more complex summary refinement."""
    return "report"
def run_langgraph_research_agent():
    print("--- Running LangGraph Comprehensive Research Agent ---")
    # Clean up playwright resources on exit
    import atexit
    atexit.register(_close_playwright)

    # 5. Build the LangGraph Workflow
    workflow = StateGraph(ResearchAgentState)
    workflow.add_node("research_planner", research_planner_node)
    workflow.add_node("summarizer", summarizer_node)
    workflow.add_node("reporter", reporter_node)
    workflow.set_entry_point("research_planner")
    workflow.add_edge("research_planner", "summarizer")
    workflow.add_edge("summarizer", "reporter")
    workflow.add_edge("reporter", END)
    app = workflow.compile()

    # 6. Invoke the Graph
    print("\n\n=== Initiating Research Task ===")
    research_topic = "the impact of generative AI on creative industries"
    print(f"User: Please research: '{research_topic}' and provide a summary report with sources.")
    # research_query starts empty so the planner node extracts it from the message
    inputs = {"chat_history": [HumanMessage(content=research_topic)], "research_query": "", "raw_research_data": [], "sources": []}

    # Stream once: print each node's update and keep the latest full state.
    # Calling app.invoke(inputs) afterwards would run the whole graph (and every LLM call) a second time.
    final_state = None
    for mode, chunk in app.stream(inputs, stream_mode=["updates", "values"]):
        if mode == "updates":
            print(chunk)
        else:
            final_state = chunk

    print("\n### FINAL RESEARCH REPORT ###")
    print(final_state["final_report"])


if __name__ == "__main__":
    run_langgraph_research_agent()
To Run:
- Save langgraph_research_agent.py.
- Ensure playwright is installed (pip install playwright and playwright install).
- Execute: python langgraph_research_agent.py
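Expected Output (illustrative and abridged; it will be lengthy, showing steps from search to final report, and the actual content depends on live search results and library versions):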
--- Running LangGraph Comprehensive Research Agent ---
=== Initiating Research Task ===
User: Please research: 'the impact of generative AI on creative industries' and provide a summary report with sources.
--- Research Planner Node: Extracted Query: the impact of generative AI on creative industries ---
--- Research Planner Node: Performed Search ---
Identified URLs: [...] # List of URLs found by LLM from search results
Fetched content from https://www.example.com/article1
Fetched content from https://www.example.com/article2
Fetched content from https://www.example.com/article3
{'research_planner': {'chat_history': [...], 'raw_research_data': [...], 'sources': [...]}}
--- Summarizer Node: Generated Summary ---
{'summarizer': {'chat_history': [...], 'final_report': 'The impact of generative AI on creative industries is multifaceted...', 'raw_research_data': [...], 'sources': [...]}}
--- Reporter Node: Generated Final Report ---
{'reporter': {'chat_history': [...], 'final_report': '# The Impact of Generative AI on Creative Industries\n\n## Introduction\n...', 'raw_research_data': [...], 'sources': [...]}}
{'__end__': {'chat_history': [...], 'final_report': '# The Impact of Generative AI on Creative Industries\n\n## Introduction\n...', 'raw_research_data': [...], 'research_query': 'the impact of generative AI on creative industries', 'sources': [...]}}
### FINAL RESEARCH REPORT ###
# The Impact of Generative AI on Creative Industries
## Introduction
Generative AI represents a transformative shift across various sectors, but its implications for creative industries are particularly profound. From music composition and visual art to writing and design, these AI models are challenging traditional creative processes, introducing new tools for artists, and raising complex questions about authorship, intellectual property, and the future of human creativity. This report explores the multifaceted impact of generative AI on these industries, examining both the opportunities it presents and the significant challenges it introduces.
## Main Summary
Generative AI, powered by large language models (LLMs) and diffusion models, can produce novel content in various modalities. In visual arts, tools like Midjourney and DALL-E allow users to create stunning images from text prompts, enabling rapid prototyping and exploration of styles for designers and artists. For writers, AI assistants can generate drafts, assist with brainstorming, and even produce entire articles, while musicians are leveraging AI to compose melodies, harmonies, and even full tracks, breaking down creative barriers and speeding up production cycles. The film industry is also seeing AI assist with scriptwriting, storyboarding, and special effects generation.
**Opportunities:**
1. **Democratization of Creation:** AI tools lower the barrier to entry for creative work, allowing individuals without specialized skills to produce high-quality content.
2. **Enhanced Productivity & Efficiency:** Artists and designers can automate repetitive tasks, generate variations quickly, and focus more on conceptual work.
3. **New Forms of Art & Expression:** Generative AI can create entirely new aesthetics and styles, pushing the boundaries of what's possible.
4. **Personalization:** AI can tailor content to individual preferences, enhancing user engagement in media and entertainment.
**Challenges:**
1. **Ethical and Legal Concerns:** Issues of copyright, intellectual property, and deepfakes are prominent. Who owns AI-generated art, especially if trained on existing copyrighted works?
2. **Job Displacement:** Fears persist that AI could automate jobs traditionally held by human creatives, from illustrators to copywriters.
3. **Authenticity and Value:** Questions arise about the originality and artistic merit of AI-generated content, and whether it devalues human creative effort.
4. **Bias and Reproducibility:** AI models can perpetuate and amplify biases present in their training data, leading to problematic or unrepresentative outputs.
Overall, generative AI is not merely a tool but a co-creator and disruptor in creative industries. Its full impact is still unfolding, requiring a careful balance between leveraging its potential and addressing its inherent risks.
## Sources
- [Link to an example article about AI in creative industries]
- [Link to another example article or research paper]
- [Link to a blog post or news source]
- Search query: the impact of generative AI on creative industries
Key Takeaway:
- Modular Nodes: The research agent is broken into clear, distinct nodes: research_planner, summarizer, reporter.
- Playwright Integration: The get_webpage_content tool demonstrates real-world web interaction for data gathering.
- State Accumulation: raw_research_data and sources use Annotated lists to accumulate information throughout the graph’s execution.
- Multi-step Reasoning: The agent performs several steps: planning a search, executing it, extracting URLs, fetching content from those URLs, summarizing the content, and then formatting it into a final report.
- Robustness: Incorporate error handling within tools and ensure _close_playwright() is called for cleanup.
- Flexibility: This graph can be extended with more complex nodes for data analysis, sentiment analysis, peer review (another LLM), or deeper content extraction.
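To make the State Accumulation point concrete, here is a minimal, framework-free sketch of how a reducer attached through Annotated can append to a list field instead of overwriting it. The field names raw_research_data and sources come from the example above; the query field, the apply_update helper, and the example.com URLs are assumptions for illustration. The merge logic only imitates the accumulation behavior described and is not LangGraph's own implementation.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class ResearchState(TypedDict):
    # raw_research_data and sources are named in the text; exact types are assumed.
    query: str
    raw_research_data: Annotated[list[str], operator.add]
    sources: Annotated[list[str], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state (hypothetical helper)."""
    hints = get_type_hints(ResearchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types expose their extra arguments as __metadata__.
        reducers = getattr(hints.get(key), "__metadata__", ())
        if reducers and key in merged:
            merged[key] = reducers[0](merged[key], value)  # list + list
        else:
            merged[key] = value  # plain fields are simply overwritten
    return merged


state = {"query": "generative AI", "raw_research_data": [], "sources": []}
state = apply_update(
    state,
    {"raw_research_data": ["page one text"], "sources": ["https://example.com/a"]},
)
state = apply_update(
    state,
    {"raw_research_data": ["page two text"], "sources": ["https://example.com/b"]},
)
print(state["sources"])
```

In a real graph the framework applies the reducer whenever a node returns a partial update; the helper exists here only to show the effect of accumulating rather than replacing.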
7. Best Practices and Future Directions
Building advanced agentic AI systems is an art and a science. Adhering to best practices helps ensure your agents are not only intelligent but also robust, maintainable, and ethically sound.
7.1. Key Agent Design Principles
- Single Responsibility Principle (for Tools & Agents):
  - Tools: Each tool should do one thing well (e.g., search_web, read_file, send_email). This makes them easier to debug and easier for the LLM to understand.
  - Agents (in Multi-Agent Systems): Assign clear, focused roles (e.g., “Data Analyst,” “Copywriter,” “Strategist”). Avoid creating monolithic agents that try to do everything.
- Explicit Goals & Constraints: Clearly define what the agent should achieve and any boundaries or rules it must follow. Embed these directly into system prompts.
- Modular & Composable Components: Leverage frameworks like LangChain/LangGraph for their modularity. This allows you to easily swap out LLMs, memory types, or tools.
- Observability First: Design your agents for debugging. Use verbose=True, structured logging, and tracing tools (like LangSmith) to understand the agent’s thought process, tool calls, and state changes. This is critical when agents don’t behave as expected.
- Progressive Enhancement: Start with the simplest possible agent that achieves a core function. Gradually add complexity (more tools, memory, multi-agent coordination, advanced logic) in iterative steps, testing at each stage.
- Defense in Depth: Implement error handling at multiple levels: within tools, in agent parsing (handle_parsing_errors=True), and within LangGraph’s routing logic.
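As a rough illustration of the tool-level layer of these principles, the sketch below wraps small single-purpose tools in a decorator that logs each call and turns exceptions into text the LLM can react to. It uses only the standard library. The safe_tool decorator and the word_count tool are hypothetical names introduced here; read_file is borrowed from the examples above, with an assumed implementation.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent.tools")


def safe_tool(func):
    """Wrap a tool so failures come back as text instead of crashing the agent."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Structured-ish log line: one record per tool call.
        logger.info("tool=%s args=%s kwargs=%s", func.__name__, args, kwargs)
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # tool-level line of defense
            logger.warning("tool=%s failed: %s", func.__name__, exc)
            return f"ERROR in {func.__name__}: {type(exc).__name__}: {exc}"
    return wrapper


@safe_tool
def read_file(path: str) -> str:
    """Read a UTF-8 text file and return its contents."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


@safe_tool
def word_count(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())


print(read_file("/no/such/file/for/this/demo.txt"))
print(word_count("one thing done well"))
```

Returning the error as a string is one possible design choice: it lets the agent see what went wrong and try another approach, while the parsing and routing layers remain responsible for failures the tool cannot anticipate.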
7.2. Effective Prompt Engineering for Agents
The prompt is the agent’s operating manual. Its quality directly correlates with your agent’s performance.
- Clarity and Conciseness: Be unambiguous. Avoid jargon unless the agent’s persona is meant to understand it.
- Detailed Tool Descriptions: Provide extremely clear, function-like descriptions for each tool, including its purpose, arguments, and what it returns. The LLM relies solely on these descriptions to decide when and how to call a tool.
- ReAct Structure Enforcement: For ReAct-based agents, explicitly instruct the LLM on the expected “Thought,” “Action,” “Action Input,” and “Observation” format. Ensure the prompt also reserves a slot for the agent’s scratchpad: MessagesPlaceholder("agent_scratchpad") for chat-style tool-calling agents, or an {agent_scratchpad} variable in a text-based ReAct prompt.
- Role-Playing and Persona: Define a clear persona (e.g., “You are a senior financial analyst…”). This primes the LLM for specific language and decision-making styles.
- Output Format Specification: If you need specific output (e.g., JSON, Markdown, a specific list format), explicitly state it in the prompt and provide examples if necessary.
- Negative Constraints/Guardrails: Tell the agent what not to do (e.g., “Do not make assumptions,” “Do not guess,” “Do not proceed without human approval if the action is critical”).
- Few-Shot Examples (for tricky behaviors): For particularly nuanced or tricky decision points, providing 1-3 well-crafted examples of an ideal interaction (input, thought process, actions, output) can significantly guide the LLM.
- Iterative Refinement: Prompt engineering is an iterative process. Test your agent with various inputs, analyze its verbose output, and refine your prompts based on observed behavior.
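Several of these guidelines can be combined when assembling a system prompt. The sketch below is one possible way to do it with plain Python: it derives function-like tool descriptions from signatures and docstrings, then adds a persona, an output format, and negative constraints. The helpers describe_tool and build_system_prompt are hypothetical, and the search_web body is a placeholder because only its metadata is used.

```python
import inspect


def describe_tool(func) -> str:
    """Render a function-like description from a tool's signature and docstring."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "No description provided."
    return f"- {func.__name__}{signature}: {doc}"


def build_system_prompt(persona, tools, output_format, guardrails) -> str:
    """Join persona, tool descriptions, output format, and rules into one prompt."""
    sections = [
        persona,
        "Available tools:\n" + "\n".join(describe_tool(t) for t in tools),
        "Output format:\n" + output_format,
        "Rules:\n" + "\n".join(f"- {rule}" for rule in guardrails),
    ]
    return "\n\n".join(sections)


def search_web(query: str) -> str:
    """Search the web and return the top results as plain text."""
    raise NotImplementedError  # placeholder; never called in this sketch


prompt = build_system_prompt(
    persona="You are a senior financial analyst.",
    tools=[search_web],
    output_format="Respond with a JSON object containing 'answer' and 'sources'.",
    guardrails=[
        "Do not guess.",
        "Do not proceed without human approval if the action is critical.",
    ],
)
print(prompt)
```

Keeping each section in its own variable makes iterative refinement easier, since a single rule or tool description can be changed and the agent retested without touching the rest of the prompt.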
7.3. Evaluating and Testing Agent Systems
Evaluating agents is more complex than traditional software testing because agent behavior is non-deterministic.
- Unit Tests for Tools: Essential. Ensure each individual tool works correctly in isolation and handles expected inputs, edge cases, and errors.
- End-to-End Functional Tests: Define clear, specific user queries/tasks and verify that the agent produces the desired final output.
- Scenario-Based Testing: Develop a diverse set of real-world scenarios that cover various complexities (e.g., questions requiring multiple tool calls, memory recall, error conditions, missing information).
- Golden Datasets: Create a set of input-output pairs where you know the ideal agent behavior (e.g., expected final answer, sequence of tool calls).
- Human-in-the-Loop Evaluation: For subjective tasks or high-stakes decisions, human review of agent outputs and thought processes is invaluable.
- Tracing and Debugging Tools: Use tools like LangSmith to capture full traces of agent execution, allowing you to visually inspect the LLM’s thoughts, tool calls, and intermediate states. This is paramount for understanding why an agent made a particular decision.
- Performance Metrics: Monitor latency, cost (LLM tokens, API calls), and reliability (success rate, error rate).
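A golden dataset and basic performance metrics can be combined in a small evaluation harness. The sketch below is an illustration under stated assumptions: stub_agent stands in for a real agent and returns a fixed dictionary shape ("answer" and "tool_calls") invented for this example, and the GoldenCase and evaluate names are likewise hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass
class GoldenCase:
    query: str
    expected_answer: str
    expected_tools: list


def stub_agent(query: str) -> dict:
    """Stand-in for a real agent; returns an answer and the tools it called."""
    if "2 + 2" in query:
        return {"answer": "The result is 4", "tool_calls": ["calculator"]}
    return {"answer": "I don't know", "tool_calls": []}


def evaluate(agent, cases) -> dict:
    """Run every golden case and report success rate and average latency."""
    passed, latencies = 0, []
    for case in cases:
        start = time.perf_counter()
        try:
            result = agent(case.query)
        except Exception:
            result = {"answer": "", "tool_calls": []}  # a crash counts as a failure
        latencies.append(time.perf_counter() - start)
        # Substring matching tolerates wording differences between runs.
        answer_ok = case.expected_answer.lower() in result["answer"].lower()
        tools_ok = result["tool_calls"] == case.expected_tools
        passed += int(answer_ok and tools_ok)
    return {
        "success_rate": passed / len(cases),
        "avg_latency_s": sum(latencies) / len(latencies),
    }


cases = [
    GoldenCase("What is 2 + 2?", "4", ["calculator"]),
    GoldenCase("What is the capital of France?", "Paris", ["search_web"]),
]
report = evaluate(stub_agent, cases)
print(report)
```

Because agent output varies from run to run, exact string equality is usually too strict; substring checks, as used here, or an LLM-based grader are common alternatives, and running each case several times gives a more stable success rate.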
7.4. Ethical Considerations and Responsible AI
As agents gain more autonomy, ethical design is paramount.
- Transparency: Strive for transparency in agent operation. If an agent uses a tool, makes a decision, or retrieves information, consider making that clear to the user.
- Bias Mitigation: Be aware that LLMs can reflect biases from their training data. Test your agents rigorously for biased behavior, especially in decision-making tools (e.g., loan approvals, hiring assistance).
- Safety and Guardrails: Implement strong safety measures:
  - Tool Access Control: Limit agents to only the tools they absolutely need.
  - Input/Output Validation: Sanitize user inputs and validate agent outputs before action.
  - Human Oversight: For high-impact or sensitive actions, always include a human-in-the-loop for approval (as demonstrated with LangGraph).
  - “Do Not Harm” Principles: Embed clear instructions in system prompts to avoid harmful, unethical, or illegal actions.
- Data Privacy and Security: If agents handle sensitive information, ensure strict adherence to data privacy regulations (GDPR, CCPA, HIPAA) and implement robust security practices. Running models locally (e.g., with Ollama) keeps prompts and data on your own hardware, which can significantly enhance privacy.
- Accountability: Establish clear lines of accountability for agent actions. Who is responsible if an autonomous agent makes a mistake or causes harm?
- Environmental Impact: Be mindful of the computational resources consumed by LLMs. Optimize prompts, use smaller models when possible, and consider local LLMs.
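The safety measures listed under Safety and Guardrails can be layered in front of every tool call. The following sketch shows one way to do that in plain Python, assuming an allowlist, a simple argument check, and an approval callback. All names (ALLOWED_TOOLS, REQUIRES_APPROVAL, ToolCallRejected, guarded_call) are hypothetical, the size limit is arbitrary, and send_email is a placeholder that sends nothing.

```python
ALLOWED_TOOLS = {"search_web", "read_file", "send_email"}
REQUIRES_APPROVAL = {"send_email"}  # assumed to be the high-impact action
MAX_ARG_LENGTH = 10_000  # arbitrary limit chosen for this sketch


class ToolCallRejected(Exception):
    """Raised when a tool call fails access control, validation, or approval."""


def guarded_call(tool_name, tool_func, args, approve):
    """Run a tool only if it passes access control, validation, and approval."""
    if tool_name not in ALLOWED_TOOLS:
        raise ToolCallRejected(f"{tool_name} is not on the allowlist")
    if not all(isinstance(v, str) and len(v) <= MAX_ARG_LENGTH for v in args.values()):
        raise ToolCallRejected("arguments must be strings within the size limit")
    if tool_name in REQUIRES_APPROVAL and not approve(tool_name, args):
        raise ToolCallRejected(f"{tool_name} was not approved by a human")
    return tool_func(**args)


def send_email(to: str, body: str) -> str:
    """Placeholder tool: reports what it would do without sending anything."""
    return f"sent to {to}"


def approve_all(tool_name, args) -> bool:
    """Test double for a human reviewer; a real one would prompt a person."""
    return True


outcome = guarded_call(
    "send_email",
    send_email,
    {"to": "reviewer@example.com", "body": "Quarterly report attached."},
    approve_all,
)
print(outcome)
```

Passing the approval step in as a callback keeps the guard testable: a console prompt, a web form, or a graph interrupt can be substituted for approve_all without changing the checks themselves.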
7.5. The Evolving Landscape of Agentic AI
The field is dynamic and rapidly advancing. Key trends to watch include:
- Enhanced Autonomy & Generalization: Agents becoming more capable of handling entirely novel tasks and adapting to new environments with less human programming.
- Sophisticated Planning & Reasoning: Improvements in symbolic reasoning, hierarchical planning, and integration of reinforcement learning for dynamic policy generation.
- Advanced Memory Architectures: Beyond simple chat history, expect more intelligent long-term memory (e.g., self-updating knowledge graphs, better semantic retrieval).
- Self-Reflective and Self-Correcting Agents: Agents that can not only correct parsing errors but also critically evaluate their own plans, tool usage, and outputs.
- Dynamic Tool Creation & Discovery: Agents that can, to some extent, understand requirements for a task and dynamically find or even generate simple tools (e.g., writing a small Python script) to accomplish it.
- Seamless Human-Agent Teaming: More intuitive interfaces and methods for humans to seamlessly collaborate with, instruct, and oversee complex agent teams.
- Formal Verification and Explainability: Efforts to make agent decisions more auditable, understandable, and verifiable.
- Embodied AI: Integration of agents with robotics and physical systems for direct interaction with the physical world, leading to a new era of autonomous robots.
By thoroughly understanding and practically applying the concepts and frameworks discussed in this document, you are well-equipped to navigate this exciting future and build the next generation of intelligent, impactful AI systems. The journey is just beginning!