+++
title = "Working with Resources and Tracers"
date = 2025-11-06T22:00:00+05:30
draft = false
description = "Understand how to manage dynamic configurations and capture detailed agent interactions using AgentResource and LitTracer in Agentic Lightening. This chapter covers versioning, distribution of resources, and the importance of tracing for data-driven agent optimization and debugging."
slug = "working-with-resources-and-tracers"
keywords = ["Agentic Lightening Resources", "LitTracer", "Agent Resource Management", "Trace Collection", "Dynamic Configuration", "AI Agent Debugging", "Agentic AI Data"]
tags = ["AI", "Machine Learning", "Agentic AI", "Resources", "Tracing", "Configuration"]
categories = ["Artificial Intelligence"]
author = "AI Expert"
showReadingTime = true
showTableOfContents = true
showComments = false
weight = 6
+++

# Working with Resources and Tracers

As you build and optimize more complex AI agents, you’ll encounter two critical challenges: managing dynamic configurations that change during training, and capturing rich, detailed data about your agent’s internal workings. Agentic Lightening addresses these with AgentResource for dynamic configurations and LitTracer for detailed interaction tracing.

This chapter will dive into both of these concepts, showing you how to leverage them for more effective agent development, debugging, and optimization.

## 1. AgentResource: Dynamic Configuration for Your Agents

In many AI agent applications, certain parameters aren’t static. These might include:

- **Prompt templates:** The system message or few-shot examples that guide an LLM.
- **Tool definitions:** The precise schema or description of tools available to the agent.
- **Model configurations:** Which specific LLM to use, its temperature, or other hyperparameters.
- **External API keys or endpoints:** Dynamically switching between development and production environments.
- **Agent personas or instructions:** The core identity and constraints of an agent.

Manually managing these variations during an iterative training process is cumbersome and error-prone. This is where AgentResource comes in.

### What is an AgentResource?

An AgentResource is a versionable, dynamically distributable piece of data that your LitAgent can consume. It’s stored in the LightningStore and managed by the AgentLightningServer. The Trainer can update these resources based on the optimization algorithm’s findings (e.g., finding a better prompt) and push them to agents for subsequent rollouts.

**Key Attributes of `AgentResource`:**

- `name`: A unique identifier for the resource (e.g., `"system_prompt_v2"`).
- `value`: The actual data of the resource (any serializable Python object: a string, dictionary, list, etc.).
- `type`: (Optional) A string indicating the kind of resource, useful for categorization (e.g., `"prompt_template"`, `"tool_definition"`).
- `version`: (Managed by the system) The version identifier. The AgentLightningServer ensures agents always get the latest version unless an older one is explicitly requested.
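
To make this concrete, here is a minimal sketch of constructing a resource by hand. It assumes `AgentResource` is importable from `agentlightning.types`, as in the examples below; the `type` label is a free-form string of your choosing:

```python
from agentlightning.types import AgentResource

# A prompt-template resource. `version` is assigned by the LightningStore
# when the resource is registered, so we only supply name, value, and type.
prompt_resource = AgentResource(
    name="system_prompt_v2",
    value="You are a concise, helpful assistant.",
    type="prompt_template",  # optional free-form category label
)
```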

### How Agents Use Resources

In your `LitAgent`’s `training_rollout` method, you receive a `resources: dict[str, AgentResource]` argument. You can read these resources to dynamically configure your agent.

```python
# agent_with_resources.py
import asyncio
from agentlightning.litagent import LitAgent
from agentlightning.types import AgentLightningTask, AgentResource

# Mock LLM for deterministic, offline-friendly output
async def mock_llm_respond(prompt_template: str, user_input: str) -> str:
    return f"Response using '{prompt_template}': {user_input.upper()}"

class ResourceAwareAgent(LitAgent):
    """
    An agent that uses dynamic resources like prompt templates.
    """
    async def training_rollout(
        self,
        task: AgentLightningTask,
        rollout_id: str,
        resources: dict[str, AgentResource],
    ) -> float:
        print(f"[{rollout_id}] Agent received task: {task.name} - '{task.context}'")

        # Get the current system prompt from resources, or fall back to a default
        current_prompt_template = "Generic response template."
        if "system_prompt" in resources:
            current_prompt_template = resources["system_prompt"].value
            print(f"[{rollout_id}] Using dynamic prompt: '{current_prompt_template}'")

        # Agent uses the current prompt template
        agent_output = await mock_llm_respond(current_prompt_template, task.context)
        print(f"[{rollout_id}] Agent output: {agent_output}")

        # Reward logic: 1.0 only when the default template was used (so the
        # output contains "GENERIC"), 0.0 otherwise
        reward = 1.0 if (
            "GENERIC" in agent_output.upper()
            and "Generic response template" in current_prompt_template
        ) else 0.0
        print(f"[{rollout_id}] Reward: {reward:.2f}")
        return reward

async def main_resource_demo():
    from agentlightning.trainer import Trainer
    trainer = Trainer(n_workers=1)
    agent = ResourceAwareAgent()

    # --- Initial run with default prompt ---
    task1 = AgentLightningTask(name="Default Prompt Task", context="Hello world")
    print("\n--- Running with default prompt ---")
    rollout_result1 = await trainer.dev(agent=agent, task=task1, resources={})
    print(f"Default Prompt Task Reward: {rollout_result1.final_reward:.2f}\n")

    # --- Run with an updated prompt via resources ---
    updated_prompt_resource = AgentResource(
        name="system_prompt",
        value="You are a polite assistant. Respond politely."
    )
    task2 = AgentLightningTask(name="Polite Prompt Task", context="How are you?")
    print("\n--- Running with 'Polite' prompt resource ---")
    rollout_result2 = await trainer.dev(agent=agent, task=task2, resources={"system_prompt": updated_prompt_resource})
    print(f"Polite Prompt Task Reward: {rollout_result2.final_reward:.2f}\n")

    # --- Run with a different prompt ---
    creative_prompt_resource = AgentResource(
        name="system_prompt",
        value="You are a creative storyteller. Be imaginative."
    )
    task3 = AgentLightningTask(name="Creative Prompt Task", context="Tell a short tale.")
    print("\n--- Running with 'Creative' prompt resource ---")
    rollout_result3 = await trainer.dev(agent=agent, task=task3, resources={"system_prompt": creative_prompt_resource})
    print(f"Creative Prompt Task Reward: {rollout_result3.final_reward:.2f}\n")

if __name__ == "__main__":
    asyncio.run(main_resource_demo())
```

To run this example:

1. Save the code as `agent_with_resources.py`.
2. Run: `python agent_with_resources.py`

Expected Output (similar to):

```text
--- Running with default prompt ---
[iter_0_task_0] Agent received task: Default Prompt Task - 'Hello world'
[iter_0_task_0] Agent output: Response using 'Generic response template.': HELLO WORLD
[iter_0_task_0] Reward: 1.00

Default Prompt Task Reward: 1.00

--- Running with 'Polite' prompt resource ---
[iter_0_task_0] Agent received task: Polite Prompt Task - 'How are you?'
[iter_0_task_0] Using dynamic prompt: 'You are a polite assistant. Respond politely.'
[iter_0_task_0] Agent output: Response using 'You are a polite assistant. Respond politely.': HOW ARE YOU?
[iter_0_task_0] Reward: 0.00

Polite Prompt Task Reward: 0.00

--- Running with 'Creative' prompt resource ---
[iter_0_task_0] Agent received task: Creative Prompt Task - 'Tell a short tale.'
[iter_0_task_0] Using dynamic prompt: 'You are a creative storyteller. Be imaginative.'
[iter_0_task_0] Agent output: Response using 'You are a creative storyteller. Be imaginative.': TELL A SHORT TALE.
[iter_0_task_0] Reward: 0.00

Creative Prompt Task Reward: 0.00
```

### Exercise 1: Optimizing Tool Definitions

Imagine an agent using a “search tool.”

1. Define an `AgentResource` that contains a dictionary describing the search tool’s schema (e.g., `{"name": "search", "description": "Searches the internet", "parameters": {"query": {"type": "string"}}}`).
2. Modify `ResourceAwareAgent` to:
   - Retrieve this `tool_schema` resource.
   - (Conceptually) Configure a mock tool or an actual LangChain `Tool` object using this schema.
   - Have the agent “use” this tool (e.g., by calling `mock_search_tool(task.context)`).
3. Base the reward on whether the agent successfully uses the tool and whether the `tool_schema` resource was present (a starter sketch follows this list).
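
If you want a starting point, here is a minimal sketch. `mock_search_tool` and the exact reward rule are illustrative assumptions for the exercise, not part of the framework:

```python
from agentlightning.types import AgentResource

# Hypothetical tool-schema resource for the exercise.
tool_schema_resource = AgentResource(
    name="tool_schema",
    value={
        "name": "search",
        "description": "Searches the internet",
        "parameters": {"query": {"type": "string"}},
    },
    type="tool_definition",
)

async def mock_search_tool(query: str) -> str:
    # Stand-in for a real search tool configured from the schema above.
    return f"Search results for: {query}"

# Inside training_rollout, something like:
#     schema = resources.get("tool_schema")
#     if schema is not None:
#         tool_output = await mock_search_tool(task.context)
#         return 1.0 if tool_output else 0.0
#     return 0.0
```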

## 2. LitTracer: Capturing Detailed Interaction Data

While training_rollout returns a single reward, real agent debugging and advanced RL often require a much richer log of what happened during the rollout. This is where LitTracer becomes invaluable.

### What is a LitTracer?

A LitTracer is an optional component that records granular events and data points during an agent’s execution. It acts as an observation system, capturing the flow of information, decisions, and outcomes, which are then stored as part of the LitRollout’s traces.

### Why Use Tracers?

- **Debugging:** Understand why an agent made a particular decision or failed a task.
- **Offline analysis:** Analyze agent behavior patterns across many rollouts.
- **Advanced RL:** Some RL algorithms require more than the final reward; they need a sequence of (state, action, reward, next_state) transitions. Tracers are perfect for collecting this data.
- **SFT data generation:** Traces can include LLM prompts, responses, and intermediate tool outputs, which are valuable for creating fine-tuning datasets (see the sketch below).
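
As a small example of the last point, here is a sketch that mines dict-shaped trace entries (the format shown later in this chapter, with `type` and `data` keys) into SFT-style input/output pairs. Adapt the keys to whatever your tracer actually emits:

```python
# Turn method-call traces into (input, output) records for fine-tuning.
def extract_sft_pairs(traces: list[dict]) -> list[dict]:
    pairs = []
    for entry in traces:
        if entry.get("type") == "method_call":
            data = entry.get("data", {})
            pairs.append({"input": data.get("args"), "output": data.get("output")})
    return pairs
```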

### How to Use LitTracer

You typically record traces either by decorating methods of your `LitAgent` or by logging events explicitly:

1. **Directly logging with `self.trace()`:** Your `LitAgent` exposes a `self.trace()` method that lets you log any serializable data as a custom event.
2. **Using the `@trace_step` decorator:** This decorator automatically logs the inputs and outputs of a method, treating each call as a “step” in the agent’s execution.

### Code Example: Agent with Tracing

```python
# agent_with_tracer.py
import asyncio
import re
import time
from agentlightning.litagent import LitAgent
from agentlightning.types import AgentLightningTask, AgentResource
from agentlightning.utils.tracing import trace_step  # Import the decorator

# Mock LLM and Tool
async def mock_llm_chain_step(input_text: str) -> str:
    await asyncio.sleep(0.05)  # Simulate LLM latency
    return f"LLM processed: {input_text}"

async def mock_tool_lookup(query: str) -> str:
    await asyncio.sleep(0.1)  # Simulate tool latency
    if "data about" in query.lower():
        return f"Found data for: {query.replace('data about', '').strip()}"
    return "No relevant data found."

class TracedAgent(LitAgent):
    """
    An agent that uses LitTracer to record its internal steps.
    """
    @trace_step  # Decorate this method to automatically trace its calls
    async def _internal_reasoning_step(self, thought: str) -> str:
        # Simulate some internal processing or prompt construction
        processed_thought = f"Agent thought: {thought.upper()}"
        self.trace("internal_debug_point", {"thought_len": len(thought)})  # Manual trace
        await asyncio.sleep(0.02)
        return processed_thought

    @trace_step  # Trace this tool call as well
    async def _use_tool(self, tool_name: str, tool_input: str) -> str:
        print(f"  > Agent calling tool '{tool_name}' with input: '{tool_input}'")
        if tool_name == "lookup":
            return await mock_tool_lookup(tool_input)
        return f"Tool '{tool_name}' not recognized."

    async def training_rollout(
        self,
        task: AgentLightningTask,
        rollout_id: str,
        resources: dict[str, AgentResource],
    ) -> float:
        start_time = time.time()
        print(f"[{rollout_id}] Agent received task: {task.name} - '{task.context}'")
        self.trace("rollout_start", {"task_context_len": len(task.context)})  # Manual trace at start

        initial_thought = await self._internal_reasoning_step(f"Analyzing task: {task.context}")
        print(f"[{rollout_id}] {initial_thought}")
        llm_response = await mock_llm_chain_step(initial_thought)
        self.trace("llm_response", {"llm_output": llm_response})  # Manual trace of LLM output

        tool_result = "No tool used."
        if "data about" in task.context.lower():
            # Strip a leading "Question:" prefix (case-insensitively) to form the query
            query = re.sub(r"(?i)^question:\s*", "", task.context).strip()
            tool_result = await self._use_tool("lookup", query)
            self.trace("tool_usage_summary", {"tool_name": "lookup", "tool_output_len": len(tool_result)})

        final_answer = f"{llm_response}. Tool result: {tool_result}"
        print(f"[{rollout_id}] Agent final answer: {final_answer}")
        self.trace("rollout_end", {"final_answer_len": len(final_answer)})  # Manual trace at end

        duration = time.time() - start_time
        self.trace("total_duration", {"seconds": duration})

        # Simple reward: 1.0 if the tool returned data, 0.0 otherwise
        reward = 1.0 if "Found data for" in tool_result else 0.0
        print(f"[{rollout_id}] Reward: {reward:.2f}")
        return reward

async def main_tracer_demo():
    from agentlightning.trainer import Trainer
    trainer = Trainer(n_workers=1)
    agent = TracedAgent()

    task1 = AgentLightningTask(name="Data Query", context="Question: data about AI ethics")
    task2 = AgentLightningTask(name="Simple Question", context="Question: What is 2+2?")

    print("\n--- Running TracedAgent for Task 1 (Data Query) ---")
    rollout_result1 = await trainer.dev(agent=agent, task=task1, resources={})
    print(f"\nTask 1 Final Reward: {rollout_result1.final_reward:.2f}")
    print("\n--- Traces for Task 1 ---")
    for trace_entry in rollout_result1.traces:
        print(trace_entry)

    print("\n--- Running TracedAgent for Task 2 (Simple Question) ---")
    rollout_result2 = await trainer.dev(agent=agent, task=task2, resources={})
    print(f"\nTask 2 Final Reward: {rollout_result2.final_reward:.2f}")
    print("\n--- Traces for Task 2 ---")
    for trace_entry in rollout_result2.traces:
        print(trace_entry)

if __name__ == "__main__":
    asyncio.run(main_tracer_demo())
```

To run this example:

1. Save the code as `agent_with_tracer.py`.
2. Run: `python agent_with_tracer.py`

Expected Output (will contain detailed trace objects):

```text
--- Running TracedAgent for Task 1 (Data Query) ---
[iter_0_task_0] Agent received task: Data Query - 'Question: data about AI ethics'
[iter_0_task_0] Agent thought: ANALYZING TASK: QUESTION: DATA ABOUT AI ETHICS
  > Agent calling tool 'lookup' with input: 'data about AI ethics'
[iter_0_task_0] Agent final answer: LLM processed: Agent thought: ANALYZING TASK: QUESTION: DATA ABOUT AI ETHICS. Tool result: Found data for: AI ethics
[iter_0_task_0] Reward: 1.00

Task 1 Final Reward: 1.00

--- Traces for Task 1 ---
{'id': 'rollout_start', 'type': 'custom', 'data': {'task_context_len': 30}, 'timestamp': '2025-11-06T17:30:00.123456'}
{'id': '_internal_reasoning_step', 'type': 'method_call', 'data': {'args': ['Analyzing task: Question: data about AI ethics'], 'kwargs': {}, 'output': 'Agent thought: ANALYZING TASK: QUESTION: DATA ABOUT AI ETHICS'}, 'timestamp': '2025-11-06T17:30:00.134567'}
{'id': 'internal_debug_point', 'type': 'custom', 'data': {'thought_len': 46}, 'timestamp': '2025-11-06T17:30:00.136789'}
{'id': 'llm_response', 'type': 'custom', 'data': {'llm_output': 'LLM processed: Agent thought: ANALYZING TASK: QUESTION: DATA ABOUT AI ETHICS'}, 'timestamp': '2025-11-06T17:30:00.187654'}
{'id': '_use_tool', 'type': 'method_call', 'data': {'args': ['lookup', 'data about AI ethics'], 'kwargs': {}, 'output': 'Found data for: AI ethics'}, 'timestamp': '2025-11-06T17:30:00.298765'}
{'id': 'tool_usage_summary', 'type': 'custom', 'data': {'tool_name': 'lookup', 'tool_output_len': 25}, 'timestamp': '2025-11-06T17:30:00.300012'}
{'id': 'rollout_end', 'type': 'custom', 'data': {'final_answer_len': 116}, 'timestamp': '2025-11-06T17:30:00.301234'}
{'id': 'total_duration', 'type': 'custom', 'data': {'seconds': 0.1777}, 'timestamp': '2025-11-06T17:30:00.302345'}

--- Running TracedAgent for Task 2 (Simple Question) ---
[iter_0_task_0] Agent received task: Simple Question - 'Question: What is 2+2?'
[iter_0_task_0] Agent thought: ANALYZING TASK: QUESTION: WHAT IS 2+2?
[iter_0_task_0] Agent final answer: LLM processed: Agent thought: ANALYZING TASK: QUESTION: WHAT IS 2+2?. Tool result: No tool used.
[iter_0_task_0] Reward: 0.00

Task 2 Final Reward: 0.00

--- Traces for Task 2 ---
{'id': 'rollout_start', 'type': 'custom', 'data': {'task_context_len': 22}, 'timestamp': '2025-11-06T17:30:00.303456'}
{'id': '_internal_reasoning_step', 'type': 'method_call', 'data': {'args': ['Analyzing task: Question: What is 2+2?'], 'kwargs': {}, 'output': 'Agent thought: ANALYZING TASK: QUESTION: WHAT IS 2+2?'}, 'timestamp': '2025-11-06T17:30:00.314567'}
{'id': 'internal_debug_point', 'type': 'custom', 'data': {'thought_len': 38}, 'timestamp': '2025-11-06T17:30:00.316789'}
{'id': 'llm_response', 'type': 'custom', 'data': {'llm_output': 'LLM processed: Agent thought: ANALYZING TASK: QUESTION: WHAT IS 2+2?'}, 'timestamp': '2025-11-06T17:30:00.367654'}
{'id': 'rollout_end', 'type': 'custom', 'data': {'final_answer_len': 96}, 'timestamp': '2025-11-06T17:30:00.368765'}
{'id': 'total_duration', 'type': 'custom', 'data': {'seconds': 0.0666}, 'timestamp': '2025-11-06T17:30:00.369876'}
```

Notice how `_internal_reasoning_step` and `_use_tool` show up as `method_call` entries, while `rollout_start`, `internal_debug_point`, `llm_response`, `tool_usage_summary`, `rollout_end`, and `total_duration` are `custom` trace types. The data stored within these traces can be arbitrary serializable objects.
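
Because every entry carries a `type`, slicing a rollout’s traces for analysis is a one-liner. For example (assuming the entry shape shown above):

```python
# Separate automatic method-call traces from manual custom events.
method_calls = [t for t in rollout_result1.traces if t.get("type") == "method_call"]
custom_events = [t for t in rollout_result1.traces if t.get("type") == "custom"]
print(f"{len(method_calls)} method calls, {len(custom_events)} custom events")
```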

### Exercise 2: Tracing Agent Tool Parameters

Modify TracedAgent:

1. Add a new `@trace_step`-decorated method called `_parse_tool_arguments(self, raw_input: str) -> dict`. This method should simulate parsing a string like `"find in database: 'customers' with 'age > 30'"` and return a dictionary like `{"action": "find", "database": "customers", "filter": "age > 30"}`.
2. Before calling `_use_tool` in `training_rollout`, pass the task context through `_parse_tool_arguments`.
3. Ensure the trace output includes the parsed tool arguments for deeper analysis.
4. Adjust the reward to 1.0 if `_parse_tool_arguments` successfully extracts both a “database” and a “filter” from the task (a starter parsing sketch follows).
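
One possible parser for step 1, as a standalone sketch (the input grammar is an assumption based on the example string above; in the exercise it would become a `@trace_step`-decorated method on the agent):

```python
import re

def parse_tool_arguments(raw_input: str) -> dict:
    # Matches strings like: find in database: 'customers' with 'age > 30'
    match = re.search(r"(\w+) in database: '([^']+)' with '([^']+)'", raw_input)
    if not match:
        return {}
    action, database, filter_expr = match.groups()
    return {"action": action, "database": database, "filter": filter_expr}

# parse_tool_arguments("find in database: 'customers' with 'age > 30'")
# -> {'action': 'find', 'database': 'customers', 'filter': 'age > 30'}
```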

## The LightningStore for Persistent Data

Both AgentResource and LitTracer rely on the LightningStore for persistent storage.

- **AgentResource management:** When the Trainer proposes a new version of a prompt, tool definition, or model configuration, it updates an `AgentResource` in the LightningStore via the AgentLightningServer. Agents then query the LightningStore to fetch the latest resources.
- **Trace storage:** All the detailed `LitRollout` objects, including their embedded traces, are persisted in the LightningStore. This allows for historical analysis, debugging, and the creation of datasets for algorithms like SFT.

The LightningStore is typically backed by a robust database (e.g., MongoDB, PostgreSQL) and managed by the AgentLightningServer. You, as the developer, primarily interact with it indirectly through the Trainer and LitAgent components, but understanding its role as the central data hub is crucial.
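
To make the latest-version-wins behavior concrete, here is a toy in-memory stand-in. This is deliberately not the LightningStore API, just an illustration of the semantics described above:

```python
# Toy stand-in for the LightningStore's resource-versioning behavior.
class ToyResourceStore:
    def __init__(self) -> None:
        # name -> list of values; the list index doubles as the version number
        self._versions: dict[str, list] = {}

    def put(self, name: str, value) -> int:
        """Store a new version of a resource and return its version number."""
        self._versions.setdefault(name, []).append(value)
        return len(self._versions[name]) - 1

    def get_latest(self, name: str):
        """Fetch the newest version, as agents do before each rollout."""
        return self._versions[name][-1]

store = ToyResourceStore()
store.put("system_prompt", "v0: be helpful")
store.put("system_prompt", "v1: be helpful and concise")  # trainer pushes an update
print(store.get_latest("system_prompt"))  # -> v1: be helpful and concise
```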

## Advanced Usage and Best Practices

- **Granularity of traces:** Decide what level of detail you need. Too much detail creates massive logs; too little makes debugging hard. Find a balance.
- **Sensitive information:** Be mindful of logging sensitive information in traces. Implement appropriate masking or filtering (see the sketch after this list).
- **Tracing in production:** Tracing is crucial for development and training, but weigh its performance overhead in high-throughput production environments. Agentic Lightening typically lets you enable or disable tracing.
- **Resource versioning:** Always access resources in your agent through the `resources` dictionary passed to `training_rollout`. This ensures you’re using the version relevant to the current training iteration, allowing the trainer to safely experiment with different configurations.
- **Resource immutability:** Treat the `value` of an `AgentResource` as immutable within a single `training_rollout`. If your agent needs to modify a resource, do it outside the `training_rollout` as part of the Trainer’s optimization step, which then creates a new version of the `AgentResource`.
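
Here is one minimal way to implement the masking mentioned above, as a hypothetical helper applied to `self.trace()` payloads (the key list is an assumption; tailor it to your data):

```python
import copy

SENSITIVE_KEYS = {"api_key", "email", "password"}

def masked(data: dict) -> dict:
    """Return a copy of `data` with known-sensitive keys redacted."""
    clean = copy.deepcopy(data)
    for key in clean:
        if key in SENSITIVE_KEYS:
            clean[key] = "***redacted***"
    return clean

# Usage inside an agent method:
#     self.trace("user_profile_seen", masked({"email": "a@b.com", "plan": "pro"}))
```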

By mastering AgentResource for dynamic configuration and LitTracer for deep introspection, you gain powerful tools to build, debug, and optimize truly intelligent and adaptive AI agents with Agentic Lightening.

The next exciting step in our journey will be to put all this knowledge into practice with our first Guided Project!