5. Guided Project 1: Building a Cached LLM Chatbot
In this project, you will build a basic chatbot that answers user questions. The core idea is to integrate Redis LangCache to minimize calls to a simulated expensive LLM, thereby improving response times and reducing operational costs.
Project Objective
To develop a simple command-line chatbot that processes user queries. For each query:
- It first checks Redis LangCache for a semantically similar answer.
- If a cached answer is found (cache hit), it returns it immediately.
- If no cached answer is found (cache miss), it calls a mock LLM (simulating an actual LLM API call) to get a fresh response.
- The new prompt-response pair from the mock LLM is then stored in LangCache for future use.
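In other words, the chatbot follows a classic cache-aside flow. Before wiring up the real clients, here is a toy sketch of that flow in plain Python; the in-memory dict and the fake expensive_llm function are stand-ins invented for illustration, not part of the LangCache API (which matches on semantic similarity rather than exact strings):

import asyncio

# Toy stand-in for the cache: exact-match only, unlike LangCache's semantic matching.
_cache: dict[str, str] = {}

async def expensive_llm(prompt: str) -> str:
    await asyncio.sleep(1.0)                    # simulate a slow, costly LLM call
    return f"LLM answer to: {prompt}"

async def answer(prompt: str) -> str:
    if prompt in _cache:                        # cache hit: return immediately
        return _cache[prompt]
    response = await expensive_llm(prompt)      # cache miss: call the LLM
    _cache[prompt] = response                   # store the pair for future use
    return response

if __name__ == "__main__":
    print(asyncio.run(answer("What is semantic caching?")))

The project below implements exactly this loop, with Redis LangCache replacing the dict and a mock LLM replacing expensive_llm.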
Prerequisites
- Completed “Setting Up Your Development Environment” (Chapter 1).
- Understanding of “Core Concepts of Semantic Caching” (Chapter 2) and “Basic Operations” (Chapter 3).
Project Structure
Create a new directory for this project, e.g., learn-redis-langcache/projects/chatbot-project.
- chatbot-project/index.js (for Node.js) or chatbot-project/chatbot.py (for Python)
- chatbot-project/mock_llm.js (Node.js only; the Python version below defines the mock LLM directly in chatbot.py)
- .env file in the root learn-redis-langcache directory (as set up in Chapter 1)
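For reference, the .env file from Chapter 1 must define the three variables that both clients read below; the values shown here are placeholders for your own credentials:

LANGCACHE_API_HOST=your-langcache-host.example.com
LANGCACHE_CACHE_ID=your-cache-id
LANGCACHE_API_KEY=your-api-key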
Step-by-Step Instructions
Step 1: Initialize LangCache Client and Mock LLM
We’ll start by setting up our LangCache client and a simple mock LLM function. The mock LLM will simulate the behavior of a real LLM by providing responses for a few predefined queries and a generic fallback for others.
Node.js (projects/chatbot-project/index.js)
// projects/chatbot-project/index.js
require('dotenv').config({ path: '../../.env' }); // Adjust path to your .env file

const { LangCache } = require('@redis-ai/langcache');
const readline = require('readline');
const { mockLlmResponse } = require('./mock_llm'); // Will create this next

// Retrieve LangCache credentials
const LANGCACHE_API_HOST = process.env.LANGCACHE_API_HOST;
const LANGCACHE_CACHE_ID = process.env.LANGCACHE_CACHE_ID;
const LANGCACHE_API_KEY = process.env.LANGCACHE_API_KEY;

// Initialize LangCache client
const langCache = new LangCache({
  serverURL: `https://${LANGCACHE_API_HOST}`,
  cacheId: LANGCACHE_CACHE_ID,
  apiKey: LANGCACHE_API_KEY,
});

console.log("Chatbot initialized. Type 'exit' to quit.");

// Readline interface for user input
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout
});

async function chat() {
  rl.question('You: ', async (query) => {
    if (query.toLowerCase() === 'exit') {
      console.log('Goodbye!');
      rl.close();
      return;
    }

    let response = '';
    let source = '';

    // Try to get response from LangCache
    try {
      const cacheResults = await langCache.search({
        prompt: query,
        similarityThreshold: 0.8 // Adjust as needed
      });

      if (cacheResults && cacheResults.results.length > 0) {
        response = cacheResults.results[0].response;
        source = 'Cache';
        console.log(`Bot (from Cache, score: ${cacheResults.results[0].score.toFixed(4)}): ${response}`);
      } else {
        // Cache Miss - call Mock LLM
        console.log('Cache Miss. Calling Mock LLM...');
        response = await mockLlmResponse(query);
        source = 'LLM';
        console.log(`Bot (from LLM): ${response}`);

        // Store new prompt-response pair in LangCache
        await langCache.set({ prompt: query, response: response });
        console.log('Response stored in LangCache for future use.');
      }
    } catch (error) {
      console.error('Error interacting with LangCache:', error.message);
      // Fallback to LLM directly if cache fails
      response = await mockLlmResponse(query);
      source = 'LLM (fallback)';
      console.log(`Bot (from LLM fallback): ${response}`);
    }

    chat(); // Continue the chat
  });
}

chat();
Node.js (projects/chatbot-project/mock_llm.js)
// projects/chatbot-project/mock_llm.js
async function mockLlmResponse(prompt) {
  // Simulate network delay for LLM call
  await new Promise(resolve => setTimeout(resolve, 1500));

  const lowerPrompt = prompt.toLowerCase();
  if (lowerPrompt.includes("hello") || lowerPrompt.includes("hi")) {
    return "Hello there! How can I assist you today?";
  } else if (lowerPrompt.includes("product features")) {
    return "Our latest product features include AI-powered analytics, real-time collaboration tools, and a secure cloud infrastructure.";
  } else if (lowerPrompt.includes("pricing")) {
    return "Our pricing plans start at $29 per month. Please visit our website for more details.";
  } else if (lowerPrompt.includes("contact support")) {
    return "You can reach our support team via email at support@example.com or call us at 1-800-555-0100.";
  } else if (lowerPrompt.includes("goodbye") || lowerPrompt.includes("bye")) {
    return "Goodbye! Have a great day!";
  } else {
    return `I'm a simple mock LLM. You asked: "${prompt}". I don't have a specific answer for that, but I can learn!`;
  }
}

module.exports = { mockLlmResponse };
Python (projects/chatbot-project/chatbot.py)
# projects/chatbot-project/chatbot.py
import os
import asyncio
from dotenv import load_dotenv
from langcache import LangCache

# Load environment variables from the parent .env file
load_dotenv(dotenv_path='../../.env')

# Retrieve LangCache credentials
LANGCACHE_API_HOST = os.getenv("LANGCACHE_API_HOST")
LANGCACHE_CACHE_ID = os.getenv("LANGCACHE_CACHE_ID")
LANGCACHE_API_KEY = os.getenv("LANGCACHE_API_KEY")

# Initialize LangCache client
lang_cache = LangCache(
    server_url=f"https://{LANGCACHE_API_HOST}",
    cache_id=LANGCACHE_CACHE_ID,
    api_key=LANGCACHE_API_KEY
)

print("Chatbot initialized. Type 'exit' to quit.")


async def mock_llm_response(prompt: str) -> str:
    """Simulates an LLM API call with a delay and predefined responses."""
    # Simulate network delay for LLM call
    await asyncio.sleep(1.5)

    lower_prompt = prompt.lower()
    if "hello" in lower_prompt or "hi" in lower_prompt:
        return "Hello there! How can I assist you today?"
    elif "product features" in lower_prompt:
        return "Our latest product features include AI-powered analytics, real-time collaboration tools, and a secure cloud infrastructure."
    elif "pricing" in lower_prompt:
        return "Our pricing plans start at $29 per month. Please visit our website for more details."
    elif "contact support" in lower_prompt:
        return "You can reach our support team via email at support@example.com or call us at 1-800-555-0100."
    elif "goodbye" in lower_prompt or "bye" in lower_prompt:
        return "Goodbye! Have a great day!"
    else:
        return f"I'm a simple mock LLM. You asked: \"{prompt}\". I don't have a specific answer for that, but I can learn!"


async def chat():
    while True:
        try:
            query = await asyncio.to_thread(input, 'You: ')
        except EOFError:  # Handle Ctrl+D
            print('Goodbye!')
            break

        if query.lower() == 'exit':
            print('Goodbye!')
            break

        response = ''
        source = ''

        # Try to get response from LangCache
        try:
            cache_results = await lang_cache.search(
                prompt=query,
                similarity_threshold=0.8  # Adjust as needed
            )

            if cache_results:
                response = cache_results[0].response
                source = 'Cache'
                print(f"Bot (from Cache, score: {cache_results[0].score:.4f}): {response}")
            else:
                # Cache Miss - call Mock LLM
                print('Cache Miss. Calling Mock LLM...')
                response = await mock_llm_response(query)
                source = 'LLM'
                print(f"Bot (from LLM): {response}")

                # Store new prompt-response pair in LangCache
                await lang_cache.set(prompt=query, response=response)
                print('Response stored in LangCache for future use.')
        except Exception as e:
            print(f"Error interacting with LangCache: {e}")
            # Fallback to LLM directly if cache fails
            response = await mock_llm_response(query)
            source = 'LLM (fallback)'
            print(f"Bot (from LLM fallback): {response}")


if __name__ == "__main__":
    asyncio.run(chat())
Step 2: Run and Test the Chatbot
Node.js:
- Navigate to learn-redis-langcache/projects/chatbot-project.
- Run node index.js.
Python:
- Navigate to learn-redis-langcache/projects/chatbot-project.
- Run python chatbot.py.
Testing Scenario:
- First Interaction (Cache Miss):
  - You: Hello there!
  - Bot: Cache Miss. Calling Mock LLM...
  - Bot: Hello there! How can I assist you today? (followed by "Response stored in LangCache for future use.")
- Second Interaction (Cache Hit):
  - You: Hi!
  - Bot: Bot (from Cache, score: X.XXXX): Hello there! How can I assist you today? (notice the "from Cache" label, the similarity score, and the much faster response)
- Another First Interaction (Cache Miss):
  - You: What are your product's capabilities?
  - Bot: Cache Miss. Calling Mock LLM...
  - Bot: Our latest product features include AI-powered analytics...
- Another Second Interaction (Cache Hit):
  - You: Tell me about the product features.
  - Bot: Bot (from Cache, score: X.XXXX): Our latest product features include AI-powered analytics...
- Completely New Question (Cache Miss):
  - You: What is the meaning of life?
  - Bot: Cache Miss. Calling Mock LLM...
  - Bot: I'm a simple mock LLM. You asked: "What is the meaning of life?". I don't have a specific answer for that, but I can learn!
Observe the "Cache Miss" messages when a new or semantically distinct query is made, and the "from Cache" responses (with a much faster response time) when a similar query is repeated.
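If you want to put a number on that difference, a small script can time one turn of the same logic. This is a minimal sketch, assuming it is saved next to chatbot.py; the file name timing_check.py and the reuse of lang_cache and mock_llm_response via import are conveniences for illustration, not part of the project structure above:

# projects/chatbot-project/timing_check.py (hypothetical helper script)
# Rough latency comparison between a cache hit and a mock LLM call.
# Importing chatbot will print its startup line and requires the same .env setup.
import asyncio
import time

from chatbot import lang_cache, mock_llm_response


async def timed_turn(query: str) -> None:
    start = time.perf_counter()
    results = await lang_cache.search(prompt=query, similarity_threshold=0.8)
    if results:
        print(f"Cache hit answered in {time.perf_counter() - start:.3f}s")
    else:
        await mock_llm_response(query)
        print(f"Cache miss + mock LLM took {time.perf_counter() - start:.3f}s")


if __name__ == "__main__":
    asyncio.run(timed_turn("Hi!"))

Run it once to see the slow LLM path, then again to see the cached path.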
Step 3: Experiment and Refine
Challenge Yourself:
- Adjust similarity_threshold: Change similarity_threshold in the search call (e.g., to 0.95 for stricter matches or 0.7 for looser matches). Rerun your chatbot and observe how the cache hit/miss behavior changes.
- Add more mock LLM responses: Expand your mockLlmResponse (Node.js) or mock_llm_response (Python) function with more predefined questions and answers. Test whether LangCache correctly identifies semantic similarities.
- Implement per-entry TTL: Modify the langCache.set call to include a TTL for certain types of responses. For example, if a question is about "today's news", set a short TTL (e.g., 3600 seconds = 1 hour). Test by asking a time-sensitive question, waiting for the TTL to expire, and then asking it again. See the hint after this list.
- Add user-specific context using attributes: Modify the chat function to accept a user_id (e.g., a simple hardcoded string, or ask the user for it at the start). Store and retrieve responses using this user_id as an attribute, so that one user's cached responses don't interfere with another's when the context is user-specific. See the hint after this list.

Hint for user-specific context:

# Python
user_id = "user_A"  # Or get from input
# Store:
await lang_cache.set(prompt=query, response=response, attributes={"user_id": user_id})
# Search:
cache_results = await lang_cache.search(prompt=query, attributes={"user_id": user_id})

// Node.js
const userId = "user_A"; // Or get from input
// Store:
await langCache.set({ prompt: query, response: response, attributes: { userId: userId } });
// Search:
const cacheResults = await langCache.search({ prompt: query, attributes: { userId: userId } });
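Hint for per-entry TTL (Python; the keyword argument shown is an assumption based on the REST API's TTL-in-milliseconds field, so verify the exact name your SDK version exposes):

# Python
# Store a time-sensitive answer that should expire after 1 hour.
# NOTE: `ttl_millis` mirrors the REST API's millisecond TTL field; check your SDK for the exact parameter name.
await lang_cache.set(prompt=query, response=response, ttl_millis=3600000)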
This project provides a practical foundation for integrating Redis LangCache into real-world AI applications. By actively experimenting with parameters and features, you’ll gain deeper insights into its capabilities.