Lab 2: Personalize our agent by adding memory

Author: Aarushi Nema
Date: 2025-09-17
Tags: Customer Support Agent, Amazon Bedrock AgentCore, AWS, Machine Learning, Agentic AI, Memory, Personalization, Short-Term Memory, Long-Term Memory, Production, Cloud Computing, Strands Agents, Anthropic Claude, Bedrock
Categories: Machine Learning, Agentic AI, AWS, Workshop
Abstract: This lab focuses on enhancing the customer support agent with memory capabilities using Amazon Bedrock AgentCore Memory. Learn to implement both short-term and long-term memory for personalized customer experiences.

Overview

In Lab 1, you built a Customer Support Agent that worked well for a single user in a local session. However, real-world customer support needs to scale beyond a single user running in a local environment.

When we run an agent in production, we'll need:

  • Multi-User Support: Handle thousands of customers simultaneously
  • Persistent Storage: Save conversations beyond the session lifecycle
  • Long-Term Learning: Extract customer preferences and behavioral patterns
  • Cross-Session Continuity: Remember customers across different interactions

Workshop Progress:

  • Lab 1 (Done): Create Agent Prototype - Build a functional customer support agent
  • Lab 2 (Current): Enhance with Memory - Add conversation context and personalization
  • Lab 3: Scale with Gateway & Identity - Share tools across agents securely
  • Lab 4: Deploy to Production - Use AgentCore Runtime with observability
  • Lab 5: Build User Interface - Create a customer-facing application

In this lab, you’ll add the missing persistence and learning layer that transforms your goldfish agent (one that forgets the conversation in seconds) into a smart, personalized assistant.

Memory is a critical component of intelligence. While Large Language Models (LLMs) have impressive capabilities, they lack persistent memory across conversations. Amazon Bedrock AgentCore Memory addresses this limitation by providing a managed service that enables AI agents to maintain context over time, remember important facts, and deliver consistent, personalized experiences.

AgentCore Memory operates on two levels:

  • Short-Term Memory: Immediate conversation context and session-based information that provides continuity within a single interaction or closely related sessions.
  • Long-Term Memory: Persistent information extracted and stored across multiple conversations, including facts, preferences, and summaries that enable personalized experiences over time.
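To make the distinction concrete, here is a minimal sketch of how each level can be accessed with the MemoryClient used throughout this lab. The memory ID is a placeholder, and list_events for reading raw session events is an assumption about the SDK; retrieve_memories appears later in this lab.

# Minimal sketch: accessing both memory levels (IDs below are placeholders)
from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-west-2")

# Short-Term Memory: raw conversation turns recorded within a session
# (list_events is assumed to be available in the SDK for reading raw events)
recent_turns = client.list_events(
    memory_id="<memory-id>",
    actor_id="customer_001",
    session_id="current_session",
    max_results=5,   # just the last few turns for immediate context
)

# Long-Term Memory: extracted preferences/facts, retrieved by semantic query
preferences = client.retrieve_memories(
    memory_id="<memory-id>",
    namespace="support/customer/customer_001/preferences",
    query="laptop preferences",
)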

Learning Objectives

  • Understand the importance of memory in AI agents
  • Implement AgentCore Memory resources with multiple strategies
  • Configure short-term and long-term memory for customer support
  • Learn about USER_PREFERENCE and SEMANTIC memory strategies
  • Transform a stateless agent into a personalized assistant

Architecture for Lab 2

<img src="images/architecture_lab2_memory.png" width="75%"/>

Multi-user agent with persistent short-term and long-term memory capabilities.

Prerequisites

  • AWS Account with appropriate permissions
  • Python 3.10+ installed locally
  • AWS CLI configured with credentials
  • Anthropic Claude 3.7 Sonnet enabled on Amazon Bedrock
  • Strands Agents and other libraries installed in the next cells
  • These resources are created for you within an AWS workshop account
    • AWS Lambda function
    • AWS Lambda Execution IAM Role
    • AgentCore Gateway IAM Role
    • DynamoDB tables used by the AWS Lambda function.
    • Cognito User Pool and User Pool Client

Not using an AWS workshop account?

Note: If you are running this as a self-paced lab, you must create the CloudFormation resources as shown in the workshop's self-paced steps. If you have not, uncomment and run the code segment below.

#!bash scripts/prereq.sh

Step 1: Import Libraries

Let’s import the libraries for AgentCore Memory. For this, we will use the Amazon Bedrock AgentCore Python SDK, a lightweight wrapper that helps you work with AgentCore capabilities.

import logging

# Import AgentCore Memory client and strategy types
from bedrock_agentcore.memory import MemoryClient
from bedrock_agentcore.memory.constants import StrategyType

from strands.hooks import AfterInvocationEvent, HookProvider, HookRegistry, MessageAddedEvent

import boto3
from boto3.session import Session

boto_session = Session()
REGION = boto_session.region_name

logger = logging.getLogger(__name__)

from lab_helpers.utils import get_ssm_parameter, put_ssm_parameter

Step 2: Create Bedrock AgentCore Memory resources

Amazon Bedrock AgentCore Memory is a fully managed service that provides persistent memory capabilities for AI agents.

AgentCore Memory Concepts:

  1. Short-Term Memory (STM): Immediately stores conversation context within the session
  2. Long-Term Memory (LTM): Asynchronously processes STM to extract meaningful patterns, preferences and facts
  3. Memory Strategies: Different approaches for extracting and organizing information:
    • USER_PREFERENCE: Learns customer preferences, behaviors, and patterns
    • SEMANTIC: Stores factual information using vector embeddings for similarity search
  4. Namespaces: Logical grouping of memories by customer and context type. We’ll create these two namespaces:
    • support/customer/{actorId}/preferences: Customer preferences and behavioral patterns
    • support/customer/{actorId}/semantic: Factual information and conversation history

This structure enables multi-tenant memory where each customer’s information is isolated and easily retrievable.
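As a quick illustration, the {actorId} placeholder in each template resolves to a distinct, isolated namespace per customer; the memory hooks in Step 4 do exactly this via namespace.format(actorId=...). The customer IDs below are made up:

# Namespace templates resolve per customer, isolating each customer's memories
template = "support/customer/{actorId}/preferences"

print(template.format(actorId="customer_001"))  # support/customer/customer_001/preferences
print(template.format(actorId="customer_002"))  # support/customer/customer_002/preferences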

Memory Creation Process:

Creating memory resources involves provisioning the underlying infrastructure (vector databases, processing pipelines, etc.). This typically takes 2-3 minutes as AWS sets up the managed services behind the scenes.

memory_client = MemoryClient(region_name=REGION)
memory_name = "CustomerSupportMemory"

def create_or_get_memory_resource():
    try:
        # Reuse an existing memory resource if its ID was stored previously
        memory_id = get_ssm_parameter("/app/customersupport/agentcore/memory_id")
        memory_client.gmcp_client.get_memory(memoryId=memory_id)
        return memory_id
    except Exception:
        try:
            strategies = [
                {
                    StrategyType.USER_PREFERENCE.value: {
                        "name": "CustomerPreferences",
                        "description": "Captures customer preferences and behavior",
                        "namespaces": ["support/customer/{actorId}/preferences"],
                    }
                },
                {
                    StrategyType.SEMANTIC.value: {
                        "name": "CustomerSupportSemantic",
                        "description": "Stores facts from conversations",
                        "namespaces": ["support/customer/{actorId}/semantic"],
                    }
                },
            ]
            print("Creating AgentCore Memory resources. This will take 2-3 minutes...")
            print("While we wait, let's understand what's happening behind the scenes:")
            print("• Setting up managed vector databases for semantic search")
            print("• Configuring memory extraction pipelines")
            print("• Provisioning secure, multi-tenant storage")
            print("• Establishing namespace isolation for customer data")
            # *** AGENTCORE MEMORY USAGE *** - Create memory resource with semantic strategy
            response = memory_client.create_memory_and_wait(
                name=memory_name,
                description="Customer support agent memory",
                strategies=strategies,
                event_expiry_days=90,          # Memories expire after 90 days
            )
            memory_id = response["id"]
            # Persist the memory ID so future runs reuse this resource
            put_ssm_parameter("/app/customersupport/agentcore/memory_id", memory_id)
            return memory_id
        except Exception as e:
            print(f"Failed to create memory resource: {e}")
            return None
memory_id = create_or_get_memory_resource()
if memory_id:
    print("✅ AgentCore Memory created successfully!")
    print(f"Memory ID: {memory_id}")
else:
    print("Memory resource not created. Try Again !")
Creating AgentCore Memory resources. This will take 2-3 minutes...
While we wait, let's understand what's happening behind the scenes:
• Setting up managed vector databases for semantic search
• Configuring memory extraction pipelines
• Provisioning secure, multi-tenant storage
• Establishing namespace isolation for customer data
✅ AgentCore Memory created successfully!
Memory ID: CustomerSupportMemory-cGl9C845Vd

Step 3: Seed previous customer interactions

Why are we seeding memory?

In production, agents accumulate memory naturally through customer interactions. However, for this lab, we’re seeding historical conversations to demonstrate how Long-Term Memory (LTM) works without waiting for real conversations.

How memory processing works:

  1. create_event stores interactions in Short-Term Memory (STM) instantly
  2. STM is asynchronously processed by Long-Term Memory strategies
  3. LTM extracts patterns, preferences, and facts for future retrieval

Let’s seed some customer history to see this in action:

# List existing memory resources
for memory in memory_client.list_memories():
    print(f"Memory Arn: {memory.get('arn')}")
    print(f"Memory ID: {memory.get('id')}")
    print("--------------------------------------------------------------------")

# Seed with previous customer interactions
CUSTOMER_ID = "customer_001"

previous_interactions = [
    ("I'm having issues with my MacBook Pro overheating during video editing.","USER"),
    ("I can help with that thermal issue. For video editing workloads, let's check your Activity Monitor and adjust performance settings. Your MacBook Pro order #MB-78432 is still under warranty.", "ASSISTANT"),
    ("What's the return policy on gaming headphones? I need low latency for competitive FPS games", "USER"),
    ("For gaming headphones, you have 30 days to return. Since you're into competitive FPS, I'd recommend checking the audio latency specs - most gaming models have <40ms latency.", "ASSISTANT"),
    ("I need a laptop under $1200 for programming. Prefer 16GB RAM minimum and good Linux compatibility. I like ThinkPad models.", "USER"),
    ("Perfect! For development work, I'd suggest looking at our ThinkPad E series or Dell XPS models. Both have excellent Linux support and 16GB RAM options within your budget.", "ASSISTANT"),
]

# Save previous interactions
if memory_id:
    try:
        memory_client.create_event(
            memory_id=memory_id,
            actor_id=CUSTOMER_ID,
            session_id="previous_session",
            messages=previous_interactions
        )
        print("✅ Seeded customer history successfully")
        print("📝 Interactions saved to Short-Term Memory")
        print("⏳ Long-Term Memory processing will begin automatically...")
    except Exception as e:
        print(f"⚠️ Error seeding history: {e}")
Memory Arn: arn:aws:bedrock-agentcore:us-west-2:900569417635:memory/CustomerSupportMemory-cGl9C845Vd
Memory ID: CustomerSupportMemory-cGl9C845Vd
--------------------------------------------------------------------
✅ Seeded customer history successfully
📝 Interactions saved to Short-Term Memory
⏳ Long-Term Memory processing will begin automatically...

Understanding Memory Processing

After creating events with create_event, AgentCore Memory processes the data in two stages:

  1. Immediate: Messages stored in Short-Term Memory (STM)
  2. Asynchronous: STM processed into Long-Term Memory (LTM) strategies

LTM processing typically takes 20-30 seconds as the system:

  • Analyzes conversation patterns
  • Extracts customer preferences and behaviors
  • Creates semantic embeddings for factual information
  • Organizes memories by namespace for efficient retrieval

Let’s check if our Long-Term Memory processing is complete by retrieving customer preferences:

import time

# Wait for Long-Term Memory processing to complete
print("🔍 Checking for processed Long-Term Memories...")
retries = 0
max_retries = 6  # 1 minute wait

while retries < max_retries:
    memories = memory_client.retrieve_memories(
        memory_id=memory_id,
        namespace=f"support/customer/{CUSTOMER_ID}/preferences",
        query="can you summarize the support issue"
    )
    
    if memories:
        print(f"✅ Found {len(memories)} preference memories after {retries * 10} seconds!")
        break
    
    retries += 1
    if retries < max_retries:
        print(f"⏳ Still processing... waiting 10 more seconds (attempt {retries}/{max_retries})")
        time.sleep(10)
    else:
        print("⚠️ Memory processing is taking longer than expected. This can happen with overloading..")
        break

print("🎯 AgentCore Memory automatically extracted these customer preferences from our seeded conversations:")
print("=" * 80)

for i, memory in enumerate(memories, 1):
    if isinstance(memory, dict):
        content = memory.get('content', {})
        if isinstance(content, dict):
            text = content.get('text', '')
            print(f"  {i}. {text}")
🔍 Checking for processed Long-Term Memories...
⏳ Still processing... waiting 10 more seconds (attempt 1/6)
⏳ Still processing... waiting 10 more seconds (attempt 2/6)
⏳ Still processing... waiting 10 more seconds (attempt 3/6)
⏳ Still processing... waiting 10 more seconds (attempt 4/6)
✅ Found 3 preference memories after 40 seconds!
🎯 AgentCore Memory automatically extracted these customer preferences from our seeded conversations:
================================================================================
  1. {"context":"User reported technical issue with MacBook Pro during video editing","preference":"Uses MacBook Pro for video editing, experiencing performance/thermal challenges","categories":["technology","computing","video editing","hardware"]}
  2. {"context":"User inquired about gaming headphones with specific performance requirement","preference":"Needs low latency gaming headphones for competitive FPS games","categories":["gaming","audio equipment","technology"]}
  3. {"context":"User explicitly mentioned requirements for laptop purchase for programming","preference":"Wants laptop under $1200, with 16GB RAM minimum, good Linux compatibility, preferring ThinkPad models","categories":["technology","computing","laptops","programming"]}

Exploring Semantic Memory

Semantic memory stores factual information from conversations using vector embeddings. This enables similarity-based retrieval of relevant facts and context.

import time

# Retrieve semantic memories (factual information), retrying until LTM processing completes
while True:
    semantic_memories = memory_client.retrieve_memories(
        memory_id=memory_id,
        namespace=f"support/customer/{CUSTOMER_ID}/semantic",
        query="information on the technical support issue"
    )
    if semantic_memories:
        break
    time.sleep(10)

print("🧠 AgentCore Memory identified these factual details from conversations:")
print("=" * 80)
for i, memory in enumerate(semantic_memories, 1):
    if isinstance(memory, dict):
        content = memory.get('content', {})
        if isinstance(content, dict):
            text = content.get('text', '')
            print(f"  {i}. {text}")
🧠 AgentCore Memory identified these factual details from conversations:
================================================================================
  1. The user is interested in gaming headphones with low latency for competitive FPS games.
  2. The user is looking for a laptop under $1200 for programming, with a preference for 16GB RAM and good Linux compatibility.
  3. The user is experiencing overheating issues with their MacBook Pro during video editing.

Step 4: Implement Strands Hooks to save and retrieve agent interactions

Now we’ll integrate AgentCore Memory with our agent using Strands’ hook system. This creates an automatic memory layer that works seamlessly with any agent conversation.

  • MessageAddedEvent: Triggered when messages are added to the conversation, allowing us to retrieve and inject customer context
  • AfterInvocationEvent: Fired after agent responses, enabling automatic storage of interactions to memory

The hook system ensures memory operations happen automatically without manual intervention, creating a seamless experience where customer context is preserved across conversations.
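For example, after context injection a user message like "Which headphones would you recommend?" reaches the model in roughly this shape (illustrative; the bracketed labels are the uppercased strategy types):

Customer Context:
[USER_PREFERENCE] Needs low latency gaming headphones for competitive FPS games
[SEMANTIC] The user is interested in gaming headphones with low latency for competitive FPS games.

Which headphones would you recommend?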

To create the hooks we will extend the HookProvider class:

class CustomerSupportMemoryHooks(HookProvider):
    """Memory hooks for customer support agent"""

    def __init__(
        self, memory_id: str, client: MemoryClient, actor_id: str, session_id: str
    ):
        self.memory_id = memory_id
        self.client = client
        self.actor_id = actor_id
        self.session_id = session_id
        self.namespaces = {
            i["type"]: i["namespaces"][0]
            for i in self.client.get_memory_strategies(self.memory_id)
        }

    def retrieve_customer_context(self, event: MessageAddedEvent):
        """Retrieve customer context before processing support query"""
        messages = event.agent.messages
        if (
            messages[-1]["role"] == "user"
            and "toolResult" not in messages[-1]["content"][0]
        ):
            user_query = messages[-1]["content"][0]["text"]

            try:
                all_context = []

                for context_type, namespace in self.namespaces.items():
                    # *** AGENTCORE MEMORY USAGE *** - Retrieve customer context from each namespace
                    memories = self.client.retrieve_memories(
                        memory_id=self.memory_id,
                        namespace=namespace.format(actorId=self.actor_id),
                        query=user_query,
                        top_k=3,
                    )
                    # Post-processing: Format memories into context strings
                    for memory in memories:
                        if isinstance(memory, dict):
                            content = memory.get("content", {})
                            if isinstance(content, dict):
                                text = content.get("text", "").strip()
                                if text:
                                    all_context.append(
                                        f"[{context_type.upper()}] {text}"
                                    )

                # Inject customer context into the query
                if all_context:
                    context_text = "\n".join(all_context)
                    original_text = messages[-1]["content"][0]["text"]
                    messages[-1]["content"][0][
                        "text"
                    ] = f"Customer Context:\n{context_text}\n\n{original_text}"
                    logger.info(f"Retrieved {len(all_context)} customer context items")

            except Exception as e:
                logger.error(f"Failed to retrieve customer context: {e}")

    def save_support_interaction(self, event: AfterInvocationEvent):
        """Save customer support interaction after agent response"""
        try:
            messages = event.agent.messages
            if len(messages) >= 2 and messages[-1]["role"] == "assistant":
                # Get last customer query and agent response
                customer_query = None
                agent_response = None

                for msg in reversed(messages):
                    if msg["role"] == "assistant" and not agent_response:
                        agent_response = msg["content"][0]["text"]
                    elif (
                        msg["role"] == "user"
                        and not customer_query
                        and "toolResult" not in msg["content"][0]
                    ):
                        customer_query = msg["content"][0]["text"]
                        break

                if customer_query and agent_response:
                    # *** AGENTCORE MEMORY USAGE *** - Save the support interaction
                    self.client.create_event(
                        memory_id=self.memory_id,
                        actor_id=self.actor_id,
                        session_id=self.session_id,
                        messages=[
                            (customer_query, "USER"),
                            (agent_response, "ASSISTANT"),
                        ],
                    )
                    logger.info("Saved support interaction to memory")

        except Exception as e:
            logger.error(f"Failed to save support interaction: {e}")

    def register_hooks(self, registry: HookRegistry) -> None:
        """Register customer support memory hooks"""
        registry.add_callback(MessageAddedEvent, self.retrieve_customer_context)
        registry.add_callback(AfterInvocationEvent, self.save_support_interaction)
        logger.info("Customer support memory hooks registered")

Step 5: Create a Customer Support Agent with memory

Next, we will implement the Customer Support Agent just as we did in Lab 1, but this time we instantiate the CustomerSupportMemoryHooks class and pass the memory hooks to the agent constructor.

import uuid

from strands import Agent
from strands.models import BedrockModel

from lab_helpers.lab1_strands_agent import (
    SYSTEM_PROMPT,
    get_return_policy, web_search,
    get_product_info, get_technical_support, MODEL_ID
)

SESSION_ID = str(uuid.uuid4())
memory_hooks = CustomerSupportMemoryHooks(memory_id, memory_client, CUSTOMER_ID, SESSION_ID)


# Initialize the Bedrock model (Anthropic Claude 3.7 Sonnet)
model = BedrockModel(
    model_id=MODEL_ID,
    region_name=REGION
)

# Create the customer support agent with memory hooks and four tools
agent = Agent(
    model=model,
    hooks=[memory_hooks],       # Pass memory hooks for automatic retrieval and storage
    tools=[
        get_product_info,       # Tool 1: Product information lookup
        get_return_policy,      # Tool 2: Return policy lookup
        web_search,             # Tool 3: Web search for recommendations
        get_technical_support,  # Tool 4: Technical support guidance
    ],
    system_prompt=SYSTEM_PROMPT
)

Step 6: Test the Personalized Agent

Let’s test our memory-enhanced agent! Watch how it uses the customer’s historical preferences to provide personalized recommendations.

The agent will automatically:

  1. Retrieve relevant customer context from memory
  2. Use that context to personalize the response
  3. Save this new interaction for future use

from IPython.display import display, Markdown

print("🎧 Testing headphone recommendation with customer memory...\n\n")
response1 = agent("Which headphones would you recommend?")
🎧 Testing headphone recommendation with customer memory...


I'd be happy to help you find some gaming headphones with low latency, perfect for competitive FPS games where timing is critical. Let me get some specific information about gaming headphones for you.
Tool #1: get_product_info
Based on your need for low-latency gaming headphones for competitive FPS games, let me search for some specific recommendations that would work well for your requirements:
Tool #2: web_search
Based on the information I've gathered, here are my recommendations for gaming headphones that would be ideal for competitive FPS games where low latency is crucial:

### Top Recommendations for Low Latency Gaming Headphones:

1. **SteelSeries Arctis Nova Pro** - Considered a top choice for competitive gaming with excellent multi-device connectivity. The wireless version offers very low latency that's ideal for FPS games.

2. **Turtle Beach Stealth 700 Gen 2** - Specifically mentioned as a next-generation competitive gaming headset with improved audio performance for better game performance.

3. **Headphones with aptX Low Latency Support** - Look for headphones supporting this codec, as it can reduce latency to under 40ms, which experts consider ideal for competitive gaming.

### What to Look For:
- **Latency under 40ms** - For competitive FPS games, this is considered the benchmark for lag-free audio
- **Good positional audio** - Critical for accurately locating enemies in FPS games
- **Comfort** - Important for long gaming sessions
- **Quality microphone** - For clear communication with teammates

### Connection Type Consideration:
- **Wired options** generally offer the lowest latency and are most reliable for competitive play
- **Wireless options** with specialized gaming-focused transmission technology can also perform well

Would you like more specific information about any of these models? Or would you prefer I search for options in a particular price range? I can also provide information about return policies if you'd like to try a pair before fully committing to them.
print("\n💻 Testing laptop preference recall...\n\n")
response2 = agent("What is my preferred laptop brand and requirements?")

💻 Testing laptop preference recall...


Based on your previous interactions, I can see your preferred laptop specifications quite clearly.

Your preferred laptop requirements are:
- Brand preference: ThinkPad models
- Budget: Under $1200
- RAM: Minimum of 16GB
- Operating system compatibility: Good Linux compatibility
- Purpose: Programming

ThinkPad is definitely your preferred laptop brand, and you're looking for a model that meets these specific requirements for programming work. ThinkPads are known for their excellent Linux compatibility, which aligns perfectly with your preferences.

Is there anything specific about ThinkPad models you'd like to know more about, or would you like me to provide some recommendations for ThinkPad models that meet your requirements for programming? I'd be happy to search for current models that fit your budget and specifications.

Notice how the agent remembers:

  • Your gaming preferences (low-latency headphones)
  • Your laptop preferences (ThinkPad, 16GB RAM, Linux compatibility)
  • Your budget constraints ($1200 for laptops)
  • Previous technical issues (MacBook overheating)

This is the power of AgentCore Memory - persistent, personalized customer experiences!
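Because long-term memories are keyed by the customer (actor) rather than the session, a brand-new session for the same customer starts with full context. Here is a minimal sketch reusing the objects defined above (the new session ID and query are arbitrary):

# Start a fresh session for the same customer: long-term memory carries over
new_session_id = str(uuid.uuid4())
new_hooks = CustomerSupportMemoryHooks(
    memory_id, memory_client, CUSTOMER_ID, new_session_id
)

fresh_agent = Agent(
    model=model,
    hooks=[new_hooks],
    tools=[get_product_info, get_return_policy, web_search, get_technical_support],
    system_prompt=SYSTEM_PROMPT,
)

# Even with no prior turns in this session, preferences are recalled from memory
fresh_agent("Remind me what my laptop budget was?")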

Congratulations! 🎉

You have successfully completed Lab 2: Add memory to the Customer Support Agent!

What You Accomplished:

  • Created a serverless, managed memory resource with Amazon Bedrock AgentCore Memory
  • Implemented long-term memory to store user preferences and semantic (factual) information
  • Integrated AgentCore Memory with the customer support agent using the hook mechanism provided by Strands Agents

Next Up: Lab 3 - Scaling with Gateway and Identity →

Resources