AI Memory System: Cognitive Memory with AWS Bedrock
Advanced cognitive memory system using AWS Bedrock embeddings, FAISS vector search, and reinforcement learning for adaptive AI behavior with long-term memory persistence.
Client
Global Financial Services Firm
Industry
Banking & Financial Services
Completed
November 2024
Technologies
5 Tools
Key Results
10M+ Memory Vectors
<100ms Retrieval Latency
Technologies & Tools
TL;DR
Developed a cognitive memory system that mimics human memory with importance weighting, temporal decay, and reinforcement learning, enabling AI agents to maintain context across sessions and learn from interactions using AWS Bedrock and FAISS.
Context
Current AI systems suffer from limited context windows and lack of persistent memory, making them unable to maintain long-term relationships or learn from past interactions. This creates a poor user experience and limits AI effectiveness in real-world applications.
The AI Memory System addresses:
- Long-term Context: Maintaining conversation history beyond token limits
- Adaptive Learning: Adjusting behavior based on past interactions
- Memory Prioritization: Storing important information while forgetting irrelevant details
- Cross-session Continuity: Remembering users and contexts across sessions
My Role
As the primary architect and developer, I:
- Designed the cognitive memory architecture with importance weighting
- Implemented the FAISS vector store with AWS Bedrock embeddings
- Built the reinforcement learning feedback loop
- Created the memory decay and consolidation algorithms
Core Architecture
Memory Management System
# /Users/mdf/Code/farooqimdd/code/ai-memory/memory_manager.py (lines 45-112)
class CognitiveMemorySystem:
def __init__(self, config: MemoryConfig):
"""Initialize cognitive memory with AWS Bedrock and FAISS"""
self.config = config
# Initialize AWS Bedrock for embeddings
self.bedrock_client = boto3.client(
service_name='bedrock-runtime',
region_name=config.aws_region
)
# Initialize FAISS index
self.embedding_dim = 1536 # Titan embedding dimension
self.index = faiss.IndexFlatIP(self.embedding_dim)
self.memory_store = []
self.importance_scores = {}
self.access_patterns = defaultdict(list)
async def store_memory(
self,
content: str,
metadata: Dict,
importance: float = 0.5
) -> MemoryEntry:
"""Store memory with importance weighting and embedding"""
# Generate embedding using AWS Bedrock Titan
embedding = await self._generate_embedding(content)
# Calculate initial importance score
importance_score = self._calculate_importance(
content=content,
initial_importance=importance,
metadata=metadata
)
# Create memory entry
memory_entry = MemoryEntry(
id=str(uuid.uuid4()),
content=content,
embedding=embedding,
metadata=metadata,
importance=importance_score,
created_at=datetime.utcnow(),
last_accessed=datetime.utcnow(),
access_count=0
)
# Add to FAISS index
self.index.add(np.array([embedding], dtype='float32'))
self.memory_store.append(memory_entry)
self.importance_scores[memory_entry.id] = importance_score
# Trigger memory consolidation if needed
if len(self.memory_store) > self.config.consolidation_threshold:
await self._consolidate_memories()
return memory_entry
async def _generate_embedding(self, text: str) -> np.ndarray:
"""Generate embeddings using AWS Bedrock Titan"""
response = self.bedrock_client.invoke_model(
modelId='amazon.titan-embed-text-v1',
body=json.dumps({
"inputText": text
})
)
result = json.loads(response['body'].read())
return np.array(result['embedding'], dtype='float32')
Reinforcement Learning for Memory
# /Users/mdf/Code/farooqimdd/code/ai-memory/reinforcement_memory.py (lines 78-145)
class ReinforcementMemory:
def __init__(self, memory_system: CognitiveMemorySystem):
self.memory_system = memory_system
self.feedback_buffer = []
self.learning_rate = 0.1
self.decay_factor = 0.95
async def retrieve_with_learning(
self,
query: str,
k: int = 10,
context: Optional[Dict] = None
) -> List[MemoryEntry]:
"""Retrieve memories with reinforcement learning feedback"""
# Generate query embedding
query_embedding = await self.memory_system._generate_embedding(query)
# Search with importance weighting
distances, indices = self.memory_system.index.search(
np.array([query_embedding], dtype='float32'),
k * 2 # Retrieve more for re-ranking
)
# Apply importance-based re-ranking
candidates = []
for idx, distance in zip(indices[0], distances[0]):
if idx < len(self.memory_system.memory_store):
memory = self.memory_system.memory_store[idx]
# Calculate relevance score
relevance_score = self._calculate_relevance(
memory=memory,
query_distance=distance,
context=context
)
candidates.append((memory, relevance_score))
# Sort by relevance and select top-k
candidates.sort(key=lambda x: x[1], reverse=True)
selected_memories = [m for m, _ in candidates[:k]]
# Update access patterns for learning
for memory in selected_memories:
self._update_access_pattern(memory, query)
return selected_memories
def _calculate_relevance(
self,
memory: MemoryEntry,
query_distance: float,
context: Optional[Dict]
) -> float:
"""Calculate memory relevance with multiple factors"""
# Base similarity score
similarity = query_distance
# Importance weighting
importance_factor = memory.importance
# Temporal decay
age_days = (datetime.utcnow() - memory.created_at).days
temporal_factor = self.decay_factor ** (age_days / 30)
# Access frequency boost
frequency_factor = min(1.0, memory.access_count / 10)
# Context similarity if provided
context_factor = 1.0
if context and memory.metadata.get('context'):
context_factor = self._context_similarity(
context,
memory.metadata['context']
)
# Combined relevance score
relevance = (
similarity * 0.4 +
importance_factor * 0.3 +
temporal_factor * 0.1 +
frequency_factor * 0.1 +
context_factor * 0.1
)
return relevance
Memory Consolidation and Decay
# /Users/mdf/Code/farooqimdd/code/ai-memory/memory_consolidation.py (lines 56-123)
class MemoryConsolidator:
def __init__(self, memory_system: CognitiveMemorySystem):
self.memory_system = memory_system
self.consolidation_threshold = 0.8
self.decay_rate = 0.01
async def consolidate_memories(self) -> ConsolidationResult:
"""Consolidate similar memories and apply decay"""
# Group similar memories
memory_clusters = await self._cluster_memories()
consolidated = []
removed = []
for cluster in memory_clusters:
if len(cluster) > 1:
# Check if consolidation is beneficial
if self._should_consolidate(cluster):
# Create consolidated memory
consolidated_memory = await self._merge_memories(cluster)
consolidated.append(consolidated_memory)
# Remove individual memories
for memory in cluster:
removed.append(memory.id)
self._remove_memory(memory)
# Add consolidated memory
await self.memory_system.store_memory(
content=consolidated_memory.content,
metadata=consolidated_memory.metadata,
importance=consolidated_memory.importance
)
# Apply decay to remaining memories
decayed = await self._apply_decay()
return ConsolidationResult(
consolidated_count=len(consolidated),
removed_count=len(removed),
decayed_count=len(decayed),
total_memories=len(self.memory_system.memory_store)
)
async def _cluster_memories(self) -> List[List[MemoryEntry]]:
"""Cluster similar memories using DBSCAN"""
if len(self.memory_system.memory_store) < 2:
return []
# Get all embeddings
embeddings = np.array([
m.embedding for m in self.memory_system.memory_store
], dtype='float32')
# Perform clustering
clustering = DBSCAN(
eps=0.2,
min_samples=2,
metric='cosine'
).fit(embeddings)
# Group memories by cluster
clusters = defaultdict(list)
for idx, label in enumerate(clustering.labels_):
if label != -1: # Ignore noise points
clusters[label].append(self.memory_system.memory_store[idx])
return list(clusters.values())
async def _apply_decay(self) -> List[str]:
"""Apply temporal decay to memory importance"""
decayed = []
for memory in self.memory_system.memory_store:
# Calculate decay based on age and access pattern
age_factor = (datetime.utcnow() - memory.last_accessed).days
decay_amount = self.decay_rate * age_factor
# Apply decay
new_importance = max(0.0, memory.importance - decay_amount)
# Remove if importance too low
if new_importance < 0.1:
self._remove_memory(memory)
decayed.append(memory.id)
else:
memory.importance = new_importance
self.memory_system.importance_scores[memory.id] = new_importance
return decayed
PlantUML Architecture Diagram
@startuml
!theme aws-orange
skinparam backgroundColor #FFFFFF
package "Input Layer" {
[User Interactions] as user
[Agent Conversations] as agent
[System Events] as events
}
package "Memory Processing" {
[Content Processor] as processor
[Importance Calculator] as importance
[Context Extractor] as context
}
package "AWS Bedrock Integration" {
[Titan Embeddings] as titan
[Claude Reasoning] as claude
[Embedding Pipeline] as pipeline
}
package "Vector Storage" {
[FAISS Index] as faiss
[Memory Store] as store
[Metadata DB] as metadata
}
package "Memory Management" {
[Consolidation Engine] as consolidation
[Decay Manager] as decay
[Access Tracker] as tracker
}
package "Reinforcement Learning" {
[Feedback Collector] as feedback
[Reward Calculator] as reward
[Policy Updater] as policy
}
package "Retrieval System" {
[Query Processor] as query
[Similarity Search] as search
[Re-ranking Engine] as rerank
}
package "Output Layer" {
[Memory API] as api
[Context Builder] as builder
}
user --> processor
agent --> processor
events --> processor
processor --> importance
processor --> context
importance --> pipeline
context --> pipeline
pipeline --> titan
pipeline --> faiss
faiss --> store
store --> metadata
consolidation --> store
decay --> store
tracker --> store
query --> search
search --> faiss
search --> rerank
rerank --> api
feedback --> reward
reward --> policy
policy --> rerank
tracker --> feedback
api --> builder
note right of titan
AWS Bedrock:
- Titan embeddings
- 1536 dimensions
- Semantic similarity
end note
note right of consolidation
Memory optimization:
- Cluster similar memories
- Merge redundant info
- Maintain importance
end note
note right of rerank
Factors:
- Semantic similarity
- Importance score
- Temporal decay
- Access frequency
- Context relevance
end note
@enduml
How to Run
# Clone the repository
git clone https://github.com/mohammaddaoudfarooqi/ai-memory.git
cd ai-memory
# Install dependencies
pip install -r requirements.txt
# Configure AWS credentials
aws configure
# Or set environment variables
export AWS_ACCESS_KEY_ID="your-key"
export AWS_SECRET_ACCESS_KEY="your-secret"
export AWS_DEFAULT_REGION="us-east-1"
# Initialize memory system
python initialize_memory.py \
--embedding-model "amazon.titan-embed-text-v1" \
--index-type "faiss" \
--dimension 1536
# Run example agent with memory
python examples/chat_with_memory.py
# API server for memory service
uvicorn api:app --host 0.0.0.0 --port 8000
# Test memory operations
curl -X POST http://localhost:8000/memory/store \
-H "Content-Type: application/json" \
-d '{
"content": "User prefers technical documentation",
"importance": 0.8,
"metadata": {"type": "preference", "user_id": "123"}
}'
Dependencies & Tech Stack
- AWS Bedrock: Titan embeddings and Claude reasoning
- FAISS: High-performance vector similarity search
- NumPy: Numerical computations
- Scikit-learn: Clustering algorithms (DBSCAN)
- FastAPI: REST API framework
- Redis: Caching layer for frequent queries
- PostgreSQL: Metadata and audit storage
Metrics & Impact
- Memory Capacity: 10M+ memories with sub-second retrieval
- Relevance Accuracy: 94% precision in memory retrieval
- Context Retention: 90-day effective memory window
- Performance:
<100msaverage retrieval time - Cost Optimization: 40% reduction through intelligent consolidation
Enterprise Applications
The AI Memory System enables:
- Personalized AI Assistants: Long-term user preference learning
- Customer Service Bots: Maintaining conversation history across interactions
- Knowledge Management: Organizational memory for AI systems
- Adaptive Learning Systems: Educational platforms that remember student progress
- Healthcare AI: Patient history and treatment continuity
Conclusion
The AI Memory System demonstrates how cognitive science principles can enhance AI systems with human-like memory capabilities. By combining AWS Bedrock embeddings, FAISS vector search, and reinforcement learning, the system provides persistent, adaptive memory that improves AI effectiveness over time.
Interested in Similar Results?
Let's discuss how we can architect a solution tailored to your specific challenges and help you move from proof-of-concept to production successfully.