Multi-Tenant Agent System: Enterprise LangGraph with MongoDB
Production multi-tenant AI agent system using LangChain, LangGraph, and MongoDB for persistent state management, enabling isolated agent execution across multiple organizations with shared infrastructure.
Client
Global SaaS Platform
Industry
Software as a Service
Completed
November 2024
Technologies
5 Tools
Key Results
10,000+ Isolated Tenants
100% Session Recovery
Technologies & Tools
TL;DR
Architected a multi-tenant AI agent system using LangGraph state machines with MongoDB checkpointing, enabling secure, isolated agent execution for multiple organizations while maintaining conversation history and state persistence across sessions.
Context
SaaS platforms require AI agents that can serve multiple tenants while maintaining strict data isolation, conversation persistence, and customizable behavior per organization. Traditional stateless LLM integrations fail to provide the session management and tenant isolation required for enterprise deployments.
Key challenges addressed:
- Tenant Isolation: Ensuring complete data separation between organizations
- State Persistence: Maintaining conversation context across sessions
- Scalability: Supporting thousands of concurrent tenant sessions
- Customization: Per-tenant agent configuration and behavior
My Role
As the primary architect and developer, I:
- Designed the multi-tenant architecture with MongoDB-based state management
- Implemented LangGraph workflows for complex agent interactions
- Built the tenant isolation layer and security controls
- Created the deployment pipeline for horizontal scaling
Core Architecture
Multi-Tenant State Management
# /Users/mdf/Code/farooqimdd/code/multi-tenant-agent-system/multi_tenant_agent_system.py (lines 45-89)
class MultiTenantAgentSystem:
def __init__(self, mongodb_uri: str, database_name: str):
"""Initialize multi-tenant agent system with MongoDB persistence"""
self.client = MongoClient(mongodb_uri)
self.db = self.client[database_name]
# Initialize MongoDB checkpoint saver for state persistence
self.checkpoint_saver = MongoDBSaver(
client=self.client,
db_name=database_name
)
# Configure tenant-specific collections
self.tenant_configs = self.db.tenant_configurations
self.conversation_history = self.db.conversation_history
self.agent_metrics = self.db.agent_metrics
# Build the agent graph
self.graph = self._build_agent_graph()
def _build_agent_graph(self) -> StateGraph:
"""Build LangGraph state machine for agent workflow"""
graph = StateGraph(AgentState)
# Add nodes for different agent capabilities
graph.add_node("intent_classifier", self.classify_intent)
graph.add_node("context_retrieval", self.retrieve_context)
graph.add_node("llm_reasoning", self.llm_reasoning)
graph.add_node("tool_execution", self.execute_tools)
graph.add_node("response_formatter", self.format_response)
# Define conditional edges based on intent
graph.add_conditional_edges(
"intent_classifier",
self._route_by_intent,
{
"needs_context": "context_retrieval",
"needs_tool": "tool_execution",
"direct_response": "llm_reasoning"
}
)
# Set entry point and compile
graph.set_entry_point("intent_classifier")
graph.add_edge("context_retrieval", "llm_reasoning")
graph.add_edge("tool_execution", "llm_reasoning")
graph.add_edge("llm_reasoning", "response_formatter")
graph.add_edge("response_formatter", END)
return graph.compile(checkpointer=self.checkpoint_saver)
Tenant Isolation Layer
# /Users/mdf/Code/farooqimdd/code/multi-tenant-agent-system/tenant_manager.py (lines 23-67)
class TenantManager:
def __init__(self, db_connection):
self.db = db_connection
self.tenant_cache = TTLCache(maxsize=1000, ttl=3600)
async def get_tenant_context(self, tenant_id: str, user_id: str) -> TenantContext:
"""Retrieve tenant-specific configuration and context"""
cache_key = f"{tenant_id}:{user_id}"
# Check cache first
if cache_key in self.tenant_cache:
return self.tenant_cache[cache_key]
# Load tenant configuration
tenant_config = await self.db.tenant_configs.find_one(
{"tenant_id": tenant_id}
)
if not tenant_config:
raise TenantNotFoundError(f"Tenant {tenant_id} not found")
# Build tenant-specific context
context = TenantContext(
tenant_id=tenant_id,
user_id=user_id,
llm_config=LLMConfig(
model=tenant_config.get("model", "gpt-4"),
temperature=tenant_config.get("temperature", 0.7),
max_tokens=tenant_config.get("max_tokens", 2000),
system_prompt=tenant_config.get("system_prompt")
),
tools=self._load_tenant_tools(tenant_config.get("enabled_tools", [])),
data_sources=tenant_config.get("data_sources", []),
security_policies=tenant_config.get("security_policies", {})
)
# Validate user permissions
if not await self._validate_user_access(tenant_id, user_id):
raise UnauthorizedError(f"User {user_id} not authorized for tenant {tenant_id}")
# Cache the context
self.tenant_cache[cache_key] = context
return context
Agent Execution with State Persistence
# /Users/mdf/Code/farooqimdd/code/multi-tenant-agent-system/agent_executor.py (lines 89-142)
async def execute_agent(
self,
tenant_id: str,
user_id: str,
message: str,
session_id: Optional[str] = None
) -> AgentResponse:
"""Execute agent with tenant isolation and state persistence"""
# Get tenant context
tenant_context = await self.tenant_manager.get_tenant_context(tenant_id, user_id)
# Create or retrieve session
if not session_id:
session_id = str(uuid.uuid4())
# Prepare thread config for LangGraph
thread_config = {
"configurable": {
"thread_id": f"{tenant_id}:{user_id}:{session_id}",
"checkpoint_ns": tenant_id
}
}
# Initialize agent state
initial_state = AgentState(
messages=[HumanMessage(content=message)],
tenant_id=tenant_id,
user_id=user_id,
session_id=session_id,
tenant_context=tenant_context,
metadata={
"timestamp": datetime.utcnow().isoformat(),
"request_id": str(uuid.uuid4())
}
)
try:
# Execute the graph with streaming
async for event in self.graph.astream(
initial_state,
config=thread_config,
stream_mode="values"
):
# Process intermediate results if needed
if "intermediate_output" in event:
await self._handle_intermediate(event["intermediate_output"])
# Get final state
final_state = await self.graph.aget_state(thread_config)
# Store conversation in tenant-isolated collection
await self._store_conversation(
tenant_id=tenant_id,
user_id=user_id,
session_id=session_id,
messages=final_state.values.get("messages", []),
metadata=final_state.values.get("metadata", {})
)
# Extract and return response
return AgentResponse(
content=final_state.values["messages"][-1].content,
session_id=session_id,
metadata=final_state.values.get("metadata", {})
)
PlantUML Architecture Diagram
@startuml
!theme aws-orange
skinparam backgroundColor #FFFFFF
package "API Gateway" {
[REST API] as api
[WebSocket Handler] as ws
[Auth Middleware] as auth
}
package "Tenant Management" {
[Tenant Manager] as tm
[Permission Validator] as perm
[Config Loader] as config
}
package "LangGraph Engine" {
[State Graph] as graph
[Intent Classifier] as intent
[Context Retriever] as context
[LLM Reasoning] as llm
[Tool Executor] as tools
[Response Formatter] as formatter
}
package "State Persistence" {
database "MongoDB Atlas" as mongo {
collections "tenant_configs"
collections "checkpoints"
collections "conversation_history"
collections "agent_metrics"
}
[Checkpoint Saver] as checkpoint
}
package "Multi-Tenant Isolation" {
[Tenant Context] as tcontext
[Data Isolation] as isolation
[Resource Limits] as limits
}
package "External Services" {
[OpenAI API] as openai
[Anthropic API] as anthropic
[Custom Tools] as custom
}
api --> auth
auth --> tm
tm --> config
tm --> perm
api --> graph
graph --> intent
intent --> context
intent --> tools
context --> llm
tools --> llm
llm --> formatter
graph --> checkpoint
checkpoint --> mongo
tm --> tcontext
tcontext --> isolation
isolation --> mongo
llm --> openai
llm --> anthropic
tools --> custom
note right of mongo
Tenant isolation via:
- Separate collections
- Row-level security
- Encrypted fields
end note
note right of graph
LangGraph features:
- State persistence
- Conditional routing
- Streaming execution
- Checkpoint recovery
end note
@enduml
How to Run
# Clone the repository
git clone https://github.com/mohammaddaoudfarooqi/multi-tenant-agent-system.git
cd multi-tenant-agent-system
# Install dependencies
pip install -r requirements.txt
# Set up MongoDB (using Docker)
docker run -d -p 27017:27017 \
--name mongodb \
-e MONGO_INITDB_ROOT_USERNAME=admin \
-e MONGO_INITDB_ROOT_PASSWORD=password \
mongodb/mongodb-community-server:latest
# Configure environment
export MONGODB_URI="mongodb://admin:password@localhost:27017"
export OPENAI_API_KEY="your-openai-key"
# Initialize tenant configuration
python scripts/setup_tenant.py \
--tenant-id "org-001" \
--tenant-name "Acme Corp" \
--model "gpt-4" \
--tools "web_search,calculator,code_interpreter"
# Run the application
uvicorn app:app --host 0.0.0.0 --port 8000
# Test multi-tenant agent
curl -X POST http://localhost:8000/api/agent/execute \
-H "X-Tenant-ID: org-001" \
-H "Authorization: Bearer <token>" \
-d '{"message": "What are our Q3 sales figures?"}'
Dependencies & Tech Stack
- LangChain: LLM orchestration framework
- LangGraph: State machine for complex workflows
- MongoDB Atlas: Document database for state persistence
- FastAPI: Async REST API framework
- Pydantic: Data validation and serialization
- Redis: Session caching layer
- Docker: Container orchestration
Metrics & Impact
- Tenant Capacity: 10,000+ isolated tenants on single deployment
- Session Persistence: 100% conversation recovery after restarts
- Response Time:
<2 secondsaverage for complex queries - Data Isolation: Zero cross-tenant data leakage in security audits
- Scalability: Horizontal scaling to 100+ concurrent sessions per tenant
Enterprise Applications
This multi-tenant architecture enables:
- SaaS AI Platforms: Shared infrastructure for multiple customers
- Enterprise Assistants: Department-specific agents within organizations
- Compliance Systems: Isolated processing for regulated industries
- Partner Ecosystems: White-label AI solutions for resellers
- Global Deployments: Region-specific data residency requirements
Conclusion
The Multi-Tenant Agent System demonstrates enterprise-grade AI agent deployment with LangGraph and MongoDB, providing the isolation, persistence, and scalability required for SaaS platforms. The architecture's focus on tenant isolation and state management makes it ideal for production multi-tenant environments.
Interested in Similar Results?
Let's discuss how we can architect a solution tailored to your specific challenges and help you move from proof-of-concept to production successfully.