Architecture Guide#
This document describes the system architecture and design principles of Cloud Native MCP Server.
Table of Contents#
- Overview
- System Architecture
- Core Components
- Service Integration
- Data Flow
- Design Principles
- Performance Optimization
- Scalability
- Observability
- Deployment Architecture
Overview#
Cloud Native MCP Server is a high-performance Model Context Protocol (MCP) server for managing Kubernetes and cloud-native infrastructure. It adopts a modular design with support for multiple runtime modes and protocols.
Architecture Goals#
- High Performance: Optimized caching, connection pooling, and resource management
- Scalability: Modular design, easy to add new services
- Security: Multi-layer authentication, input sanitization, and audit logging
- Observability: Built-in metrics, logging, and tracing
- Reliability: Health checks, retry mechanisms, and graceful degradation
System Architecture#
┌─────────────────────────────────────────────────────────────┐
│ Client │
│ (Claude Desktop, Browser, Custom MCP Clients) │
└────────────────────┬────────────────────────────────────────┘
│
│ MCP Protocol (SSE/Streamable-HTTP)
│
┌────────────────────▼────────────────────────────────────────┐
│ HTTP Server │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Routing Layer (SSE/Streamable-HTTP) │ │
│ └────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Middleware Layer │ │
│ │ - Authentication (API Key/Bearer/Basic) │ │
│ │ - Audit Logging │ │
│ │ - Rate Limiting │ │
│ │ - Security Middleware │ │
│ │ - Metrics Collection │ │
│ └────────────────────────────────────────────────────────┘ │
└────────────────────┬────────────────────────────────────────┘
│
│
┌────────────────────▼────────────────────────────────────────┐
│ Service Management Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │Kubernetes│ │ Helm │ │ Grafana │ │Prometheus│ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Kibana │ │Elastic │ │ AlertMgr │ │ Jaeger │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ ┌──────────┐ ┌──────────┐ │
│ │ Otel │ │Utilities │ │
│ └──────────┘ └──────────┘ │
└────────────────────┬────────────────────────────────────────┘
│
│
┌────────────────────▼────────────────────────────────────────┐
│ Infrastructure Layer │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Cache Layer (LRU/Segmented) │ │
│ └────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Secret Management │ │
│ └────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Logging System │ │
│ └────────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Metrics System │ │
│ └────────────────────────────────────────────────────────┘ │
└────────────────────┬────────────────────────────────────────┘
│
│
┌────────────────────▼────────────────────────────────────────┐
│ External Services │
│ Kubernetes Cluster, Grafana, Prometheus, ES, etc. │
└─────────────────────────────────────────────────────────────┘
Core Components#
1. HTTP Server#
Responsibility: Handle incoming HTTP/SSE requests and connections
Features:
- Support for multiple runtime modes (SSE, Streamable-HTTP)
- Configurable timeouts and connection limits
- Graceful shutdown
- Health check endpoints
Key Files:
- cmd/server/server.go
- internal/middleware/
2. Routing Layer#
Responsibility: Route requests to the correct services and tools
Features:
- Dynamic route registration
- Path parameter parsing
- Query parameter validation
- Error handling
Key Files:
internal/services/registry.go
3. Middleware Layer#
Responsibility: Execute common logic before and after request processing
Middlewares:
- Authentication: API Key, Bearer Token, Basic Auth
- Audit Logging: Record all operations
- Rate Limiting: Prevent abuse
- Security: Input sanitization and validation
- Metrics: Collect performance metrics
Key Files:
- internal/middleware/auth_middleware.go
- internal/middleware/audit_middleware.go
- internal/middleware/ratelimit.go
- internal/middleware/security_middleware.go
- internal/middleware/metrics_middleware.go
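A middleware layer of this kind is commonly built by wrapping `http.Handler` values. The following is a minimal sketch, not the project's actual code: `Chain`, `apiKeyAuth`, `audit`, `runChain`, and the `X-API-Key` header name are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with logic that runs before and/or after it.
type Middleware func(http.Handler) http.Handler

// Chain applies middlewares so that the first one listed runs outermost.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// apiKeyAuth rejects requests whose X-API-Key header does not match want.
// The header name is an assumption made for this sketch.
func apiKeyAuth(want string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if r.Header.Get("X-API-Key") != want {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// audit appends an entry to log before and after the wrapped handler runs.
func audit(log *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*log = append(*log, "start "+r.URL.Path)
			next.ServeHTTP(w, r)
			*log = append(*log, "end "+r.URL.Path)
		})
	}
}

// runChain sends one in-memory request through the chain and reports the
// status code together with the audit entries that were recorded.
func runChain(key string) (int, []string) {
	var log []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	h := Chain(final, apiKeyAuth("secret"), audit(&log))
	req := httptest.NewRequest(http.MethodGet, "/tools/call", nil)
	req.Header.Set("X-API-Key", key)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, log
}

func main() {
	code, log := runChain("secret")
	fmt.Println(code, log)
	code, log = runChain("wrong")
	fmt.Println(code, log)
}
```

In this sketch authentication runs outermost, so a rejected request never reaches the audit middleware. A deployment that must audit failed logins would list `audit` first instead.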
4. Service Manager#
Responsibility: Manage all registered services and tools
Features:
- Service registration and discovery
- Tool call routing
- Service lifecycle management
- Health check coordination
Key Files:
internal/services/manager/manager.go
5. Cache Layer#
Responsibility: Provide high-performance caching to reduce external service calls
Features:
- LRU cache
- Segmented cache
- TTL support
- Cache statistics
Key Files:
internal/services/cache/
6. Secret Manager#
Responsibility: Securely store and manage sensitive credentials
Features:
- In-memory storage
- Key rotation
- Key generation
- Expiration management
Key Files:
internal/secrets/manager.go
7. Logging System#
Responsibility: Structured logging
Features:
- Multiple log levels (debug, info, warn, error)
- JSON and text formats
- Structured fields
- Context support
Key Files:
internal/logging/logging.go
8. Metrics System#
Responsibility: Collect and expose performance metrics
Features:
- Prometheus format
- Request counts
- Latency statistics
- Cache hit rates
Key Files:
internal/observability/metrics/
Service Integration#
Service Interface#
All services implement a unified interface.
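The original interface definition is not reproduced on this page. As a rough sketch of what a unified service interface could look like, the example below assumes a hypothetical method set (`Name`, `Tools`, `Call`, `HealthCheck`); the real interface may differ.

```go
package main

import (
	"context"
	"fmt"
)

// Tool describes one callable operation exposed by a service.
type Tool struct {
	Name        string
	Description string
}

// Service is a hypothetical unified interface; the method set is assumed.
type Service interface {
	Name() string
	Tools() []Tool
	Call(ctx context.Context, tool string, args map[string]any) (any, error)
	HealthCheck(ctx context.Context) error
}

// echoService is a toy implementation used only to exercise the interface.
type echoService struct{}

func (echoService) Name() string { return "echo" }

func (echoService) Tools() []Tool {
	return []Tool{{Name: "say", Description: "returns the msg argument"}}
}

func (echoService) Call(_ context.Context, tool string, args map[string]any) (any, error) {
	if tool != "say" {
		return nil, fmt.Errorf("echo: unknown tool %q", tool)
	}
	return args["msg"], nil
}

func (echoService) HealthCheck(context.Context) error { return nil }

// callTool dispatches through the interface, as a service manager might.
func callTool(s Service, tool, msg string) (any, error) {
	return s.Call(context.Background(), tool, map[string]any{"msg": msg})
}

func main() {
	var s Service = echoService{}
	out, err := callTool(s, "say", "hello")
	fmt.Println(s.Name(), out, err)
}
```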
Service Registration#
Services are automatically registered at startup.
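One way startup registration could work is a registry keyed by service name that is filled from the enabled-services configuration. This is a sketch under assumptions: `Registry`, `registerAll`, and the `stub` type are hypothetical names, not the contents of `internal/services/registry.go`.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Service is reduced to its name for this sketch.
type Service interface{ Name() string }

// Registry stores services by name and is safe for concurrent use.
type Registry struct {
	mu       sync.RWMutex
	services map[string]Service
}

func NewRegistry() *Registry {
	return &Registry{services: make(map[string]Service)}
}

// Register adds a service and rejects duplicate names.
func (r *Registry) Register(s Service) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.services[s.Name()]; exists {
		return fmt.Errorf("service %q is already registered", s.Name())
	}
	r.services[s.Name()] = s
	return nil
}

// Names returns the registered service names in sorted order.
func (r *Registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.services))
	for name := range r.services {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// stub is a placeholder service that only knows its own name.
type stub string

func (s stub) Name() string { return string(s) }

// registerAll registers every service whose flag is true, which mimics
// configuration-driven enable/disable at startup.
func registerAll(r *Registry, enabled map[string]bool) error {
	for name, on := range enabled {
		if !on {
			continue
		}
		if err := r.Register(stub(name)); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	r := NewRegistry()
	err := registerAll(r, map[string]bool{"kubernetes": true, "helm": true, "jaeger": false})
	fmt.Println(r.Names(), err)
}
```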
Tool Call Flow#
- Client sends tool call request
- Routing layer parses request, determines service and tool
- Middleware layer executes authentication, audit, etc.
- Service manager routes to correct service
- Cache layer checks cache
- Service executes tool call
- Result returned to client
- Audit log records operation
Data Flow#
Request Flow#
Client
│
├─> HTTP/SSE Connection
│
├─> Authentication Middleware
│ ├─> Validate API Key/Token
│ └─> Check permissions
│
├─> Rate Limiting Middleware
│ └─> Check quota
│
├─> Routing Layer
│ └─> Parse service and method
│
├─> Audit Middleware
│ └─> Record request start
│
├─> Service Manager
│ └─> Route to service
│
├─> Cache Layer
│ ├─> Check cache
│ └─> Return cached result or continue
│
├─> Service
│ ├─> Call external API
│ ├─> Process response
│ └─> Update cache
│
├─> Audit Middleware
│ └─> Record request completion
│
├─> Metrics Middleware
│ └─> Record metrics
│
└─> Response returned to client
Response Flow#
Service
│
├─> Process result
│
├─> Data Transformation
│ ├─> Formatting
│ └─> Compression
│
├─> Cache Update
│ └─> Store in cache
│
├─> Metrics Update
│ └─> Record performance metrics
│
└─> Return response
Design Principles#
1. Modularity#
Each service is an independent module that can be enabled or disabled individually.
2. Scalability#
Easy to add new services:
- Create service directory
- Implement service interface
- Register tools
- Configure options
3. Configuration Driven#
All behavior is controlled through configuration:
- Service enable/disable
- Authentication method
- Cache strategy
- Log level
4. Fault Isolation#
A failure in one service does not affect the other services.
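A common way to achieve this kind of isolation in Go is to recover from panics at the service boundary and turn them into ordinary errors. The sketch below assumes hypothetical names (`toolFunc`, `safeCall`, `callAll`) and toy service bodies.

```go
package main

import (
	"errors"
	"fmt"
)

// toolFunc stands in for a call into one service.
type toolFunc func() (string, error)

// safeCall runs fn and converts a panic into an error, so that a crash in
// one service cannot take down the caller or the other services.
func safeCall(service string, fn toolFunc) (out string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("service %s panicked: %v", service, r)
		}
	}()
	return fn()
}

// callAll invokes three toy services: one panics, one fails, one succeeds.
func callAll() map[string]string {
	services := map[string]toolFunc{
		"grafana":    func() (string, error) { panic("nil dashboard") },
		"helm":       func() (string, error) { return "", errors.New("timeout") },
		"kubernetes": func() (string, error) { return "3 pods", nil },
	}
	results := make(map[string]string)
	for name, fn := range services {
		out, err := safeCall(name, fn)
		if err != nil {
			results[name] = "error: " + err.Error()
			continue
		}
		results[name] = out
	}
	return results
}

func main() {
	for name, result := range callAll() {
		fmt.Printf("%s -> %s\n", name, result)
	}
}
```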
5. Graceful Degradation#
Return friendly errors when services are unavailable.
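As an illustration, low-level failures can be mapped to a small error type that carries a readable message and a retry hint. `ToolError`, `ErrUnavailable`, and `friendly` are hypothetical names for this sketch, not identifiers from the project.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrUnavailable marks failures caused by an unreachable backend (assumed).
var ErrUnavailable = errors.New("service unavailable")

// ToolError is a client-facing error that hides low-level details.
type ToolError struct {
	Service   string
	Message   string
	Retryable bool
}

func (e *ToolError) Error() string { return e.Service + ": " + e.Message }

// friendly maps a raw error to a ToolError. It returns nil for a nil error.
func friendly(service string, err error) *ToolError {
	switch {
	case err == nil:
		return nil
	case errors.Is(err, ErrUnavailable), errors.Is(err, context.DeadlineExceeded):
		return &ToolError{
			Service:   service,
			Message:   "temporarily unavailable, please retry later",
			Retryable: true,
		}
	default:
		return &ToolError{Service: service, Message: "request failed", Retryable: false}
	}
}

func main() {
	raw := fmt.Errorf("dial tcp 10.0.0.5:3000: %w", ErrUnavailable)
	fmt.Println(friendly("grafana", raw))
	fmt.Println(friendly("grafana", errors.New("unexpected EOF")))
}
```

The raw error, including the address, stays out of the client-facing message and can still be written to the server log.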
Performance Optimization#
1. Caching Strategy#
LRU Cache#
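For readers unfamiliar with the technique, here is a minimal LRU cache sketch built on the standard `container/list` package. It is not the implementation in `internal/services/cache/`; the `LRU` type and its methods are assumptions, and the sketch is not safe for concurrent use without an added mutex.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is the value stored in each list element.
type entry struct {
	key   string
	value any
}

// LRU evicts the least recently used key once capacity is exceeded.
type LRU struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func NewLRU(capacity int) *LRU {
	return &LRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached value and marks the key as recently used.
func (c *LRU) Get(key string) (any, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put inserts or updates a key, evicting the oldest entry when full.
func (c *LRU) Put(key string, value any) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := NewLRU(2)
	c.Put("pods", 3)
	c.Put("nodes", 5)
	c.Get("pods")        // "pods" is now the most recently used key
	c.Put("services", 7) // exceeds capacity, so "nodes" is evicted
	_, ok := c.Get("nodes")
	fmt.Println("nodes cached:", ok)
}
```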
Use Cases:
- Read-intensive operations
- Infrequently changing data
- High latency operations
Segmented Cache#
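The page does not define "segmented" precisely. One plausible reading, sketched below, is a cache split into named segments, each with its own TTL and its own lock so that different data types do not contend with each other. `SegmentedCache`, the segment names, and the TTL values are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type item struct {
	value   any
	expires time.Time
}

// segment holds one category of data with its own lock and TTL.
type segment struct {
	mu    sync.RWMutex
	ttl   time.Duration
	items map[string]item
}

// SegmentedCache maps segment names to independent segments.
type SegmentedCache struct {
	segments map[string]*segment // fixed after construction, so no lock needed
	now      func() time.Time    // injectable clock, useful in tests
}

func NewSegmentedCache(ttls map[string]time.Duration) *SegmentedCache {
	c := &SegmentedCache{segments: make(map[string]*segment, len(ttls)), now: time.Now}
	for name, ttl := range ttls {
		c.segments[name] = &segment{ttl: ttl, items: make(map[string]item)}
	}
	return c
}

// Set stores value under key and reports false for an unknown segment.
func (c *SegmentedCache) Set(seg, key string, value any) bool {
	s, ok := c.segments[seg]
	if !ok {
		return false
	}
	s.mu.Lock()
	s.items[key] = item{value: value, expires: c.now().Add(s.ttl)}
	s.mu.Unlock()
	return true
}

// Get returns a value only if it exists and has not expired.
// Expired entries are ignored here rather than actively purged.
func (c *SegmentedCache) Get(seg, key string) (any, bool) {
	s, ok := c.segments[seg]
	if !ok {
		return nil, false
	}
	s.mu.RLock()
	it, found := s.items[key]
	s.mu.RUnlock()
	if !found || !c.now().Before(it.expires) {
		return nil, false
	}
	return it.value, true
}

// expiryDemo advances a fake clock by one minute and reports which of two
// segments still serves its entry.
func expiryDemo() (podsFresh, dashFresh bool) {
	clock := time.Unix(0, 0)
	c := NewSegmentedCache(map[string]time.Duration{
		"pods":       10 * time.Second,
		"dashboards": 5 * time.Minute,
	})
	c.now = func() time.Time { return clock }
	c.Set("pods", "default", 3)
	c.Set("dashboards", "home", "uid-1")
	clock = clock.Add(time.Minute)
	_, podsFresh = c.Get("pods", "default")
	_, dashFresh = c.Get("dashboards", "home")
	return
}

func main() {
	pods, dash := expiryDemo()
	fmt.Println("pods fresh:", pods, "dashboards fresh:", dash)
}
```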
Use Cases:
- Different types of data
- Need different TTLs
- Concurrent access
2. Connection Pooling#
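In Go, HTTP connection pooling is typically configured on `http.Transport`. The sketch below uses illustrative limits, not the server's defaults; `newPooledClient` and `reuseDemo` are hypothetical helpers. The demo uses `net/http/httptrace` against a local test server to observe whether a connection was reused.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// newPooledClient returns a client that keeps idle connections for reuse.
// The limits below are example values chosen for this sketch.
func newPooledClient() *http.Client {
	return &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			MaxIdleConns:        100,
			MaxIdleConnsPerHost: 10,
			IdleConnTimeout:     90 * time.Second,
		},
	}
}

// reuseDemo issues two requests to a local test server and reports whether
// each request ran on a reused connection.
func reuseDemo() (first, second bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		io.WriteString(w, "ok")
	}))
	defer srv.Close()

	client := newPooledClient()
	get := func() (reused bool, err error) {
		trace := &httptrace.ClientTrace{
			GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
		}
		ctx := httptrace.WithClientTrace(context.Background(), trace)
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
		if err != nil {
			return false, err
		}
		resp, err := client.Do(req)
		if err != nil {
			return false, err
		}
		defer resp.Body.Close()
		// Drain the body so the connection can go back to the idle pool.
		_, err = io.Copy(io.Discard, resp.Body)
		return reused, err
	}

	if first, err = get(); err != nil {
		return
	}
	second, err = get()
	return
}

func main() {
	first, second, err := reuseDemo()
	fmt.Println("first reused:", first, "second reused:", second, "err:", err)
}
```

Reuse depends on the response body being fully read and closed, which is why the helper drains it before returning.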
3. Response Path Optimizations#
Response compression, truncation, and JSON pipeline optimizations are handled internally.
For performance tuning, use the supported public configuration knobs (server, kubernetes, ratelimit).
4. JSON Encoding Pool#
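A JSON encoding pool usually means reusing buffers across requests to cut allocations. A minimal sketch with `sync.Pool` follows; `bufPool` and `encodeJSON` are hypothetical names, and the project's internal pool may be structured differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so each encode avoids a fresh allocation.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// encodeJSON encodes v using a pooled buffer and returns a private copy.
func encodeJSON(v any) ([]byte, error) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	if err := json.NewEncoder(buf).Encode(v); err != nil {
		return nil, err
	}
	// Copy out, because the buffer's memory is reused after Put.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out, nil
}

func main() {
	b, err := encodeJSON(map[string]int{"pods": 3})
	fmt.Printf("%q %v\n", b, err)
}
```

Note that `json.Encoder.Encode` appends a trailing newline to the output.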
5. Batching#
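The page gives no detail on how batching is applied. As a generic illustration only, the helper below splits a slice of work items into fixed-size batches so that each batch can be sent as one call to an external service. The `batch` function is a hypothetical helper, not part of the project.

```go
package main

import "fmt"

// batch splits items into consecutive groups of at most size elements.
// It returns nil when size is not positive.
func batch[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for size < len(items) {
		// The full slice expression caps capacity so that appending to one
		// batch cannot overwrite the next one.
		out = append(out, items[:size:size])
		items = items[size:]
	}
	if len(items) > 0 {
		out = append(out, items)
	}
	return out
}

func main() {
	pods := []string{"api-0", "api-1", "db-0", "db-1", "cache-0"}
	for i, group := range batch(pods, 2) {
		fmt.Println("batch", i, group)
	}
}
```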
Scalability#
Adding a New Service#
- Create Service Directory
- Implement Service Interface
- Register Service
- Add Configuration
Custom Tools#
Observability#
Metrics#
Request Metrics#
Cache Metrics#
Connection Metrics#
Logging#
Structured Logging#
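To illustrate structured JSON logging, the sketch below uses the standard `log/slog` package (Go 1.21 or later). Whether `internal/logging/logging.go` is built on `slog` is not stated here, so treat that choice as an assumption; `logToolCall`, `hasField`, and the field names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// logToolCall writes one JSON log line for a tool call and returns it.
// A real server would write to os.Stdout or a file instead of a buffer.
func logToolCall(service, tool string) string {
	var buf bytes.Buffer
	handler := slog.NewJSONHandler(&buf, &slog.HandlerOptions{Level: slog.LevelDebug})
	// With attaches fields that appear on every line from this logger.
	logger := slog.New(handler).With("component", "service-manager")
	logger.Info("tool call completed", "service", service, "tool", tool)
	return buf.String()
}

// hasField reports whether a JSON log line has key set to the string want.
func hasField(line, key, want string) bool {
	var fields map[string]any
	if err := json.Unmarshal([]byte(line), &fields); err != nil {
		return false
	}
	got, ok := fields[key].(string)
	return ok && got == want
}

func main() {
	line := logToolCall("kubernetes", "list_pods")
	fmt.Print(line)
	fmt.Println("has service field:", hasField(line, "service", "kubernetes"))
}
```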
Tracing#
OpenTelemetry Integration#
Deployment Architecture#
Single Node Deployment#
┌─────────────────┐
│ MCP Server │
│ (All Services) │
└────────┬────────┘
│
├─> Kubernetes
├─> Grafana
├─> Prometheus
└─> ...
Multi-Node Deployment#
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ MCP Node 1 │ │ MCP Node 2 │ │ MCP Node 3 │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
└─────────────────┴─────────────────┘
│
▼
┌──────────────────┐
│ Load Balancer │
└────────┬─────────┘
│
▼
┌──────────────────┐
│ External Services│
└──────────────────┘
Microservices Deployment#
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ MCP Gateway │ │ MCP Service │ │ MCP Service │
│ (Router) │ │ (Kubernetes) │ │ (Grafana) │
└──────┬───────┘ └──────────────┘ └──────────────┘
│
▼
┌──────────────────┐
│ Service Mesh │
│ (mTLS, Routing) │
└──────────────────┘