It's My Birthday and I'm Giving YOU the Gift: Serverless MCP Servers That Cost Less Than Birthday Cake

How serverless architecture is transforming the way AI assistants connect to your tools and data
It's my birthday! To celebrate, I want to teach you about creating SERVERLESS MCPs!
In lieu of a present, I'll accept connections on LinkedIn.
Here's what this blog post, when deployed, will look like:

Find the complete code on GitHub.
The Problem: AI Assistants Need Better Access to Your World
Picture this: You're using Claude to help with your daily work, but it needs access to your company's internal APIs, databases, or custom tools. Traditional approaches involve:
Hard-coding integrations that break with every update
Managing always-on servers that sit idle 99% of the time
Complex authentication flows that are either insecure or user-hostile
Paying for infrastructure whether it's used or not
Enter the Model Context Protocol (MCP) - Anthropic's game-changing open standard that's revolutionizing how AI assistants interact with the digital world. Learn more at modelcontextprotocol.io
Why Serverless MCP Changes Everything
Here's the killer combination: MCP + Serverless = Magic
MCP servers are naturally request/response based - they wake up when Claude needs something, do their job, and go back to sleep. This is exactly what serverless excels at. You're not paying for idle servers waiting for the occasional request from an AI assistant. You're paying for actual usage - pennies per thousands of requests.
But here's where it gets really interesting: The serverless advantage compounds when you consider the MCP ecosystem. Imagine thousands of specialized MCP servers, each handling different tools and data sources:
Your CRM integration MCP server
Your analytics dashboard MCP server
Your code deployment MCP server
Your customer support MCP server
In a traditional architecture, you'd need infrastructure for each. With serverless? They all scale to zero when not in use. You could have 100 MCP servers and pay nothing when they're idle.
What We're Building: A Production Blueprint
This isn't another "Hello World" tutorial. We're building a production-ready, secure, scalable MCP server that demonstrates enterprise-grade patterns you can use immediately. Our example - a Dog Facts server - is intentionally simple so we can focus on the architecture that matters.
The Three-Pillar Architecture
We've designed a modular system with three distinct components, each handling a critical piece of the puzzle:
┌─────────────────────┐    ┌─────────────────────┐    ┌────────────────────────┐
│  McpAuthConstruct   │───▶│ McpLambdaConstruct  │───▶│ McpApiGatewayConstruct │
│  "The Gatekeeper"   │    │    "The Worker"     │    │     "The Gateway"      │
│                     │    │                     │    │                        │
│ • OAuth 2.0 + PKCE  │    │ • Your Logic Here   │    │ • Auto-discovery       │
│ • Self-registration │    │ • Pay-per-request   │    │ • RFC 9728 Support     │
│ • Enterprise SSO    │    │ • Scales to zero    │    │ • Custom domains       │
└─────────────────────┘    └─────────────────────┘    └────────────────────────┘
The Authentication Revolution: Custom Dynamic Client Registration
Here's something most developers don't know exists: Dynamic Client Registration (DCR). It's OAuth 2.0's best-kept secret and it's perfect for the MCP ecosystem.
Instead of manually configuring OAuth clients for every tool that wants to connect to your MCP server, DCR allows clients to register themselves programmatically. Claude Code (an MCP client in Anthropic's tooling) can literally say "Hi, I'd like to access this MCP server" and get its own OAuth credentials automatically.
Critical Implementation Detail: AWS Cognito doesn't natively implement RFC 7591 Dynamic Client Registration. What we're building is a custom DCR-compatible endpoint that uses API Gateway and VTL templates to call Cognito's CreateUserPoolClient API. This gives us the DCR workflow while working within AWS's authentication infrastructure, but with limitations: our endpoint is an API Gateway facade over CreateUserPoolClient, not full RFC 7591 registration semantics.
Why This Matters for MCP
Traditional OAuth flow:
Developer manually creates OAuth client in console
Copies client ID and secret
Configures application
Hopes nothing breaks
DCR-enabled MCP flow:
Claude Code discovers your MCP server
Registers itself automatically
Starts using your tools immediately
This is the difference between minutes and days of setup time.
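To make the self-registration step a little more concrete, here is a minimal sketch of what a client might send. The `client_name` and `redirect_uris` fields and the `/connect/register` path match the implementation covered later in this post; `buildRegistrationRequest`, `registerClient`, and the interfaces are hypothetical names used only for illustration, not what Claude Code actually runs.

```typescript
// Hypothetical sketch of a client registering itself with a DCR-style endpoint.
// Helper and type names are illustrative only.

interface RegistrationRequest {
  client_name: string;
  redirect_uris: string[];
}

interface RegistrationResponse {
  client_id: string;
  client_name: string;
  redirect_uris: string[];
}

// Build the JSON body a client would POST to the registration endpoint.
function buildRegistrationRequest(clientName: string, redirectUris: string[]): RegistrationRequest {
  if (redirectUris.length === 0) {
    throw new Error("At least one redirect URI is required");
  }
  return { client_name: clientName, redirect_uris: redirectUris };
}

// POST the request. Assumes the endpoint lives at /connect/register on the server's domain.
async function registerClient(baseUrl: string, request: RegistrationRequest): Promise<RegistrationResponse> {
  const endpoint = new URL("/connect/register", baseUrl).toString();
  const response = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  if (!response.ok) {
    throw new Error(`Registration failed with status ${response.status}`);
  }
  return (await response.json()) as RegistrationResponse;
}
```

The returned `client_id` is all the client needs to start an authorization code flow with PKCE, since no secret is issued.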
The Implementation Secret
Here's a crucial implementation detail that took me longer than I'd like to admit to figure out:
// IMPORTANT: Must use CLASSIC_HOSTED_UI for DCR support
managedLoginVersion: ManagedLoginVersion.CLASSIC_HOSTED_UI,
AWS Cognito's newer managed login UI doesn't support dynamically registered clients because each client requires a theme configuration that doesn't get created automatically. This is a practical gotcha discovered during implementation. Use Classic Hosted UI for DCR compatibility until AWS guarantees branding defaults for dynamic clients. See Managed Login documentation for the newer option's requirements.
The Serverless Advantage in Action
Let's talk real numbers. A typical MCP server handling 1,000 requests per day:
Traditional Server (t3.micro):
Monthly cost: ~$7.50*
Always running, mostly idle
Fixed capacity
Maintenance overhead
Serverless MCP:
Monthly cost: ~$0.20**
Scales to zero
Scales automatically within account concurrency and per-function limits
Minimal ops overhead (soft limit defaults, request increases available via AWS Service Quotas)
* EC2 instance cost only, excluding storage, data transfer, load balancer (~$16-25/month)
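** Lambda: 30,000 invocations × 100ms avg × 2GB (2048MB) × $0.0000166667/GB-second ≈ $0.10/month + $0.006 requests ≈ $0.11/month. API Gateway REST requests ($3.50 per million) add roughly another $0.11, for about $0.20 total (AWS Lambda Pricing)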
But here's the real kicker: Most MCP servers will handle far fewer than 1,000 requests per day. Your internal tool integration might get 10 requests a week. With serverless, you're paying fractions of a penny.
More importantly... if you practice Domain-Driven Design and end up with one MCP server per knowledge domain, the always-on approach costs ~$7.50 per month per knowledge domain. If you're an indie dev and want to incorporate MCPs into your side projects, that could end up being a lot, OR you have to combine them and manage them more carefully. If you're practicing with ephemeral CDK stacks, maybe that's not a big deal though.
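If you want to run the numbers for your own traffic, here is a small sketch of the arithmetic. The Lambda prices are the x86 list prices quoted in the footnote; the $3.50 per million REST API request price and the function names are my assumptions for illustration, and the free tier is ignored.

```typescript
// Hypothetical cost helpers. Prices are list prices and may differ by region or over time.
const PRICE_PER_GB_SECOND = 0.0000166667; // Lambda compute, x86
const PRICE_PER_MILLION_INVOCATIONS = 0.2; // Lambda requests
const PRICE_PER_MILLION_REST_REQUESTS = 3.5; // API Gateway REST API (assumed first pricing tier)

// Lambda cost = GB-seconds consumed + per-request charge.
function lambdaMonthlyCost(invocations: number, avgDurationMs: number, memoryMb: number): number {
  const gbSeconds = invocations * (avgDurationMs / 1000) * (memoryMb / 1024);
  const compute = gbSeconds * PRICE_PER_GB_SECOND;
  const requests = (invocations / 1_000_000) * PRICE_PER_MILLION_INVOCATIONS;
  return compute + requests;
}

// API Gateway charges per request received.
function restApiMonthlyCost(requests: number): number {
  return (requests / 1_000_000) * PRICE_PER_MILLION_REST_REQUESTS;
}

// 1,000 requests/day for 30 days, 100 ms average, 2048 MB
const lambda = lambdaMonthlyCost(30_000, 100, 2048);
const gateway = restApiMonthlyCost(30_000);
console.log(lambda.toFixed(2)); // "0.11"
console.log((lambda + gateway).toFixed(2)); // "0.21"
```

Dropping the request count to 10 per week puts the result well under a cent, which is the point of the scale-to-zero argument.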
Building Your First Serverless MCP Server
Let's build something real. Here's our complete Dog Facts MCP server:
export class DogFactsServer implements IMCPServer {
  initialize(): MCPInitializeResult {
    return {
      protocolVersion: "2025-06-18", // MCP specification revision this server targets
      capabilities: { tools: {} },
      serverInfo: { name: "dog-facts-server", version: "1.0.0" }
    };
  }

  listTools(): MCPToolsListResult {
    return {
      tools: [{
        name: "getDogFacts",
        description: "Get random facts about dogs",
        inputSchema: {
          type: "object",
          properties: {
            limit: {
              type: "number",
              description: "Maximum facts to return (1-10)",
              minimum: 1,
              maximum: 10,
              default: 5
            }
          }
        }
      }]
    };
  }

  async callTool(params: MCPToolCallParams): Promise<MCPToolCallResult> {
    const { name, arguments: args } = params;
    if (name === "getDogFacts") {
      // Clamp the requested limit to the 1-10 range advertised in the schema
      const limit = Math.min(Math.max((args?.limit as number) || 5, 1), 10);
      const response = await fetch(`https://dogapi.dog/api/v2/facts?limit=${limit}`);
      // Surface upstream failures instead of trying to parse an error page
      if (!response.ok) {
        throw new Error(`Dog API request failed with status ${response.status}`);
      }
      const data = (await response.json()) as { data: { attributes: { body: string } }[] };
      const facts = data.data.map((fact) => fact.attributes.body);
      return {
        content: [{
          type: "text",
          text: facts.map((fact, i) => `${i + 1}. ${fact}`).join('\n\n')
        }]
      };
    }
    throw new Error(`Unknown tool: ${name}`);
  }
}
This simple tool (getDogFacts) is just making a fetch request to an external service. Instead you could integrate this with DynamoDB queries, add a knowledge base, react-agents... you can pretty much do anything here.
This is all the code you need. The framework handles:
OAuth authentication
JSON-RPC protocol
Error handling
CORS
Scaling
Monitoring
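To give a feel for what "handles the JSON-RPC protocol" means in practice, here is a minimal sketch of a dispatcher that routes MCP methods to a server object. The method names (`initialize`, `tools/list`, `tools/call`) and error codes come from the MCP and JSON-RPC 2.0 specifications; `MinimalServer` and `dispatch` are simplified stand-ins I made up for this example, not the repository's actual code.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
}

// Simplified stand-in for the IMCPServer interface used in this post.
interface MinimalServer {
  initialize(): unknown;
  listTools(): unknown;
  callTool(params: { name: string; arguments?: Record<string, unknown> }): Promise<unknown>;
}

async function dispatch(server: MinimalServer, request: JsonRpcRequest): Promise<JsonRpcResponse> {
  const base = { jsonrpc: "2.0" as const, id: request.id };
  try {
    switch (request.method) {
      case "initialize":
        return { ...base, result: server.initialize() };
      case "tools/list":
        return { ...base, result: server.listTools() };
      case "tools/call": {
        const name = request.params?.name;
        if (typeof name !== "string") {
          // -32602 is the JSON-RPC "Invalid params" code
          return { ...base, error: { code: -32602, message: "Invalid params: tool name is required" } };
        }
        const args = request.params?.arguments as Record<string, unknown> | undefined;
        return { ...base, result: await server.callTool({ name, arguments: args }) };
      }
      default:
        // -32601 is the JSON-RPC "Method not found" code
        return { ...base, error: { code: -32601, message: `Method not found: ${request.method}` } };
    }
  } catch (err) {
    // -32603 is the JSON-RPC "Internal error" code
    const message = err instanceof Error ? err.message : "Internal error";
    return { ...base, error: { code: -32603, message } };
  }
}
```

A Lambda handler would parse the request body, pass it to something like `dispatch`, and return the serialized response, which is why the server class itself can stay so small.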
Advanced Patterns for Production
Pattern 1: Multi-Tool Servers
Don't create separate MCP servers for related functionality or knowledge domains. Bundle them:
export class CompanyToolsServer implements IMCPServer {
  listTools() {
    return {
      tools: [
        { name: "searchEmployees", /* ... */ },
        { name: "getRoomAvailability", /* ... */ },
        { name: "submitExpenseReport", /* ... */ },
        { name: "checkDeploymentStatus", /* ... */ }
      ]
    };
  }
}
One Lambda, multiple tools, single authentication flow.
Since these are also packaged as CDK constructs, it would be easy to create an inner-sourced construct library with MCPs and include them as default items in your stacks, so each stack could automatically get its own MCP! To be clear, I would centralize Cognito so that you only have to manage one user pool, and signing in once gives you access to all of the MCPs in your AWS account.
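One way to keep a multi-tool server tidy is a registry that maps tool names to handlers, so `listTools` and `callTool` both read from a single source of truth. This is a sketch under my own assumptions: the `ToolDefinition` shape and the stub handlers are hypothetical, and only two of the four tools are filled in.

```typescript
type ToolResult = { content: { type: "text"; text: string }[] };

interface ToolDefinition {
  description: string;
  handler: (args: Record<string, unknown>) => Promise<ToolResult>;
}

const textResult = (text: string): ToolResult => ({ content: [{ type: "text", text }] });

// Hypothetical registry: real handlers would call your directory, deployment system, etc.
const registry = new Map<string, ToolDefinition>([
  ["searchEmployees", {
    description: "Search the employee directory",
    handler: async (args) => textResult(`Searching employees for: ${String(args.query ?? "")}`),
  }],
  ["checkDeploymentStatus", {
    description: "Check the status of a deployment",
    handler: async (args) => textResult(`Deployment ${String(args.deploymentId ?? "latest")}: status unknown (stub)`),
  }],
]);

// Tool listing is derived from the registry, so it cannot drift from the handlers.
function listTools(): { tools: { name: string; description: string }[] } {
  return {
    tools: [...registry.entries()].map(([name, def]) => ({ name, description: def.description })),
  };
}

async function callTool(name: string, args: Record<string, unknown> = {}): Promise<ToolResult> {
  const tool = registry.get(name);
  if (!tool) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return tool.handler(args);
}
```

Adding a tool then means adding one registry entry rather than touching two methods.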
Pattern 2: Async Operations with Step Functions
For long-running operations, combine MCP with Step Functions:
async callTool(params) {
  if (params.name === "generateReport") {
    // Start Step Function execution
    const executionArn = await startReportGeneration(params);
    return {
      content: [{
        type: "text",
        text: `Report generation started. Check status with execution ID: ${executionArn}`
      }]
    };
  }
  // Fail loudly instead of silently returning undefined for other tool names
  throw new Error(`Unknown tool: ${params.name}`);
}
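The pattern only pays off if the assistant can come back and check on the work, so a second status tool is a natural companion. Here is a sketch of that pairing. To keep it runnable without AWS credentials, the `WorkflowClient` interface and the in-memory implementation are hypothetical stand-ins for whatever wraps your Step Functions calls; the `getReportStatus` tool name is also my invention.

```typescript
type ExecutionStatus = "RUNNING" | "SUCCEEDED" | "FAILED";
type ToolResult = { content: { type: "text"; text: string }[] };

// Hypothetical abstraction over a workflow service such as Step Functions.
interface WorkflowClient {
  start(input: Record<string, unknown>): Promise<string>; // resolves to an execution ID
  status(executionId: string): Promise<ExecutionStatus>;
}

// In-memory fake so the sketch runs locally; a real client would call AWS.
class InMemoryWorkflowClient implements WorkflowClient {
  private executions = new Map<string, ExecutionStatus>();
  private counter = 0;

  async start(_input: Record<string, unknown>): Promise<string> {
    const id = `exec-${++this.counter}`;
    this.executions.set(id, "RUNNING");
    return id;
  }

  async status(executionId: string): Promise<ExecutionStatus> {
    const status = this.executions.get(executionId);
    if (!status) throw new Error(`Unknown execution: ${executionId}`);
    return status;
  }

  // Test helper that simulates the workflow finishing.
  complete(executionId: string): void {
    this.executions.set(executionId, "SUCCEEDED");
  }
}

async function callTool(client: WorkflowClient, name: string, args: Record<string, unknown>): Promise<ToolResult> {
  if (name === "generateReport") {
    const executionId = await client.start(args);
    return { content: [{ type: "text", text: `Report generation started. Execution ID: ${executionId}` }] };
  }
  if (name === "getReportStatus") {
    const status = await client.status(String(args.executionId));
    return { content: [{ type: "text", text: `Execution ${String(args.executionId)} is ${status}` }] };
  }
  throw new Error(`Unknown tool: ${name}`);
}
```

Returning immediately with an ID keeps every MCP call well inside the Lambda timeout, no matter how long the report takes.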
The Enterprise Story
Imagine you're a Fortune 500 company with hundreds of internal tools. Traditional approach:
Months of integration work per tool
Expensive API gateway infrastructure
Complex authentication federation
Ongoing maintenance nightmare
With serverless MCP:
Week 1: Deploy MCP framework
Week 2: First 10 tools integrated
Month 1: 50 tools available to Claude
Month 2: Entire organization using AI-powered tools
Cost for 50 MCP servers handling 2M requests/month total: ~$17/month*
Cost for traditional infrastructure: ~$375/month plus maintenance
* 2M Lambda invocations × 150ms × 2GB × $0.0000166667/GB-second = $10/month + $0.40 requests + API Gateway ~$7/month at $3.50 per million REST requests (AWS Lambda Pricing)
Security: Enterprise-Grade by Default
Our implementation includes:
OAuth 2.0 with PKCE: No client secrets, perfect for public clients. Note: Refresh tokens for public clients have shorter lifespans and require proper rotation handling
Cognito User Pools: Enterprise SSO ready
Scope-based authorization: Fine-grained access control
API Gateway throttling: Rate limiting and AWS Shield Standard reduce volumetric risk (organizations should add WAF and rate plans for comprehensive protection)
CloudWatch integration: Full audit trail
VPC options: Private connectivity when needed
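Since PKCE carries most of the weight in that first bullet, here is a short sketch of what a client does before redirecting the user. The S256 derivation follows RFC 7636; `createCodeVerifier`, `createCodeChallenge`, and `buildAuthorizeUrl` are illustrative names of my own, and the URLs in any usage are placeholders.

```typescript
import { createHash, randomBytes } from "node:crypto";

// 32 random bytes encode to a 43-character base64url string, the minimum verifier length.
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256 challenge: base64url(SHA-256(verifier)) with no padding.
function createCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier).digest("base64url");
}

// Build the authorization request URL. The client keeps the verifier and sends only the challenge.
function buildAuthorizeUrl(
  authUrl: string,
  clientId: string,
  redirectUri: string,
  scope: string,
  challenge: string,
): string {
  const url = new URL(authUrl);
  url.search = new URLSearchParams({
    response_type: "code",
    client_id: clientId,
    redirect_uri: redirectUri,
    scope,
    code_challenge: challenge,
    code_challenge_method: "S256",
  }).toString();
  return url.toString();
}
```

When the client later exchanges the authorization code for tokens, it sends the original verifier, and the authorization server checks that it hashes to the challenge it saw earlier. That check is what replaces the client secret.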
Deploy Your Own in ~10 Minutes*
# Clone the repository
git clone https://github.com/martzmakes/mcp-cdk-lambda-cognito
cd mcp-cdk-lambda-cognito
# Install dependencies
npm install
# Deploy to AWS (custom domain required for .well-known endpoints)
npx cdk deploy
# Done! Your MCP server URL will be in the outputs
* Custom domain is required because Claude Code and other MCP clients use the base domain for .well-known paths, which bypasses API Gateway's stage path. ACM DNS validation adds 2-10 minutes for certificate creation.
What You Can Build: Real-World Examples
Customer Support MCP
tools: [
  "searchTickets",
  "updateTicketStatus",
  "getCustomerHistory",
  "escalateToManager"
]
Let Claude handle tier-1 support with access to your ticketing system.
DevOps MCP
tools: [
  "checkDeploymentStatus",
  "rollbackRelease",
  "queryMetrics",
  "pageOnCall"
]
Claude becomes your intelligent ops assistant.
Analytics MCP
tools: [
  "runSQLQuery",
  "generateReport",
  "exportDashboard",
  "scheduleAlert"
]
Natural language to insights, instantly.
The Future is Serverless MCP
We're in the midst of a revolution. With major platforms like OpenAI, Google DeepMind, and Microsoft adopting MCP in 2025, the serverless advantage becomes overwhelming:
Ecosystem explosion: Thousands of MCP servers, all scaling independently
Broad adoption: OpenAI (March 26, 2025), Google DeepMind (April 2025), and Microsoft Copilot Studio (GA May 29, 2025)
Cost approaching zero: Pay only for actual AI assistance
Instant integration: Tools become AI-ready in minutes, not months
Standardization wins: The MCP specification 2025-06-18 is now the widely adopted open standard
Conclusion: Why This Matters
MCP has fundamentally changed how we think about AI integration. With adoption across the major AI platforms in 2025, serverless makes it economically viable at any scale. Together, they're democratizing AI-powered automation.
The architecture we've built here isn't just another example - it's a production blueprint you can use today to start building the AI-integrated future. Whether you're a startup looking to give Claude access to your tools or an enterprise wanting to modernize your AI strategy, serverless MCP is your answer.
With OpenAI, Google, and Microsoft all standardizing on MCP, the question isn't whether to build MCP servers. It's how many you'll build this month.
Next Steps
Deploy the example: Get hands-on with the code
Build your first custom server: Start with one internal tool
Share with the community: The MCP ecosystem grows with every contribution
Ready to dive deeper into serverless patterns? Check out my post on securing serverless apps with Cognito's managed login pages.
Find the complete code on GitHub.
Technical Deep Dive: Implementation Details
The Three-Construct Architecture: Deep Dive
Our modular approach separates concerns into three specialized constructs, each with a clean interface and specific responsibility. Here's how intermediate CDK developers can leverage this pattern:
McpAuthConstruct: The OAuth Foundation
Full implementation: lib/constructs/mcp-auth-construct.ts
export interface McpAuthConstructProps {
  serverName: string; // Only required input
}

export interface McpAuthResult {
  userPool: UserPool;
  resourceServer: UserPoolResourceServer;
  oauthScopes: OAuthScope[];
  clientId: string;
  authUrl: string;
  tokenUrl: string;
  oauthScope: string;
  signInUrl: string;
}
The Critical DCR Implementation Detail: Here's the configuration that enables Dynamic Client Registration:
// IMPORTANT: Must use CLASSIC_HOSTED_UI for DCR support
const userPoolDomain = userPool.addDomain("McpAuthUserPoolDomain", {
  cognitoDomain: {
    domainPrefix: `mcp-${serverName}-${domainHash}`,
  },
  managedLoginVersion: ManagedLoginVersion.CLASSIC_HOSTED_UI, // KEY!
});
Source: mcp-auth-construct.ts:65-67
Why this matters: AWS Cognito's newer managed login UI doesn't support dynamically registered clients because each client requires a theme configuration that doesn't get created automatically. This single line is the difference between DCR working and spending hours debugging OAuth flows.
Domain Collision Prevention: Notice the domainHash pattern:
const domainHash = this.node.addr.substring(0, 8);
const domainPrefix = `mcp-${serverName}-${domainHash}`;
Source: mcp-auth-construct.ts:60-63
This uses CDK's internal node addressing to create unique domain prefixes, preventing conflicts when multiple developers deploy the same stack.
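Because the prefix is assembled from a user-supplied `serverName`, it can be worth validating it before deployment rather than waiting for CloudFormation to reject it. The sketch below assumes the prefix rules as I understand them (lowercase letters, digits, and hyphens, at most 63 characters); check the Cognito documentation for the authoritative constraints. `buildDomainPrefix` is a hypothetical helper, and `nodeAddr` stands in for `this.node.addr`.

```typescript
// Hypothetical helper mirroring the domainHash pattern, with up-front validation.
// Assumed constraints: lowercase letters, digits, hyphens; no leading or trailing hyphen; max 63 chars.
function buildDomainPrefix(serverName: string, nodeAddr: string): string {
  const domainHash = nodeAddr.substring(0, 8);
  const prefix = `mcp-${serverName}-${domainHash}`.toLowerCase();

  if (!/^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/.test(prefix)) {
    throw new Error(`Domain prefix "${prefix}" may only contain lowercase letters, digits, and hyphens`);
  }
  if (prefix.length > 63) {
    throw new Error(`Domain prefix "${prefix}" is longer than 63 characters`);
  }
  return prefix;
}

// Two constructs at different tree paths get different addresses, so different prefixes.
console.log(buildDomainPrefix("dog-facts", "c8a1b2c3d4e5f6a7b8c9")); // "mcp-dog-facts-c8a1b2c3"
```

The hash only changes when the construct's path in the tree changes, so redeploying the same stack keeps the same domain.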
McpLambdaConstruct: Optimized for Performance
Full implementation: lib/constructs/mcp-lambda-construct.ts
this.lambdaFunction = new NodejsFunction(this, "function", {
  entry: path.join(__dirname, "../lambda/mcp.ts"),
  functionName: `mcp-server-${serverName}`,
  memorySize: 2048, // Sweet spot for CPU allocation
  timeout: Duration.seconds(29), // Just under API Gateway's 30s limit*
  architecture: Architecture.ARM_64, // ~20% lower price per GB-second than x86
  runtime: Runtime.NODEJS_22_X, // Latest runtime for better cold starts
  environment: {
    LOG_LEVEL: logLevel,
  },
  bundling: {
    format: OutputFormat.ESM, // Modern modules, better tree-shaking
    mainFields: ["module", "main"], // Prioritize ES modules
  },
});
Source: mcp-lambda-construct.ts:38-53
Performance Optimization Notes:
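ARM64: Lambda's Graviton2 processors are priced about 20% lower per GB-second than x86, and AWS cites up to 34% better price/performance
2048MB: Lambda allocates CPU in proportion to memory (one full vCPU at 1,769 MB), so 2048MB gives the function slightly more than one vCPU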
ESM Format: Modern bundling reduces cold start times and bundle size
29-second timeout: Keeps us safely under API Gateway's 30-second limit*
* Since mid-2024, some REST API timeouts can be increased via quota requests, but 29s remains the safe default for user-facing requests
McpApiGatewayConstruct: The Complex Integration Layer
Full implementation: lib/constructs/mcp-api-gateway-construct.ts
This is where the CDK complexity really shows its value. Here's the DCR endpoint implementation:
private createDcrEndpoint(api: RestApi, userPool: UserPool, serverName: string) {
  // Create IAM role for API Gateway to call Cognito
  const cognitoIntegrationRole = new Role(this, "CognitoIntegrationRole", {
    assumedBy: new ServicePrincipal("apigateway.amazonaws.com"),
    inlinePolicies: {
      CognitoAccess: new PolicyDocument({
        statements: [
          new PolicyStatement({
            effect: Effect.ALLOW,
            actions: ["cognito-idp:CreateUserPoolClient"],
            resources: [userPool.userPoolArn],
          }),
        ],
      }),
    },
  });

  // AWS Integration directly with Cognito API
  // Note: This is a custom DCR implementation, not native RFC 7591 support
  // "GenerateSecret": false in the template creates a public client, which is what enables PKCE.
  // The note lives here because the template body must stay valid JSON (no comments allowed).
  const dcrIntegration = new AwsIntegration({
    service: "cognito-idp",
    action: "CreateUserPoolClient",
    options: {
      credentialsRole: cognitoIntegrationRole,
      requestTemplates: {
        "application/json": `#set($rawName = $input.path('$.client_name'))
#if(!$rawName || $rawName == "")
#set($rawName = "client")
#end
#set($name1 = $rawName.trim())
#set($name2 = $name1.replaceAll("[^\\\\w\\\\s+=,.@-]", ""))
#set($safeName = $util.escapeJavaScript($name2))
#if($safeName.length() > 128)
#set($safeName = $safeName.substring(0,128))
#end
#set($cb = $input.json('$.redirect_uris'))
#if(!$cb) #set($cb = '[]') #end
{
  "UserPoolId": "${userPool.userPoolId}",
  "ClientName": "$safeName",
  "CallbackURLs": $cb,
  "AllowedOAuthFlows": ["code"],
  "AllowedOAuthFlowsUserPoolClient": true,
  "AllowedOAuthScopes": ["mcp-${serverName}/${serverName}", "openid", "email", "profile"],
  "SupportedIdentityProviders": ["COGNITO"],
  "GenerateSecret": false
}`
      }
    }
  });
}
Source: mcp-api-gateway-construct.ts:315-398
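VTL is not the easiest thing to read, so here is roughly the same client-name handling expressed in TypeScript. This is my own approximation for explanation purposes, not code from the repository: `sanitizeClientName` is a hypothetical name, it skips the `$util.escapeJavaScript` step, and unlike the template it also falls back to the default when stripping leaves nothing behind.

```typescript
// Approximate TypeScript rendering of the template's client-name rules.
function sanitizeClientName(rawName: string | undefined): string {
  const fallback = "client";
  const trimmed = (rawName ?? "").trim();
  // Keep only word characters, whitespace, and + = , . @ - (the same class the template allows)
  const cleaned = trimmed.replace(/[^\w\s+=,.@-]/g, "");
  const name = cleaned === "" ? fallback : cleaned;
  // Truncate to the 128-character limit enforced by the template
  return name.slice(0, 128);
}

console.log(sanitizeClientName("  My<script>App  ")); // "MyscriptApp"
console.log(sanitizeClientName("Claude Code (v1)")); // "Claude Code v1"
console.log(sanitizeClientName(undefined)); // "client"
```

Doing this inside the mapping template means a hostile `client_name` never reaches Cognito, and no Lambda invocation is needed for registration.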
VTL Template Deep Dive: This Velocity Template Language (VTL) template is doing critical work:
Input Sanitization: Removes dangerous characters from client names
Length Validation: Ensures client names don't exceed Cognito's 128-char limit
PKCE Configuration: "GenerateSecret": false registers a public client with no secret, which is what lets clients rely on PKCE
Scope Assignment: Automatically assigns the correct MCP scope
The Response Transformation:
// "token_endpoint_auth_method": "none" marks a public client that authenticates with PKCE.
// The note sits outside the template so the response body stays valid JSON.
responseTemplates: {
  "application/json": `{
  "client_id": $input.json('$.UserPoolClient.ClientId'),
  "client_name": $input.json('$.UserPoolClient.ClientName'),
  "redirect_uris": $input.json('$.UserPoolClient.CallbackURLs'),
  "response_types": ["code"],
  "grant_types": ["authorization_code"],
  "token_endpoint_auth_method": "none",
  "scope": "mcp-${serverName}/${serverName} openid email profile"
}`
}
This transforms Cognito's API response into an RFC 7591-style DCR response.
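If the mapping is easier to follow as ordinary code, here is the same transformation as a TypeScript function. It is an illustrative equivalent only: the deployed stack does this in the mapping template, and `toDcrResponse` and the two interfaces are names I introduced for the sketch.

```typescript
// Subset of the CreateUserPoolClient response that the mapping reads.
interface CognitoCreateClientResponse {
  UserPoolClient: {
    ClientId: string;
    ClientName: string;
    CallbackURLs?: string[];
  };
}

// RFC 7591-style registration response returned to the MCP client.
interface DcrResponse {
  client_id: string;
  client_name: string;
  redirect_uris: string[];
  response_types: string[];
  grant_types: string[];
  token_endpoint_auth_method: "none";
  scope: string;
}

function toDcrResponse(cognito: CognitoCreateClientResponse, serverName: string): DcrResponse {
  const client = cognito.UserPoolClient;
  return {
    client_id: client.ClientId,
    client_name: client.ClientName,
    redirect_uris: client.CallbackURLs ?? [],
    response_types: ["code"],
    grant_types: ["authorization_code"],
    token_endpoint_auth_method: "none", // public client, PKCE instead of a secret
    scope: `mcp-${serverName}/${serverName} openid email profile`,
  };
}
```

Note that `client_secret` never appears in the output, which matches the `GenerateSecret: false` setting on the request side.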
OAuth Metadata Endpoints: RFC Compliance Made Easy
OAuth Protected Resource Metadata (RFC 9728, published April 2025):
const metadataIntegration = new MockIntegration({
  integrationResponses: [{
    statusCode: "200",
    responseTemplates: {
      "application/json": JSON.stringify({
        resource_name: `${serverName} MCP Server`,
        resource: finalApiUrl,
        authorization_servers: [
          `https://${customDomain.customDomainName}/.well-known/oauth-authorization-server`
        ],
        scopes_supported: [`mcp-${serverName}/${serverName}`],
        bearer_methods_supported: ["header"],
      }, null, 2)
    }
  }]
});
RFC 8414 Authorization Server Metadata:
const authServerMetadataIntegration = new MockIntegration({
  integrationResponses: [{
    statusCode: "200", // required on every integration response
    responseTemplates: {
      "application/json": JSON.stringify({
        issuer: oauthConfig.authUrl.split("/oauth2/authorize")[0],
        authorization_endpoint: oauthConfig.authUrl,
        token_endpoint: oauthConfig.tokenUrl,
        registration_endpoint: `https://${customDomain.customDomainName}/connect/register`,
        response_types_supported: ["code"],
        grant_types_supported: ["authorization_code", "client_credentials"],
        code_challenge_methods_supported: ["S256"],
        token_endpoint_auth_methods_supported: ["client_secret_post", "client_secret_basic", "none"]
      }, null, 2)
    }
  }]
});
Using MockIntegration here is a CDK pattern that lets us return static JSON without invoking Lambda, keeping costs near zero for metadata requests.
Critical Architecture Note: Custom domains are required for MCP servers because clients like Claude Code make .well-known requests to the base domain (e.g., https://example.com/.well-known/oauth-protected-resource), which bypasses API Gateway's stage path entirely. Without a custom domain, these discovery endpoints would return 404s.
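A quick sketch shows why the stage path gets lost. It assumes the client resolves discovery paths against the URL's origin, as described above; `discoveryUrls` is a hypothetical helper and the hostnames are made-up examples.

```typescript
// Resolve the discovery endpoints the way an origin-based client would.
function discoveryUrls(mcpServerUrl: string): { protectedResource: string; authorizationServer: string } {
  const origin = new URL(mcpServerUrl).origin; // scheme + host only; any path such as /prod is dropped
  return {
    protectedResource: `${origin}/.well-known/oauth-protected-resource`,
    authorizationServer: `${origin}/.well-known/oauth-authorization-server`,
  };
}

// Default API Gateway URL: the /prod stage is dropped, so the request misses the API's routes.
console.log(discoveryUrls("https://abc123.execute-api.us-east-1.amazonaws.com/prod/mcp").protectedResource);
// "https://abc123.execute-api.us-east-1.amazonaws.com/.well-known/oauth-protected-resource"

// Custom domain mapped to the stage: the same lookup lands on a real route.
console.log(discoveryUrls("https://mcp.example.com/mcp").protectedResource);
// "https://mcp.example.com/.well-known/oauth-protected-resource"
```

With a custom domain mapped to the stage, the root of the domain is the root of your API, so the `.well-known` resources resolve as expected.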
Stack Orchestration: The 78-Line Marvel
Full implementation: lib/mcp-cdk-lambda-cognito-stack.ts
export class McpCdkLambdaCognitoStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: McpCdkLambdaCognitoProps) {
    super(scope, id, props);

    const serverName = "dog-facts";
    const customDomainName = `mcp-dogfacts.martzmakes.com`;

    // Certificate and DNS setup
    const hostedZone = HostedZone.fromLookup(this, "HostedZone", {
      domainName: "martzmakes.com",
    });
    const certificate = new Certificate(this, "Certificate", {
      domainName: customDomainName,
      validation: CertificateValidation.fromDns(hostedZone),
    });

    // Three-construct composition
    const authConstruct = new McpAuthConstruct(this, "Auth", { serverName });
    const lambdaConstruct = new McpLambdaConstruct(this, "Lambda", { serverName });
    const apiGatewayConstruct = new McpApiGatewayConstruct(this, "ApiGateway", {
      serverName,
      lambdaFunction: lambdaConstruct.lambdaFunction,
      userPool: authConstruct.result.userPool,
      resourceServer: authConstruct.result.resourceServer,
      oauthScopes: authConstruct.result.oauthScopes,
      oauthConfig: {
        clientId: authConstruct.result.clientId,
        authUrl: authConstruct.result.authUrl,
        tokenUrl: authConstruct.result.tokenUrl,
        scope: authConstruct.result.oauthScope,
      },
      customDomain: { customDomainName, certificate, hostedZone },
    });
  }
}
Pattern Highlights for CDK Users:
Construct Dependency Flow: Auth → Lambda → API Gateway, with clean interfaces
Custom Domain Integration: ACM certificate with DNS validation
Result Object Pattern: Each construct exposes a clean result interface
Resource Naming: Consistent serverName-based naming throughout
Advanced CDK Patterns in Action
Gateway Responses for OAuth Compliance:
new GatewayResponse(this, "UnauthorizedResponse", {
  restApi: api,
  type: ResponseType.UNAUTHORIZED,
  statusCode: "401",
  responseHeaders: {
    "WWW-Authenticate": "'Bearer realm=\"MCP Server\", error=\"invalid_request\"'",
  },
});
This ensures proper HTTP error responses that comply with OAuth 2.0 Bearer Token specification.
CORS Configuration for MCP Clients:
defaultCorsPreflightOptions: {
  allowOrigins: Cors.ALL_ORIGINS,
  allowMethods: Cors.ALL_METHODS,
  allowHeaders: Cors.DEFAULT_HEADERS.concat(["Authorization"]),
}
Essential for browser-based MCP clients that need to make cross-origin requests.
The Type System That Saves You
Our construct interfaces enforce compile-time correctness:
export interface McpApiGatewayConstructProps {
  serverName: string;
  lambdaFunction: NodejsFunction;
  userPool: UserPool;
  resourceServer: UserPoolResourceServer;
  oauthScopes: OAuthScope[];
  oauthConfig: {
    clientId: string;
    authUrl: string;
    tokenUrl: string;
    scope: string;
  };
  customDomain: {
    customDomainName: string;
    certificate: Certificate;
    hostedZone: IHostedZone;
  };
}
You literally cannot wire constructs incorrectly. TypeScript prevents entire categories of deployment failures.

(dog tax from my late pup Astro)
Let's build the future of AI integration together. One serverless function at a time.



