Middleware lets you intercept MCP requests to add logging, authentication, rate limiting, validation, or any cross-cutting logic.

Quick Start

from mcp_use.server import MCPServer
from mcp_use.server.middleware import Middleware

class LoggingMiddleware(Middleware):
    async def on_request(self, context, call_next):
        print(f"→ {context.method}")
        result = await call_next(context)
        print(f"← {context.method}")
        return result

server = MCPServer(
    name="my-server",
    middleware=[LoggingMiddleware()]
)

How It Works

Middleware executes in an onion model: each middleware wraps the next, with the handler at the center.
Request
  → Middleware A (before)
    → Middleware B (before)
      → Handler
    ← Middleware B (after)
  ← Middleware A (after)
Response
Each middleware can:
  • Inspect/modify the request before calling call_next
  • Inspect/modify the response after call_next returns
  • Short-circuit by returning early without calling call_next
  • Reject by raising an exception
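The wrapping order and the short-circuit case can be traced with a small standalone sketch. It does not use mcp_use: `compose`, `tracing`, `cached`, and `run_demo` are hypothetical helpers written for illustration, and contexts are plain strings. It only assumes that a middleware is an async callable taking `(context, call_next)`.

```python
import asyncio

def wrap(middleware, call_next):
    """Bind one middleware to the callable it should invoke next."""
    async def wrapped(context):
        return await middleware(context, call_next)
    return wrapped

def compose(middlewares, handler):
    """Wrap handler so the first middleware in the list is outermost."""
    call_next = handler
    for middleware in reversed(middlewares):
        call_next = wrap(middleware, call_next)
    return call_next

def tracing(name, trace):
    """Build a middleware that records when it runs before and after call_next."""
    async def middleware(context, call_next):
        trace.append(f"{name} before")
        result = await call_next(context)
        trace.append(f"{name} after")
        return result
    return middleware

async def cached(context, call_next):
    """Short-circuit: return early without calling call_next."""
    if context == "cached":
        return "from cache"
    return await call_next(context)

async def run_demo():
    trace = []

    async def handler(context):
        trace.append("handler")
        return f"handled {context}"

    app = compose([tracing("A", trace), tracing("B", trace), cached], handler)
    first = await app("fresh")    # reaches the handler
    second = await app("cached")  # stopped by the cached middleware
    return trace, first, second

if __name__ == "__main__":
    for item in asyncio.run(run_demo()):
        print(item)
```

In the second call the handler never runs, yet A and B still see their "after" step, because a short-circuit only skips the layers further in.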

Hooks

Override these methods to intercept specific request types:
Hook                    When it runs
on_request              Every request (wraps all other hooks)
on_initialize           Client connection handshake
on_call_tool            Tool execution
on_read_resource        Resource reads
on_get_prompt           Prompt retrieval
on_list_tools           Tool listing
on_list_resources       Resource listing
on_list_prompts         Prompt listing
on_set_logging_level    Client sets minimum log level
on_complete             Completion/autocomplete requests
Typed context: Each hook receives a fully typed context.message. For example, on_initialize receives ServerMiddlewareContext[InitializeRequestParams], so your editor can autocomplete fields such as context.message.clientInfo.name.

Hook nesting

When you override both on_request and a specific hook, they nest: on_request wraps the specific hook.
class MyMiddleware(Middleware):
    async def on_request(self, context, call_next):
        print("1. on_request before")
        result = await call_next(context)  # Calls on_call_tool (if tool request)
        print("4. on_request after")
        return result

    async def on_call_tool(self, context, call_next):
        print("2. on_call_tool before")
        result = await call_next(context)  # Calls handler
        print("3. on_call_tool after")
        return result
Use on_request for logic that applies to all requests. Use specific hooks when you only care about certain operations.

Context

Every hook receives a ServerMiddlewareContext with:
Field        Type           Description
message      Typed params   Request parameters (e.g., CallToolRequestParams)
method       str            MCP method name (e.g., "tools/call")
session_id   str | None     Client session ID (from mcp-session-id header)
transport    str            Transport type ("stdio", "streamable-http")
timestamp    datetime       Request timestamp
headers      dict | None    HTTP headers (HTTP transports only)
client_ip    str | None     Client IP (HTTP transports only)
metadata     dict           Custom data passed between middleware
Context is immutable. Use context.copy() to pass data downstream:
enriched = context.copy(metadata={**context.metadata, "user_id": "123"})
return await call_next(enriched)
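The copy-instead-of-mutate pattern can be tried in isolation with a stand-in class. `DemoContext` below is a hypothetical frozen dataclass, not the library's ServerMiddlewareContext, and its `copy` method is only assumed to behave like the one described above.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace

@dataclass(frozen=True)
class DemoContext:
    """Hypothetical stand-in for ServerMiddlewareContext (illustration only)."""
    method: str
    metadata: dict = field(default_factory=dict)

    def copy(self, **changes):
        # Return a new instance with the given fields replaced
        return replace(self, **changes)

original = DemoContext(method="tools/call", metadata={"request_id": "abc"})

# Build a new metadata dict rather than editing the existing one in place
enriched = original.copy(metadata={**original.metadata, "user_id": "123"})

try:
    original.method = "tools/list"  # direct assignment is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False

print(original.metadata)
print(enriched.metadata)
print(mutated)
```

Spreading the old metadata into a fresh dict keeps the original context untouched, so middleware further out never sees keys added further in.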

Examples

Reject tool calls without a valid API key:
class AuthMiddleware(Middleware):
    async def on_call_tool(self, context, call_next):
        # headers is None on non-HTTP transports, so those calls are rejected too
        api_key = context.headers.get("x-api-key") if context.headers else None
        # "secret" is a placeholder; load the real key from configuration
        if not api_key or api_key != "secret":
            raise PermissionError("Invalid API key")
        return await call_next(context)
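
Rate limiting, listed in the introduction as another use for middleware, follows the same reject-early pattern. The sketch below is one possible approach, not something provided by mcp_use: `SlidingWindowLimiter` is a hypothetical helper, and the commented hook shows how it might be consulted, assuming `context.session_id` is a suitable key.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most max_calls per key within a window of `window` seconds."""

    def __init__(self, max_calls, window, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.calls = defaultdict(deque)

    def allow(self, key):
        now = self.clock()
        recent = self.calls[key]
        # Drop timestamps that have fallen out of the window
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True

# Inside a Middleware subclass, a hook could consult the limiter (assumed usage):
#
#     async def on_call_tool(self, context, call_next):
#         if not self.limiter.allow(context.session_id or "anonymous"):
#             raise RuntimeError("Rate limit exceeded")
#         return await call_next(context)
```

The limiter keeps its state in memory, so limits apply per process and reset on restart.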

Middleware Order

Order matters. Middleware runs in the order added, with earlier middleware wrapping later ones.
server = MCPServer(
    middleware=[
        LoggingMiddleware(),      # 1. Outermost - sees all requests
        AuthMiddleware(),         # 2. Rejects unauthorized early
        RateLimitMiddleware(),    # 3. Limits request rate
        ValidationMiddleware(),   # 4. Innermost - validates data
    ]
)
Recommended order: Logging → Authentication → Rate limiting → Validation. This ensures logging sees all requests (including rejected ones) and auth rejects early before expensive operations.
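To see why logging belongs outside authentication, the following standalone sketch nests two plain async functions by hand. It does not use mcp_use: the function names and the dict-based context are assumptions made for illustration.

```python
import asyncio

seen = []  # methods observed by the logging layer

async def logging_mw(context, call_next):
    # Recorded before auth gets a chance to reject the request
    seen.append(context["method"])
    return await call_next(context)

async def auth_mw(context, call_next):
    if context.get("api_key") != "secret":  # placeholder key
        raise PermissionError("Invalid API key")
    return await call_next(context)

async def handler(context):
    return "ok"

async def dispatch(context):
    # Logging wraps auth, which wraps the handler
    return await logging_mw(context, lambda ctx: auth_mw(ctx, handler))

async def run_demo():
    results = []
    requests = (
        {"method": "tools/call", "api_key": "secret"},
        {"method": "tools/list"},  # no key, so auth rejects it
    )
    for context in requests:
        try:
            results.append(await dispatch(context))
        except PermissionError as exc:
            results.append(f"rejected: {exc}")
    return results

if __name__ == "__main__":
    print(asyncio.run(run_demo()))
    print(seen)
```

Swapping the two layers would leave the rejected request out of `seen`, because auth would raise before logging ever ran.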

Best Practices

  • Single responsibility: Each middleware does one thing
  • Fail fast: Reject invalid requests early, before expensive operations
  • Always call call_next: Unless intentionally short-circuiting
  • Re-raise exceptions: If you catch errors to log them, always re-raise
async def on_request(self, context, call_next):
    try:
        return await call_next(context)
    except Exception as e:
        print(f"Request failed: {e}")
        raise  # Always re-raise

Full Example

middleware_example.py

Complete working server with logging, auth, rate limiting, and validation middleware.