Module 5: API Design & Gateway Patterns

🚀 Problem Statement

A GenAI platform exposes 47 microservice endpoints. The mobile app makes 12 API calls to render a single enterprise document view. Each call has its own auth check, rate limit logic, and error format. Adding a new client (Teams bot, email integration) requires coordinating with 8 teams.

🧠 The Engineering Story

The Villain: "The Spaghetti Integration." Every client talks directly to every microservice. Each service implements its own auth, rate limiting, and versioning. A change in the embedding service breaks the mobile app.

The Hero: "The API Gateway." A single entry point that handles cross-cutting concerns and composes backend calls into client-optimized responses.

The Plot:

Design RESTful APIs with proper resource modeling (not RPC-over-REST)
Implement an API Gateway for auth, rate limiting, request aggregation
Use BFF (Backend-for-Frontend) pattern for mobile vs web vs bot clients
Design streaming APIs for LLM token output (SSE vs WebSocket)
Version APIs without breaking existing clients

The Twist (Failure): The God Gateway. Placing too much logic in the gateway can turn it into a single point of failure with significant latency overhead. It may become harder to deploy than the microservices themselves.

Interview Signal: Can articulate the difference between API Gateway, BFF, and Service Mesh — and when each is appropriate.

🧠 Key Patterns

Pattern	Use Case	Context
API Gateway	Cross-cutting concerns	Auth, rate-limit, logging for all services
BFF	Client-specific aggregation	Mobile gets minimal payload, web gets full document view
SSE (Server-Sent Events)	Uni-directional streaming	LLM token streaming to browser
GraphQL	Flexible client queries	Dashboard with customizable document report views
Idempotency Keys	Safe retries	Prevent duplicate document revisions on network retry

🔗 Case Study References

Rate Limiter Architecture — For deep dive on traffic control and distributed state.