Route Configuration (9 Services)
Spring Cloud Gateway routes — path-based routing with K8s service discovery
Path Prefix | Service | Auth Level | Rate Limit
/api/v1/auth/** | byld-identity | Public | 10/min per IP
/api/v1/users/** | byld-identity | JWT | 60/min per user
/api/v1/chat/** | byld-mia | JWT | 30/min per user
/api/v1/advisory/** | byld-advisory | JWT + Tier | 60/min per user
/api/v1/portfolio/** | byld-portfolio | JWT | 120/min per user
/api/v1/orders/** | byld-distribution | JWT + KYC | 30/min per user
/api/v1/markets/** | byld-markets | JWT | 200/min per user
/api/v1/payments/** | byld-payments | JWT | 20/min per user
/api/v1/estate/** | byld-estate | JWT + Tier | 30/min per user
* /api/v1/notifications/** is handled internally (event-driven, not routed through the gateway)
* Webhooks (/webhooks/razorpay, /webhooks/willjini) have separate IP-whitelist routes
* SSE streams (/api/v1/markets/stream/**) bypass the rate limiter and use a connection limit instead
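The route table can be read as a first-match prefix lookup. The sketch below illustrates that idea in plain Java; it is not Spring Cloud Gateway's own matcher, and the class name `RouteTableSketch` and the `Route` record are hypothetical. It also simplifies `/**` to a trailing-slash prefix, so a bare `/api/v1/auth` would not match here.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative route table; not Spring Cloud Gateway's actual matching code. */
final class RouteTableSketch {
    /** One row of the route table (prefix, target service, auth level, limit). */
    record Route(String prefix, String service, String authLevel, int limitPerMinute) {}

    private static final List<Route> ROUTES = List.of(
        new Route("/api/v1/auth/", "byld-identity", "Public", 10),
        new Route("/api/v1/users/", "byld-identity", "JWT", 60),
        new Route("/api/v1/chat/", "byld-mia", "JWT", 30),
        new Route("/api/v1/advisory/", "byld-advisory", "JWT + Tier", 60),
        new Route("/api/v1/portfolio/", "byld-portfolio", "JWT", 120),
        new Route("/api/v1/orders/", "byld-distribution", "JWT + KYC", 30),
        new Route("/api/v1/markets/", "byld-markets", "JWT", 200),
        new Route("/api/v1/payments/", "byld-payments", "JWT", 20),
        new Route("/api/v1/estate/", "byld-estate", "JWT + Tier", 30)
    );

    /** Returns the first route whose prefix matches, or empty (no gateway route). */
    static Optional<Route> resolve(String path) {
        return ROUTES.stream().filter(r -> path.startsWith(r.prefix())).findFirst();
    }
}
```

Under this sketch, a path under `/api/v1/notifications/` resolves to empty, which mirrors the note that notifications are not routed through the gateway.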
Gateway Config
Framework: Spring Cloud Gateway 4.x
Service Discovery: K8s DNS
Load Balancer: Round-Robin
Strip Prefix: 0 (paths preserved)
Global CORS: allowed origins configurable
Rate Limiter
Algorithm: Token Bucket (Redis)
Key: {clientId} or {IP} for public
Headers: X-RateLimit-Remaining
Burst: 2x configured limit
429 response with Retry-After
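To make the rate limiter settings concrete, here is a minimal in-memory token bucket. It is a sketch only: the real limiter keeps its state in Redis, and the class name `TokenBucketSketch`, the caller-supplied clock, and the mapping of "Burst: 2x" to bucket capacity are assumptions made for illustration. One instance would exist per key ({clientId} or {IP}).

```java
import java.util.concurrent.TimeUnit;

/** In-memory token bucket; illustrative stand-in for the Redis-backed limiter. */
final class TokenBucketSketch {
    private final double capacity;      // assumed burst size: 2x the configured limit
    private final double refillPerNano; // tokens added per nanosecond
    private double tokens;
    private long lastRefillNanos;

    TokenBucketSketch(int limitPerMinute, long nowNanos) {
        this.capacity = 2.0 * limitPerMinute;
        this.refillPerNano = limitPerMinute / (double) TimeUnit.MINUTES.toNanos(1);
        this.tokens = capacity; // start full so a fresh client can burst
        this.lastRefillNanos = nowNanos;
    }

    /** Returns true if the request is allowed; false corresponds to a 429. */
    synchronized boolean tryAcquire(long nowNanos) {
        refill(nowNanos);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    /** Whole tokens left, the value an X-RateLimit-Remaining header would carry. */
    synchronized long remaining(long nowNanos) {
        refill(nowNanos);
        return (long) Math.floor(tokens);
    }

    /** Seconds until one token is available, for the Retry-After header. */
    synchronized long retryAfterSeconds(long nowNanos) {
        refill(nowNanos);
        if (tokens >= 1.0) {
            return 0;
        }
        double nanosNeeded = (1.0 - tokens) / refillPerNano;
        return (long) Math.ceil(nanosNeeded / TimeUnit.SECONDS.toNanos(1));
    }

    /** Adds tokens for the time elapsed since the last call, capped at capacity. */
    private void refill(long nowNanos) {
        long elapsed = Math.max(0, nowNanos - lastRefillNanos);
        tokens = Math.min(capacity, tokens + elapsed * refillPerNano);
        lastRefillNanos = nowNanos;
    }
}
```

With a 10/min limit, a new client can send 20 requests back to back, after which requests are refused until the bucket refills at one token every six seconds.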
Request Lifecycle
1. Client: Kotlin CMP app; HTTPS/2; mTLS pinning
2. CloudFront: CDN + WAF; DDoS shield; TLS termination
3. ALB: health checks; path routing; target groups
4. Gateway: Spring Cloud GW; filter chain; route resolution
Gateway Filter Chain (Ordered)
A. JWT Validate: RS256 verify; expiry check; claim extraction; 401 if invalid
B. Rate Limit: token bucket; Redis counter; per-user key; 429 if exceeded
C. Enrich: add X-Client-Id, X-Tier, X-Roles, X-Request-Id (trace), X-Forwarded-For
D. Route: match path prefix; load balance (RR); circuit breaker; timeout handling
E. Service: target microservice processes the request and returns the response
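The ordering matters because each filter can short-circuit the request before later filters run. The following sketch models steps A to C as an ordered list of functions in plain Java; it does not use Spring Cloud Gateway's filter interfaces. `FilterChainSketch`, `Request`, and the boolean flags standing in for JWT and rate-limit checks are all hypothetical simplifications, and routing (D) and the downstream call (E) are reduced to returning 200.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;

final class FilterChainSketch {
    /** Minimal request model; the boolean flags stand in for the real checks. */
    static final class Request {
        final String clientId;
        final boolean jwtValid;     // stands in for RS256 verify + expiry check
        final boolean withinLimit;  // stands in for the token bucket decision
        final Map<String, String> headers = new LinkedHashMap<>();

        Request(String clientId, boolean jwtValid, boolean withinLimit) {
            this.clientId = clientId;
            this.jwtValid = jwtValid;
            this.withinLimit = withinLimit;
        }
    }

    /** A filter returns an HTTP status to stop the chain, or empty to continue. */
    interface Filter {
        OptionalInt apply(Request request);
    }

    private static final List<Filter> CHAIN = List.of(
        r -> r.jwtValid ? OptionalInt.empty() : OptionalInt.of(401),    // A: JWT Validate
        r -> r.withinLimit ? OptionalInt.empty() : OptionalInt.of(429), // B: Rate Limit
        r -> {                                                          // C: Enrich
            r.headers.put("X-Client-Id", r.clientId);
            return OptionalInt.empty();
        }
    );

    /** Runs the filters in order; 200 stands for "forwarded to the service". */
    static int handle(Request request) {
        for (Filter filter : CHAIN) {
            OptionalInt status = filter.apply(request);
            if (status.isPresent()) {
                return status.getAsInt();
            }
        }
        return 200; // D and E (routing, downstream call) are omitted in this sketch
    }
}
```

A request with an invalid token is rejected with 401 before it consumes a rate-limit token or gains any enrichment headers, which is the point of putting validation first.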
Response Path
1. Service Response: JSON body; HTTP status
2. Response Filter: strip internal headers; add CORS, CSP, HSTS
3. Access Log: method, path, status, latency, clientId, traceId
4. Client: clean response; standard envelope
Security Headers (OWASP)
Strict-Transport-Security: max-age=31536000; includeSubDomains | Content-Security-Policy: default-src 'self' | X-Content-Type-Options: nosniff | X-Frame-Options: DENY | X-XSS-Protection: 0 (CSP replaces) | Referrer-Policy: strict-origin-when-cross-origin
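A response filter along these lines could strip internal headers and then apply the security headers listed above. This is a sketch under assumptions: the set of "internal" headers is taken to be the enrichment headers X-Client-Id, X-Tier, and X-Roles, and `ResponseHeaderSketch` and `prepare` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class ResponseHeaderSketch {
    // Assumption: the enrichment headers are the internal ones to strip.
    private static final Set<String> INTERNAL = Set.of("x-client-id", "x-tier", "x-roles");

    /** Copies upstream headers minus internal ones, then sets the security headers. */
    static Map<String, String> prepare(Map<String, String> upstream) {
        Map<String, String> out = new LinkedHashMap<>();
        upstream.forEach((name, value) -> {
            // Header names are case-insensitive, so compare in lower case.
            if (!INTERNAL.contains(name.toLowerCase(Locale.ROOT))) {
                out.put(name, value);
            }
        });
        out.put("Strict-Transport-Security", "max-age=31536000; includeSubDomains");
        out.put("Content-Security-Policy", "default-src 'self'");
        out.put("X-Content-Type-Options", "nosniff");
        out.put("X-Frame-Options", "DENY");
        out.put("X-XSS-Protection", "0");
        out.put("Referrer-Policy", "strict-origin-when-cross-origin");
        return out;
    }
}
```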
Circuit Breaker Dashboard
Per-service health status (Resilience4j CircuitBreaker)
Service | State | p99 Latency | Error Rate
byld-identity | CLOSED | 45ms | 0.1%
byld-mia | CLOSED | 1.8s | 0.3%
byld-advisory | CLOSED | 120ms | 0.1%
byld-portfolio | CLOSED | 80ms | 0.05%
byld-distribution | HALF_OPEN | 2.1s | 12%
byld-markets | CLOSED | 35ms | 0.02%
byld-payments | CLOSED | 95ms | 0.1%
byld-estate | OPEN | timeout | 65%
byld-notifications | CLOSED | 15ms | 0.01%
State Transitions
CLOSED → failure rate ≥ threshold → OPEN
OPEN → wait duration expires → HALF_OPEN
HALF_OPEN → probe succeeds → CLOSED
HALF_OPEN → probe fails → OPEN
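The four transitions can be sketched as a small state machine. This is not Resilience4j's implementation: it uses a simple tumbling window of calls instead of a sliding window, permits a single probe in HALF_OPEN, and takes the clock as a parameter. The class name `CircuitBreakerSketch` is hypothetical. It trips when the failure rate reaches the threshold, following Resilience4j's documented "equal or greater" rule.

```java
final class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;          // calls evaluated per window, e.g. 10
    private final double failureThreshold; // e.g. 0.5 for 50%
    private final long openWaitMillis;     // how long to stay OPEN, e.g. 60_000
    private State state = State.CLOSED;
    private int calls;
    private int failures;
    private long openedAtMillis;

    CircuitBreakerSketch(int windowSize, double failureThreshold, long openWaitMillis) {
        this.windowSize = windowSize;
        this.failureThreshold = failureThreshold;
        this.openWaitMillis = openWaitMillis;
    }

    /** Current state; OPEN becomes HALF_OPEN once the wait duration has expired. */
    synchronized State state(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAtMillis >= openWaitMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Calls are rejected only while OPEN (the gateway would answer 503). */
    synchronized boolean allowRequest(long nowMillis) {
        return state(nowMillis) != State.OPEN;
    }

    /** Records a call outcome and applies the transition rules. */
    synchronized void record(boolean success, long nowMillis) {
        State current = state(nowMillis);
        if (current == State.OPEN) {
            return; // nothing is executed while OPEN
        }
        if (current == State.HALF_OPEN) {
            if (success) {
                close();          // probe succeeds -> CLOSED
            } else {
                open(nowMillis);  // probe fails -> OPEN
            }
            return;
        }
        calls++;
        if (!success) {
            failures++;
        }
        if (calls >= windowSize) {
            boolean tripped = (double) failures / calls >= failureThreshold;
            calls = 0;
            failures = 0;
            if (tripped) {
                open(nowMillis);
            }
        }
    }

    private void open(long nowMillis) {
        state = State.OPEN;
        openedAtMillis = nowMillis;
        calls = 0;
        failures = 0;
    }

    private void close() {
        state = State.CLOSED;
        calls = 0;
        failures = 0;
    }
}
```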
When Circuit OPEN
Response: 503 Service Temporarily Unavailable
Body: { "error": "SERVICE_DEGRADED", "retryAfter": 30 }
MIA fallback: "I'm having trouble accessing [service]. Let me try again shortly."
Notification: OpsGenie alert → on-call engineer
Resilience4j Configuration per External Integration
Integration | Timeout | Retry | Circuit Breaker | Fallback | SLA Target
MFU Central | 10s | 3x exp backoff (1s, 2s, 4s) | 50% fail / 10 calls → OPEN 60s | Queue order for async retry | p99 < 3s
Razorpay | 8s | 2x (idempotency key) | 40% fail / 20 calls → OPEN 30s | Payment pending state | p99 < 800ms
NSE/BSE Feed | 5s (connect) | Auto-reconnect (exp backoff) | N/A (WebSocket persistent) | Serve stale Redis data (30s TTL) | p99 < 50ms
Partner Broker | 3s | 1x (no retry for orders) | 30% fail / 10 calls → OPEN 120s | Reject order + notify user | p99 < 100ms
Claude API | 30s (SSE stream) | 2x with different model (Haiku) | 20% fail / 5 calls → OPEN 60s | Cached response + "MIA busy" | p99 < 2s
Willjini | 30s | 3x exp backoff | 50% fail / 5 calls → OPEN 300s | Draft saved locally, sync later | p99 < 500ms
Ditto Insurance | 5s | 2x | 50% fail / 10 calls → OPEN 120s | Cached quotes (24h TTL) | p99 < 1s
Finvu (AA) | 15s | 2x exp backoff | 30% fail / 10 calls → OPEN 60s | Show cached data + "refreshing" | p99 < 2s
UIDAI (Aadhaar) | 10s | 3x (mandatory for KYC) | 50% fail / 5 calls → OPEN 300s | "KYC temporarily unavailable" | p99 < 5s
All circuit breakers publish state changes to Micrometer → Prometheus → Grafana dashboards. OpsGenie alerts on OPEN state.
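The "3x exp backoff (1s, 2s, 4s)" retry policy can be illustrated with a small helper. This is a hedged sketch in plain Java, not Resilience4j's Retry module: `BackoffRetrySketch`, `delayMillis`, and `callWithRetry` are hypothetical names, the sleeper is injected so the schedule can be inspected without waiting, and jitter is left out.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

final class BackoffRetrySketch {
    /** Delay before retry n (1-based): base * 2^(n-1), so 1s, 2s, 4s for base 1s. */
    static long delayMillis(long baseMillis, int retryNumber) {
        return baseMillis << (retryNumber - 1);
    }

    /**
     * Calls the supplier once, then retries up to maxRetries times on failure,
     * waiting an exponentially growing delay before each retry.
     */
    static <T> T callWithRetry(Supplier<T> call, int maxRetries, long baseMillis,
                               LongConsumer sleeper) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attempt > 0) {
                sleeper.accept(delayMillis(baseMillis, attempt));
            }
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if retries remain
            }
        }
        throw last; // all attempts failed; the caller would apply the fallback
    }
}
```

For an integration such as Partner Broker, where orders must not be retried, the same helper would be called with `maxRetries` set to 0 so the first failure goes straight to the fallback.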
Architecture Notes
"The API Gateway pattern is the Facade for your microservices."
-- Chris Richardson, Microservices Patterns (2018)
com.byld.gateway
/config RouteConfig, SecurityConfig, RateLimiterConfig, CorsConfig
/filter JwtValidationFilter, RateLimitFilter, RequestEnrichFilter, AccessLogFilter
/security JwtTokenValidator, RsaKeyProvider
/resilience CircuitBreakerConfig, FallbackController
/health ServiceHealthAggregator
Key Decisions:
1. No business logic in gateway (Humble Object)
2. JWT validation only, not token issuance (byld-identity)
3. Redis for rate limiting (shared across pods)
4. Per-integration resilience configs (not one-size-fits-all)
5. Internal headers stripped before the response is returned to the client
6. Gateway is stateless (horizontally scalable)
Infrastructure
Redis: rate limit counters, session cache
No database (stateless)
Pods: min 3 (HA), max 10
Observability
Micrometer: latency, error rate, RPS
X-Ray: distributed trace propagation
Structured JSON access logs (ELK)