The Real Problem
Microservices: Hype vs Reality
- Deployment independence - ship the payment module without touching user auth
- Blast radius containment - when (not if) things break, they break small
- Polyglot persistence - use Postgres for transactions, Redis for sessions, MongoDB for documents
- Team ownership - clear boundaries mean clear responsibility
- Targeted scaling - scale the search service during peak, not the entire app
Architecture Deep Dive
- Aggregate boundaries define service boundaries - if it's one transaction, it's one service
- Events over sync calls - choreography beats orchestration for most cases
- API contracts as first-class citizens - break the contract, break the build
- Shared nothing architecture - each service owns its data, period
- Observability isn't optional - if you can't trace it, don't ship it
The Stack (And Why)
Infrastructure:
โโโ Kubernetes (EKS) โ Declarative deployments, self-healing
โโโ Istio service mesh โ mTLS, traffic shaping, circuit breaking
โโโ Kong API Gateway โ Rate limiting, auth, request transformation
โ
Messaging:
โโโ Kafka โ Event backbone, 7-day retention
โโโ Redis Streams โ Lightweight pub/sub, ephemeral data
โ
Data Layer:
โโโ PostgreSQL โ ACID transactions, JSONB for flexibility
โโโ MongoDB โ Document store for audit logs, activity feeds
โโโ Redis Cluster โ Session store, distributed caching
โโโ Elasticsearch โ Full-text search, log aggregation
โ
Observability:
โโโ OpenTelemetry โ Vendor-agnostic instrumentation
โโโ Prometheus + Thanos โ Metrics with long-term storage
โโโ Grafana โ Dashboards, alerting
โโโ Jaeger โ Distributed tracing
โ
CI/CD:
โโโ GitLab CI โ Build, test, security scanning
โโโ ArgoCD โ GitOps deployments
โโโ Sealed Secrets โ K8s-native secret management
Patterns That Saved Us
The Payment Flow: Real Architecture
โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ
โ API Gateway โโโโโโโโ Payment โโโโโโโโ Fraud โ
โ (Kong) โ gRPC โ Service โ Eventโ Detection โ
โโโโโโโโโโโโโโโ โโโโโโโโฌโโโโโโโ โโโโโโโโฌโโโโโโโ
โ โ
PaymentInitiated FraudCheckCompleted
โ โ
โผ โผ
โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ
โ Outbox โ โ Risk โ
โ Table โ โ Scoring โ
โโโโโโโโฌโโโโโโโ โโโโโโโโฌโโโโโโโ
โ โ
Debezium CDC RiskAssessed
โ โ
โผ โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Kafka Topics โ
โ payments.initiated โ
โ fraud.checked โ
โ risk.assessed โ
โ payments.completed โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โโโโโโโโโโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโโโโโโโโโ
โผ โผ โผ
โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ
โ Ledger โ โ Order โ โNotification โ
โ Service โ โ Service โ โ Service โ
โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโ
Testing Strategy That Actually Works
- Contract tests (Pact) - Services talk to stubs, not real dependencies. Contracts break in CI, not production
- Consumer-driven contracts - Consumers define what they need, providers prove they deliver
- Chaos testing (Chaos Monkey, Litmus) - Kill pods randomly. Inject latency. Prove resilience
- Synthetic monitoring - Continuous production probes for critical user journeys
- Load testing as validation - We validated the architecture could handle 80x the original traffic through rigorous load tests before launch
- Canary deployments - 1% traffic to new versions, automatic rollback on error spike
What Actually Happened
- Response times dropped from 850ms p99 to under 120ms p99
- Zero-downtime deployments - maintenance windows became a memory
- Infrastructure costs down 35% despite higher traffic
- Deployment frequency: from monthly to 50+ daily deploys
- Mean time to recovery: under 5 minutes for most incidents
The Hard Lessons
When NOT to Microservices
- Your team is small - coordination overhead will kill velocity
- Domain boundaries are unclear - you'll draw them wrong and suffer migration pain
- You don't have platform engineering capacity - infrastructure complexity explodes
- Latency requirements are extreme - network hops add up
- Your monolith just needs better modularization - try a modular monolith first
Takeaways
tags
conclusion
Engineering Team
Senior Solutions Architects
We've been building distributed systems since before 'microservices' was a thing. Our scars tell stories.