Big Data Reporting Platform Processing 2-4M Daily Transactions

Modern GKE-based Microservices Replacing 20-Year Legacy System

2019 - 2020
Senior Backend Developer & Platform Architect (Big Data Specialist)
2-4M
Daily Transactions
Card payments processed across all stores and online shops
20 Years
Legacy Replaced
Modernized decades-old manual reporting system
Thousands
Store Locations
Individual conditions per location supported
Real-Time
Processing
Kafka/Pub/Sub event-driven architecture

Project Gallery

Big Data Banking Platform

Financial analytics command center processing millions of daily transactions

The Challenge

Replacing 20-Year Manual Reporting System with Modern Big Data Platform

A leading German retail banking group needed to renew its card payment processing and reporting infrastructure. The challenge was to replace a 20-year-old, manually operated system with a modern, automated platform capable of processing 2-4 million daily transactions from thousands of stores and online shops, including complex customer loyalty programs with individualized conditions per location.

1

Legacy 20-year-old manual reporting and billing system unable to scale

2

2-4 million daily card payment transactions requiring real-time processing

3

Each location operating under its own, sometimes fully individual, terms and conditions

4

Complex customer loyalty program integration across thousands of stores

5

Need for high-performance data access with massive data volumes

6

Multiple data sources requiring integration (stores, online shops, loyalty systems)

7

Requirement for clean-code architecture on entirely new Google Cloud infrastructure

The Solution

Modern Big Data Platform on GKE with Kafka & Event-Driven Architecture

We architected and implemented a completely new big data reporting platform on Google Cloud Platform using GKE (Google Kubernetes Engine). The solution uses event-driven microservices with Kafka and Pub/Sub for high-throughput transaction processing, combined with optimized PostgreSQL partitioning and sophisticated JPA lazy-loading for efficient handling of massive data volumes.

1

Microservices on GKE

Docker-containerized microservices built with Java 11/Kotlin and Spring Boot 2.3 on Google Kubernetes Engine

2

Event-Driven Messaging

Apache Camel + Kafka for intra-cluster communication, Pub/Sub for Google Cloud Functions integration; even synchronous request/reply calls are implemented over Kafka (see the messaging sketch after this list)

3

Optimized Data Layer

Spring/Hibernate JPA with sophisticated lazy-loading strategy and PostgreSQL table partitioning for handling massive transaction volumes

4

Cloud Functions Processing

Google Cloud Functions (Java 11, Python 3, NodeJS) for external data processing via Pub/Sub

5

RESTful API Design

REST interfaces generated from OpenAPI YAML specifications and exposed as a Spring HATEOAS access graph (see the controller sketch after this list)

6

PDF Generation Service

NodeJS/Puppeteer-based dynamic PDF generator triggered via Pub/Sub with auto-scaling
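
To illustrate the event-driven messaging approach (item 2 above), the following is a minimal sketch of a Spring Boot/Spring Kafka consumer of the kind this architecture relies on. The topic, group, DTO, and service names are illustrative assumptions, not the project's actual code.

// Minimal sketch of an event-driven transaction consumer (Spring Boot + Spring Kafka).
// All names below (topic, group, DTO, service) are assumptions for illustration.
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class CardTransactionListener {

    // Assumed payload shape; Spring Kafka's JsonDeserializer maps the JSON message onto it.
    public static class TransactionRecord {
        public String storeId;
        public long amountCents;
        public String bookedOn;
    }

    // Assumed downstream service that persists and aggregates records for reporting.
    public interface ReportingService {
        void ingest(TransactionRecord record);
    }

    private final ReportingService reportingService;

    public CardTransactionListener(ReportingService reportingService) {
        this.reportingService = reportingService;
    }

    @KafkaListener(topics = "card-transactions", groupId = "reporting-platform")
    public void onTransaction(TransactionRecord record) {
        reportingService.ingest(record);
    }
}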
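
In the same spirit, a hedged sketch of how a generated REST endpoint might expose its HATEOAS access graph (item 5 above); the controller, DTO, and service names are hypothetical, not the project's actual interfaces.

// Minimal sketch of a HATEOAS-enriched report resource (Spring Boot + Spring HATEOAS).
// Controller, DTO, and service names are hypothetical.
import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.linkTo;
import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.methodOn;

import org.springframework.hateoas.EntityModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/reports")
public class ReportController {

    // Assumed report payload and lookup service.
    public static class ReportDto {
        public Long id;
        public String title;
    }

    public interface ReportService {
        ReportDto findById(Long id);
    }

    private final ReportService reportService;

    public ReportController(ReportService reportService) {
        this.reportService = reportService;
    }

    @GetMapping("/{id}")
    public EntityModel<ReportDto> getReport(@PathVariable Long id) {
        ReportDto report = reportService.findById(id);
        // A self link plus a link back to the collection form the navigable access graph.
        return EntityModel.of(report,
                linkTo(methodOn(ReportController.class).getReport(id)).withSelfRel(),
                linkTo(ReportController.class).withRel("reports"));
    }
}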

Critical Challenges

Key technical hurdles and how they were overcome

1

Daily Big Data Processing with Resource Efficiency

Problem

Massive data volumes arrive in short windows (often within two hours), after which no resources are needed for the rest of the day. A traditional architecture that keeps infrastructure permanently allocated regardless of load patterns would be extremely wasteful.

Solution

A modern, cloud-based GCP solution built on Cloud Functions, Pub/Sub, and auto-scaling services. When no processing is required, no resources are consumed; during data ingestion, compute capacity scales up automatically to handle peak loads of thousands of entries per second (a minimal function sketch follows this challenge).

Infrastructure costs were cut by 50% while the platform simultaneously handles 10,000+ peak transactions per second, proving that efficiency and performance aren't mutually exclusive.

Impact

The stack processes thousands of entries per second when needed, yet remains extremely cost-efficient when idle. Elastic scaling keeps it agile and slim, dramatically reduces costs, and aligns resource consumption precisely with the actual workload.
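
As a rough illustration of this elastic approach, here is a minimal sketch of a Pub/Sub-triggered background Cloud Function using the Java Functions Framework; the class names and the processing step are hypothetical placeholders, not the actual project code.

// Minimal sketch: a Java 11 background Cloud Function triggered by a Pub/Sub message.
// It scales to zero when idle and scales out automatically under load.
// PubSubBody and the processing step are assumptions for illustration.
import com.google.cloud.functions.BackgroundFunction;
import com.google.cloud.functions.Context;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.logging.Logger;

public class TransactionBatchFunction implements BackgroundFunction<TransactionBatchFunction.PubSubBody> {

    private static final Logger logger = Logger.getLogger(TransactionBatchFunction.class.getName());

    // Minimal Pub/Sub message shape: the payload arrives Base64-encoded in "data".
    public static class PubSubBody {
        public String data;
    }

    @Override
    public void accept(PubSubBody message, Context context) {
        String payload = new String(Base64.getDecoder().decode(message.data), StandardCharsets.UTF_8);
        // Hand the decoded batch to downstream processing (placeholder for the real pipeline).
        logger.info("Processing batch of " + payload.length() + " characters");
    }
}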

Business Impact

Measurable value delivered to the business

Infrastructure Cost Savings

€180k/year

A 50% reduction compared to the previous €360k/year architecture, achieved through elastic GCP auto-scaling and Cloud Functions

Processing Capacity

10,000+ TPS

Peak transactions per second handled during batch windows with sub-200ms P99 latency

Query Performance Improvement

100x faster

Reports that previously took minutes now complete in seconds via PostgreSQL partitioning and optimization

Operational Efficiency

20-year legacy replaced

Previously manual processes automated, enabling real-time reporting for thousands of locations

Innovations

Groundbreaking solutions that set new standards

Elastic Big Data Processing

GCP auto-scaling architecture that scales from zero to thousands of transactions per second based on actual workload

Unprecedented cost efficiency for big data workloads: pay only for processing time actually used, not for idle capacity

Impact: 50% cost reduction (€180k/year savings) while improving performance and handling 10,000+ peak TPS

Advanced PostgreSQL Partitioning Strategy

Sophisticated table partitioning combined with master-master replication, enabling 100x faster queries on massive transaction volumes

Reports that took minutes now complete in seconds, even across 2-4 million daily transactions

Impact: Transformed reporting from batch-delayed to near-real-time insights for business decisions
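
One possible shape of such a partitioning scheme, sketched as PostgreSQL range partitioning by booking date and set up here via Spring's JdbcTemplate; table, column, and partition names are illustrative assumptions, and the actual production schema is not reproduced here.

// Illustrative sketch of a range-partitioned transaction table (PostgreSQL 12),
// created through Spring's JdbcTemplate. All names are assumptions.
import org.springframework.jdbc.core.JdbcTemplate;

public class PartitionSetup {

    private final JdbcTemplate jdbc;

    public PartitionSetup(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    public void createTransactionTable() {
        // Parent table partitioned by booking date, so date-filtered report queries
        // only scan the relevant partitions (partition pruning).
        jdbc.execute("CREATE TABLE IF NOT EXISTS card_transaction ("
                + " id BIGINT NOT NULL,"
                + " store_id BIGINT NOT NULL,"
                + " amount NUMERIC(12,2) NOT NULL,"
                + " booked_on DATE NOT NULL,"
                + " PRIMARY KEY (id, booked_on)"
                + ") PARTITION BY RANGE (booked_on)");

        // One partition per month; a scheduled job would create future partitions ahead of time.
        jdbc.execute("CREATE TABLE IF NOT EXISTS card_transaction_2020_01"
                + " PARTITION OF card_transaction"
                + " FOR VALUES FROM ('2020-01-01') TO ('2020-02-01')");
    }
}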

Multi-Language Cloud Functions Integration

Seamless integration of Java 11, Python 3, and NodeJS Cloud Functions via Pub/Sub for specialized processing tasks

The right tool for each job: a language-agnostic architecture that chooses the optimal runtime per use case

Impact: PDF generation, data transformations, and external integrations optimized for performance and maintainability

Sophisticated JPA Lazy-Loading Schema

Highly optimized Hibernate configuration with selective, efficient lazy-loading for high-performance data access on massive datasets

Minimizes memory footprint and database load while maintaining sub-200ms latencies under extreme transaction volumes

Impact: Enables handling millions of entities efficiently with consistent performance even at peak loads
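
A minimal sketch of the kind of selective lazy-loading mapping this refers to, using standard JPA/Hibernate annotations; the entities, fields, and batch size are illustrative assumptions rather than the project's real model.

// Illustrative sketch of the selective lazy-loading mapping (JPA/Hibernate).
// Entity and field names are assumptions.
import java.math.BigDecimal;
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import org.hibernate.annotations.BatchSize;

@Entity
public class Store {

    @Id
    private Long id;

    private String name;

    // LAZY keeps the initial query small; @BatchSize loads the collections of up to
    // 50 stores with a single IN query once they are actually touched.
    @OneToMany(mappedBy = "store", fetch = FetchType.LAZY)
    @BatchSize(size = 50)
    private List<CardTransaction> transactions;
}

@Entity
class CardTransaction {

    @Id
    private Long id;

    private BigDecimal amount;

    // The owning side is also lazy, so loading a transaction never drags in the store graph.
    @ManyToOne(fetch = FetchType.LAZY)
    private Store store;
}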

"The new big data platform transformed our transaction processing capabilities. Moving from a 20-year manual system to a modern cloud-native architecture that handles millions of transactions daily was a game-changer."

Former IT Director, Online Services, leading German retail banking group

Technologies Used

Core

Java 11, Kotlin, Spring Boot 2.3, Gradle

Persistence

PostgreSQL 12, MongoDB 4.3, Spring/Hibernate JPA

Messaging

Apache Kafka, Google Pub/Sub, Apache Camel, Akka

Infrastructure

Google Cloud Platform, GKE (Google Kubernetes Engine), Docker

Cloud Functions

Google Cloud Functions, Python 3, NodeJS, Java 11, Maven

Frontend

Vue.js, Jest, TypeScript, JavaScript

Integration

Spring HATEOAS, OpenAPI, RESTful APIs

Testing

Cucumber, Gauge (BDD), Gatling (Performance), Jest

Additional

Vavr, Lombok, Puppeteer (PDF Generation)

Need Big Data Platform for High-Volume Transactions?

If your organization requires a modern, scalable platform for processing millions of daily transactions with complex business rules, let's discuss your requirements.

Schedule Consultation