Senior Java Backend Architecture Guide: From Spring Boot to Kafka, Microservices, and Production Systems

Satyam Parmar
January 20, 2025

If you are moving from mid-level to senior, your job shifts from writing endpoints to shaping reliable systems. This guide is a practical roadmap with what to learn, why it matters, and how to implement it in Java/Spring Boot with Kafka. Every section ties the topic to microservices and distributed systems impact.


Table of Contents

  • Core Java and JVM
  • Concurrency and Reactive
  • Build, Packaging, and Dependency Hygiene
  • Spring Core and Spring Boot
  • HTTP APIs (REST)
  • Persistence: SQL with JPA and JDBC
  • NoSQL and Caching (Redis)
  • Messaging with Kafka
  • Transactions, Idempotency, and Outbox
  • Microservices Architecture Components
  • API Gateway and Edge Patterns
  • Resiliency (Retries, Circuit Breakers, Rate Limits)
  • Observability (Logs, Metrics, Traces)
  • Security (OAuth2/OIDC)
  • Testing (Unit, Integration, Contract)
  • CI/CD, Containers, and Kubernetes
  • Monitoring Stack and SLOs
  • Capstone Project: User / Order / Payment (Kafka + Outbox + Gateway + Observability)
  • Printable Service Checklist
  • Step-by-Step Roadmap

Core Java and JVM

  • What: Collections, generics, streams, records, sealed classes; JVM memory model, GC.
  • Why: Data structures and memory behavior drive performance and latency.
  • How: Prefer immutable DTOs and defensive copies at boundaries; reserve parallel streams for CPU-bound work on large collections, since they run on the shared common ForkJoinPool. Java snippet:
record UserDto(String id, String name) {}
List<String> names = users.stream().map(User::getName).toList();
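
The defensive-copy advice above can be sketched with a record whose compact constructor copies its list argument. TeamDto and its fields are hypothetical names chosen for illustration; they do not come from the article.

```java
import java.util.List;

// Hypothetical DTO for illustration; TeamDto and its fields are assumptions.
record TeamDto(String id, List<String> memberNames) {
    TeamDto {
        // List.copyOf returns an unmodifiable copy, so later changes to the
        // caller's list cannot leak into this DTO. It also rejects null elements.
        memberNames = List.copyOf(memberNames);
    }
}
```

With this shape, a caller that keeps mutating its own list after constructing the DTO does not affect the DTO, and attempts to modify the DTO's list fail with UnsupportedOperationException.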

Concurrency and Reactive

  • What: Thread pools, CompletableFuture, virtual threads (Project Loom), reactive (Reactor).
  • Why: Throughput and tail latency depend on non-blocking IO and correct backpressure.
  • How: Use bounded executors; for high concurrency IO, consider WebFlux or Loom. Java snippet (CompletableFuture):
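// Bounded threads and a bounded queue; Executors.newFixedThreadPool uses an unbounded queue.
ExecutorService io = new ThreadPoolExecutor(64, 64, 0L, TimeUnit.MILLISECONDS,
    new ArrayBlockingQueue<>(256), new ThreadPoolExecutor.CallerRunsPolicy());
CompletableFuture<Response> f = CompletableFuture.supplyAsync(() -> client.call(), io);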
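
Tail latency also depends on how long a caller is willing to wait. The following is a minimal sketch, assuming a fallback value is acceptable when a call is slow; TimedCalls and callWithFallback are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of any library.
final class TimedCalls {
    // Runs the call on the given pool and completes with the fallback
    // if no result arrives within timeoutMillis.
    // Note: the underlying task is not cancelled; it keeps its thread until it ends.
    static <T> CompletableFuture<T> callWithFallback(
            Supplier<T> call, Executor pool, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(call, pool)
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

Because the slow task still occupies a worker, this only protects the caller; the pool itself still needs bounds and the client needs its own socket timeouts.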

Build, Packaging, and Dependency Hygiene

  • What: Gradle/Maven, BOMs, dependency convergence, layered jars, Docker images.
  • Why: Reproducible builds and small images reduce CVEs and deploy time.
  • How: Use Spring Boot layered jar and slim base images. Example Dockerfile:
FROM eclipse-temurin:21-jre
ARG JAR=app.jar
COPY build/libs/${JAR} /app.jar
ENTRYPOINT ["java", "-XX:+UseZGC", "-jar", "/app.jar"]

Spring Core and Spring Boot

  • What: DI, configuration properties, profiles, actuator.
  • Why: Clean composition and production toggles are essential for microservices.
  • How: Externalize config, validate @ConfigurationProperties, expose health and info. Java snippet:
@Validated // requires a Bean Validation implementation on the classpath
@ConfigurationProperties(prefix = "service")
record ServiceProps(@NotBlank String name, @NotNull Duration timeout) {}

HTTP APIs (REST)

  • What: Controllers, DTO validation, error handling, idempotency.
  • Why: APIs are the product surface; correctness and predictability reduce incidents.
  • How: Version your API, add problem+json errors, define idempotency keys for writes. Java snippet:
@RestController
@RequestMapping("/v1/users")
class UserController {
  private final UserService svc;
  UserController(UserService s){ this.svc = s; }
  @PostMapping
  ResponseEntity<UserDto> create(@Valid @RequestBody CreateUser req,
                                 @RequestHeader(value="Idempotency-Key", required=false) String idk){
    return ResponseEntity.status(HttpStatus.CREATED).body(svc.create(req, idk));
  }
}
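
The controller above passes the Idempotency-Key down to the service without showing how it is used. One possible shape is sketched below as an in-memory store; IdempotencyStore is a hypothetical name, and a real service would more likely persist keys in a database or Redis with an expiry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// In-memory sketch only; names are assumptions, not the article's code.
final class IdempotencyStore<R> {
    private final ConcurrentMap<String, R> results = new ConcurrentHashMap<>();

    // Runs the action at most once per key and returns the stored result on replays.
    // A null key means the client sent no header, so the action always runs.
    // The action must not return null, or the result will not be remembered.
    R execute(String key, Supplier<R> action) {
        if (key == null) {
            return action.get();
        }
        return results.computeIfAbsent(key, k -> action.get());
    }
}
```

A retried POST with the same key then returns the first response instead of creating a second user.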

Persistence: SQL with JPA and JDBC

  • What: JPA/Hibernate vs plain JDBC; transactions; connection pools.
  • Why: Schema and query design determine scalability; lazy loading and N+1 query patterns are the usual traps.
  • How: Use explicit DTO projections, batch writes, and connection timeouts. JPA snippet:
public interface UserRepo extends JpaRepository<UserEntity, String> {
  @Query("select new com.acme.UserDto(u.id,u.name) from UserEntity u where u.status=:s")
  List<UserDto> findByStatus(@Param("s") Status status);
}

NoSQL and Caching (Redis)

  • What: Key-value (Redis), document (Mongo), column (Cassandra).
  • Why: Latency and scale; choose per access pattern; avoid cache stampedes.
  • How: Cache aside with TTL; prefer small value objects; compress large payloads. Redis config (application.yaml):
spring:
  data:
    redis:
      host: redis
      port: 6379
      timeout: 100ms
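
Cache-aside with a TTL can be sketched in plain Java before wiring in Redis. TtlCache is a hypothetical in-process stand-in, and the clock is injected as an assumption to keep expiry testable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// In-process stand-in for a Redis-backed cache; names are assumptions.
final class TtlCache<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so tests can control time

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Cache-aside read: return the cached value while it is fresh,
    // otherwise call the loader and store the result with a new expiry.
    // compute() is atomic per key, so concurrent misses on one key load once.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old != null && old.expiresAtMillis() > now)
                        ? old
                        : new Entry<>(loader.apply(k), now + ttlMillis));
        return entry.value();
    }
}
```

The per-key atomic load is one simple way to limit stampedes inside a single process; across several instances a distributed lock or staggered TTLs would be needed.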

Messaging with Kafka

  • What: Topics, partitions, consumer groups, delivery semantics.
  • Why: Decoupling and scale; async flows; consumers pull at their own pace, and consumer lag signals when they fall behind.
  • How: Keys define partitioning; configure acks, retries, idempotence. Spring Kafka config (application.yaml):
spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      acks: all
      retries: 5
      properties:
        enable.idempotence: true
    consumer:
      group-id: user-svc
      auto-offset-reset: earliest

Java consumer:

@KafkaListener(topics="user-events", groupId="user-svc")
void on(UserEvent evt){ handler.process(evt); }
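
To make "keys define partitioning" concrete, the sketch below maps a key to a partition by hashing. This is an illustration only: Kafka's default partitioner hashes the serialized key bytes with murmur2, whereas this stand-in uses String.hashCode(). KeyPartitioning is a hypothetical name.

```java
// Illustrative stand-in; not Kafka's actual partitioner.
final class KeyPartitioning {
    // Same key and same partition count always give the same partition,
    // which is what preserves per-key ordering.
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

Using the user ID as the key keeps all events for one user in order on one partition. Changing the partition count changes the mapping, so per-key ordering is only guaranteed while the count stays fixed.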

Transactions, Idempotency, and Outbox

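  • What: Exactly-once is an end-to-end property of the workflow, not something the broker provides on its own; in practice, combine at-least-once delivery with idempotent handlers and an outbox.
  • Why: Prevent double charges and lost messages across service boundaries.
  • How: Write the business row and an outbox row in the same DB transaction; a CDC connector such as Debezium then publishes the outbox rows to Kafka. Outbox table: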
create table outbox(
  id uuid primary key,
  aggregate_id varchar(64),
  type varchar(64),
  payload json,
  created_at timestamp
);
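
The flow around this table can be sketched in memory. In the sketch below, a synchronized method stands in for the database transaction and a polling relay stands in for the connector; both are simplifying assumptions, and InMemoryOutbox is a hypothetical name. Debezium itself reads the database log rather than deleting rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.Consumer;

// In-memory illustration of the outbox flow; not a substitute for a real transaction.
final class InMemoryOutbox {
    record OutboxRow(UUID id, String aggregateId, String type, String payload) {}

    private final Map<String, String> users = new HashMap<>(); // stands in for the business table
    private final List<OutboxRow> outbox = new ArrayList<>();   // stands in for the outbox table

    // Stand-in for one DB transaction: the validation runs before either write,
    // so the user row and the outbox row are stored together or not at all.
    synchronized void createUser(String id, String name) {
        if (users.containsKey(id)) {
            throw new IllegalStateException("duplicate user " + id);
        }
        users.put(id, name);
        outbox.add(new OutboxRow(UUID.randomUUID(), id, "UserCreated", "{\"id\":\"" + id + "\"}"));
    }

    // Stand-in for the relay: publish rows in insert order and remove a row
    // only after the publisher returns. If the publisher throws, the row stays
    // and is retried on the next run, which gives at-least-once delivery.
    synchronized int relay(Consumer<OutboxRow> publisher) {
        int sent = 0;
        while (!outbox.isEmpty()) {
            publisher.accept(outbox.get(0));
            outbox.remove(0);
            sent++;
        }
        return sent;
    }
}
```

Because a row can be published and then fail to be removed, consumers may see duplicates, which is why the outbox is paired with idempotent handlers.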

Microservices Architecture Components

  • What: Service discovery (Eureka/Consul), config server, API gateway, centralized auth.
  • Why: Operate many services with consistent cross-cutting concerns.
  • How: Spring Cloud Config for centralized configuration plus Consul for discovery; keep runtime service-to-service dependencies to a minimum.

API Gateway and Edge Patterns

  • What: Routing, rate-limiting, auth, request shaping.
  • Why: Single ingress for policies and observability.
  • How: Spring Cloud Gateway or Kong/Apigee at edge; validate and normalize headers. Gateway route (yaml):
spring:
  cloud:
    gateway:
      routes:
        - id: user-api
          uri: http://user:8080
          predicates: [ Path=/v1/users/** ]
          filters: [ RemoveRequestHeader=Cookie ]

Resiliency (Retries, Circuit Breakers, Rate Limits)

  • What: Resilience4j for retries and circuit breakers; token bucket for rate limits.
  • Why: Isolate failures and prevent cascades.
  • How: Configure jittered retries; keep each client timeout shorter than the timeout of whoever is calling you, so a slow dependency fails before your own caller gives up. Java snippet:
@Retry(name="userRetry")
@CircuitBreaker(name="userCb")
public UserDto callUpstream(String id){ return client.getUser(id); }
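
The annotations above hide the backoff logic. As a rough illustration of what "jittered retries" means, here is a plain-Java sketch using exponential backoff with full jitter; JitteredRetry and its parameters are hypothetical and this is not Resilience4j's implementation.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

// Plain-Java illustration; names and defaults are assumptions.
final class JitteredRetry {
    // Full jitter: a random delay in [0, min(cap, base * 2^attempt)].
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        long exponential = Math.min(capMillis, baseMillis << Math.min(attempt, 20));
        return ThreadLocalRandom.current().nextLong(exponential + 1);
    }

    // Retries on any RuntimeException up to maxAttempts, sleeping between tries.
    static <T> T call(Supplier<T> action, int maxAttempts, long baseMillis, long capMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
            }
            if (attempt < maxAttempts - 1) {
                try {
                    Thread.sleep(backoffMillis(attempt, baseMillis, capMillis));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during backoff", ie);
                }
            }
        }
        throw last;
    }
}
```

Randomizing the delay spreads retries from many clients over time, so a recovering dependency is not hit by synchronized waves. Retrying is only safe for idempotent operations.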

Observability (Logs, Metrics, Traces)

  • What: Structured logs, Micrometer metrics, OpenTelemetry traces.
  • Why: You cannot fix what you cannot see; SLOs need signals.
  • How: JSON logs; Micrometer to Prometheus; OTLP exporter to Jaeger/Tempo. Micrometer counter:
Counter created = Counter.builder("user_created_total").register(meterRegistry);
created.increment();

OTel exporter (yaml):

management:
  otlp:
    tracing:
      endpoint: http://otel-collector:4318/v1/traces

Security (OAuth2/OIDC)

  • What: Spring Security with resource server; Keycloak/Okta as IdP.
  • Why: Token-based auth scales across services and supports zero trust: every service verifies the token instead of trusting the network perimeter.
  • How: Bearer tokens with scopes; fine-grained authorities via claims. Java config:
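@Configuration
@EnableWebSecurity
class SecCfg {
  @Bean SecurityFilterChain http(HttpSecurity h) throws Exception {
    // Only health and info are public; other actuator endpoints require a token.
    h.authorizeHttpRequests(a -> a.requestMatchers("/actuator/health/**", "/actuator/info").permitAll()
                                   .anyRequest().authenticated())
     // jwt() without arguments is deprecated since Spring Security 6.1
     .oauth2ResourceServer(o -> o.jwt(Customizer.withDefaults()));
    return h.build();
  }
}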
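
To illustrate "authorities via claims", the sketch below turns a space-delimited scope claim into authority strings with a SCOPE_ prefix, which mirrors the default naming Spring Security uses for JWT scopes. ScopeAuthorities and the users:write scope are hypothetical names for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Plain-Java illustration of the claim-to-authority mapping; names are assumptions.
final class ScopeAuthorities {
    // "users:read users:write" -> {"SCOPE_users:read", "SCOPE_users:write"}
    static Set<String> fromScopeClaim(String scopeClaim) {
        if (scopeClaim == null || scopeClaim.isBlank()) {
            return Set.of();
        }
        return Arrays.stream(scopeClaim.trim().split("\\s+"))
                .map(scope -> "SCOPE_" + scope)
                .collect(Collectors.toUnmodifiableSet());
    }

    // Hypothetical check a write endpoint might apply.
    static boolean canWriteUsers(Set<String> authorities) {
        return authorities.contains("SCOPE_users:write");
    }
}
```

In a Spring application the same rule is usually expressed declaratively, for example by requiring the authority on a request matcher, rather than checked by hand.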

Testing (Unit, Integration, Contract)

  • What: JUnit5, Testcontainers, WireMock, Pact.
  • Why: Prevent regressions and verify contracts across services.
  • How: Run PostgreSQL/Kafka via Testcontainers in CI for realism. JUnit + Testcontainers:
// The enclosing test class needs @Testcontainers so the @Container field is started and stopped.
@Container static PostgreSQLContainer<?> pg = new PostgreSQLContainer<>("postgres:16");

CI/CD, Containers, and Kubernetes

  • What: Pipelines (GitHub Actions), image build, deployment strategies.
  • Why: Safe, fast releases; progressive rollouts reduce risk.
  • How: Blue/green or canary; Helm charts; GitOps (ArgoCD). K8s snippet:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user
spec:
  replicas: 3
  selector:
    matchLabels: { app: user }
  template:
    metadata:
      labels: { app: user }
    spec:
      containers:
      - name: user
        image: acme/user:1.0.0
        resources:
          requests: { cpu: "200m", memory: "256Mi" }
          limits:   { cpu: "1",    memory: "512Mi" }

Monitoring Stack and SLOs

  • What: Prometheus, Grafana, Loki/ELK, Jaeger.
  • Why: Close the loop with alerts on SLO burn rates.
  • How: Build RED and USE dashboards; set actionable alerts with runbooks.
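
Burn rate is the observed error ratio divided by the error budget (1 minus the SLO target). The arithmetic can be sketched as follows; BurnRate is a hypothetical helper and the numbers in the usage note are illustrative.

```java
// Worked arithmetic for SLO burn rate; names and example values are assumptions.
final class BurnRate {
    // errors / total, divided by the allowed error ratio (1 - sloTarget).
    static double burnRate(long errors, long total, double sloTarget) {
        if (total == 0) {
            return 0.0;
        }
        double errorRatio = (double) errors / total;
        double budget = 1.0 - sloTarget;
        return errorRatio / budget;
    }

    // How long the budget for the whole SLO window lasts at a constant burn rate.
    static double hoursToExhaustion(double burnRate, double windowDays) {
        return windowDays * 24.0 / burnRate;
    }
}
```

For example, with a 99.9% SLO, 144 errors in 10,000 requests is an error ratio of 1.44% against a 0.1% budget, so the burn rate is about 14.4. At that rate a 30-day budget would be gone in roughly 50 hours, which is the kind of condition worth paging on.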

Capstone Project: User / Order / Payment (Kafka + Outbox + Gateway + Observability)

Goal: ship three services with reliable messaging, API gateway, and full telemetry.

Repo structure:

acme-platform/
  gateway/                # Spring Cloud Gateway
  user-service/           # PostgreSQL + outbox table
  order-service/          # PostgreSQL + outbox table
  payment-service/        # PostgreSQL + outbox table
  infra/
    docker-compose.yml    # Postgres, Kafka, Schema Registry, Debezium, Prometheus, Grafana, Jaeger
    topics.sh             # create topics: user-events, order-events, payment-events

Gateway routes (application.yaml):

spring:
  cloud:
    gateway:
      routes:
        - id: user
          uri: http://user-service:8080
          predicates: [ Path=/v1/users/** ]
        - id: order
          uri: http://order-service:8080
          predicates: [ Path=/v1/orders/** ]
        - id: payment
          uri: http://payment-service:8080
          predicates: [ Path=/v1/payments/** ]

Outbox table (shared shape across services):

create table if not exists outbox(
  id uuid primary key,
  aggregate_id varchar(128) not null,
  type varchar(64) not null,
  payload jsonb not null,
  created_at timestamptz default now()
);
create index if not exists idx_outbox_created on outbox(created_at);

Service configuration (user-service application.yaml):

spring:
  datasource:
    url: jdbc:postgresql://postgres:5432/userdb
    username: user
    password: pass
  jpa:
    hibernate:
      ddl-auto: validate
  kafka:
    bootstrap-servers: kafka:9092
    producer:
      acks: all
      properties: { enable.idempotence: true }
management:
  endpoints:
    web.exposure.include: health,info,prometheus

Debezium connector (user outbox -> Kafka):

{
  "name": "user-outbox",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.dbname": "userdb",
    "plugin.name": "pgoutput",
    "table.include.list": "public.outbox",
    "topic.prefix": "user",
    "tombstones.on.delete": "false"
  }
}

OpenTelemetry + Micrometer (user-service application.yaml):

management:
  tracing:
    sampling.probability: 0.1
  otlp:
    tracing.endpoint: http://otel-collector:4318/v1/traces
  prometheus.metrics.export.enabled: true

Consumer pattern (order-service consumes user events from the Debezium topic, which is named <topic.prefix>.<schema>.<table>):

@KafkaListener(topics = "user.public.outbox")
void on(ConsumerRecord<String,String> rec){
  // parse payload, apply idempotency using event id
}
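
The "apply idempotency using event id" comment can be sketched as a small wrapper that skips event IDs it has already processed. DedupingHandler is a hypothetical name, and the in-memory set is an assumption; a real consumer would more likely record processed IDs in its own database, in the same transaction as the state change.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// In-memory sketch of an idempotent consumer; names are assumptions.
final class DedupingHandler<E> {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    private final Consumer<E> delegate;

    DedupingHandler(Consumer<E> delegate) {
        this.delegate = delegate;
    }

    // Returns true if the event was processed, false if it was a duplicate.
    boolean handle(String eventId, E event) {
        if (!seen.add(eventId)) {
            return false; // already processed, skip
        }
        try {
            delegate.accept(event);
            return true;
        } catch (RuntimeException e) {
            seen.remove(eventId); // forget the id so a redelivery can retry
            throw e;
        }
    }
}
```

With the outbox table above, the row's id column is a natural event ID, since it stays the same across redeliveries.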

docker-compose (infra/docker-compose.yml) highlights:

services:
  postgres: { image: postgres:16 }
  kafka: { image: confluentinc/cp-kafka:7.6.0 }
  schema-registry: { image: confluentinc/cp-schema-registry:7.6.0 }
  debezium: { image: debezium/connect:2.6 }
  prometheus: { image: prom/prometheus }
  grafana: { image: grafana/grafana }
  jaeger: { image: jaegertracing/all-in-one }

Printable Service Checklist

Copy, paste, and print per service (User / Order / Payment).

  • Deployment

    • Image uses slim JRE; layered jar; SBOM stored.
    • Readiness/liveness probes configured.
    • Resource requests/limits set with headroom.
    • Config/secret via env or vault; no secrets in images.
  • Resilience

    • Timeouts set on all clients; retries with jitter; circuit breakers on remote calls.
    • Bulkheads (thread pools) bounded; rate limits at gateway and service.
    • Idempotency keys for writes; outbox + CDC for cross-service events.
    • DLQ and replay plan documented; backoff tuned.
  • Observability

    • Structured JSON logs with correlation/trace IDs.
    • Micrometer metrics: RED (rate, errors, duration).
    • Traces exported via OTLP; sampling rate configurable without a redeploy.
    • Dashboards and alerts exist with runbooks; SLO defined.

Step-by-Step Roadmap

  1. Java and JVM fundamentals; collections, streams, records.
  2. Concurrency: thread pools, futures, timeouts; intro to Reactor or Loom.
  3. Spring Boot core: configs, profiles, actuator; solid REST APIs.
  4. SQL mastery: normalization, indexing, transactions, JPA pitfalls.
  5. NoSQL + Redis: pick per access pattern; cache correctness and TTLs.
  6. Kafka: producers/consumers, keys, partitions, idempotent writes.
  7. Resiliency: retries, timeouts, circuit breakers, bulkheads; rate limits.
  8. Observability: logs, Micrometer metrics, OpenTelemetry traces.
  9. Security: OAuth2/OIDC resource server; propagate identity across services.
  10. Architecture components: config, discovery, gateway; cross-cutting policies.
  11. Data correctness patterns: idempotency, outbox, CDC; backfills and replay.
  12. Platform: Docker, K8s, Helm; blue/green and canary; GitOps.
  13. Monitoring and SLOs: burn rate alerts, triage, and postmortems.

Focus on outcomes: lower P99 latency, fewer incident tickets, faster safe releases.
