Gordon a53f1b32bd config: Update Docker Compose configuration
- Changed Caddy ports from 80/443 to 1080/1443 for non-root deployment
- Changed Caddy image from caddy:2.9-alpine to caddy:latest
- Updated container names: keycloak-db -> postgres, keycloak-app -> keycloak
- Added restart: always policy for all services
- Added json-file logging with 10MB max size and 3 file rotation
- Added health_port 9000 to Caddy health check configuration
- Changed PostgreSQL healthcheck to explicit command with pg_isready
- Added KC_HTTP_ENABLED and KC_HTTP_MANAGEMENT_HEALTH_ENABLED to Keycloak
- Updated Caddy volume paths to use local directory (./caddy_volume/)
2026-04-10 16:05:15 +01:00
Caddyfile config: Update Docker Compose configuration 2026-04-10 16:05:15 +01:00
docker-compose.yml config: Update Docker Compose configuration 2026-04-10 16:05:15 +01:00
Dockerfile config: Update Docker Compose configuration 2026-04-10 16:05:15 +01:00
init-replication.sh Initial commit: Add Keycloak Docker scaleout with multi-stage build, Caddy reverse proxy, PostgreSQL replication, and sidecar backup 2026-03-31 23:12:54 +01:00
README.md docs: Update README.md with Docker Compose configuration details 2026-04-10 16:05:15 +01:00
sample.env Initial commit: Add Keycloak Docker scaleout with multi-stage build, Caddy reverse proxy, PostgreSQL replication, and sidecar backup 2026-03-31 23:12:54 +01:00

Keycloak Docker Scaleout

Docker Compose Configuration Details

This project uses several Docker Compose features for production-grade deployments:

Logging Configuration

All services use the json-file logging driver with the following settings:

  • Max Size: 10MB per log file
  • Max Files: 3 files (rotation)

This prevents log files from consuming excessive disk space. View logs with:

docker logs keycloak-proxy
docker logs postgres
docker logs keycloak
docker logs keycloak-db-backup
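
The rotation settings above put an upper bound on log disk usage: containers × max size × max files. The sketch below computes that bound. The helper name log_budget_mb is hypothetical, and the count of four containers is taken from the docker logs commands above.

```shell
# Worst-case log usage in MB: containers x max-size (MB) x max-file.
# log_budget_mb is an illustrative helper, not part of this repository.
log_budget_mb() {
  containers=$1; max_size_mb=$2; max_files=$3
  echo $(( containers * max_size_mb * max_files ))
}

# Four containers, 10 MB per file, 3 files each.
log_budget_mb 4 10 3   # prints 120

# To confirm the settings Docker actually applied to a container:
#   docker inspect --format '{{json .HostConfig.LogConfig}}' keycloak
```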

Restart Policies

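All services are configured with restart: always, which ensures:

  • Containers automatically restart if they crash
  • Containers restart after a system reboot, provided the Docker daemon starts at boot

The restart policy does not control startup order. Ordering on docker compose up comes from service dependencies and health checks.

View a container's restart count with:

docker inspect --format '{{.RestartCount}}' keycloak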

Health Checks

PostgreSQL uses the pg_isready command to verify the database is accepting connections. Keycloak waits for PostgreSQL to pass its health check before starting.
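
Scripts that run after deployment can wait on the same health status that Docker records for a container. The following is a minimal sketch, assuming the container defines a health check. The helper names wait_until_healthy and container_health are hypothetical.

```shell
# Poll a status command until it prints "healthy" or attempts run out.
# wait_until_healthy is an illustrative helper, not part of this repository.
# Usage: wait_until_healthy <attempts> <delay-seconds> <command...>
wait_until_healthy() {
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$("$@" 2>/dev/null) || status=""
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$(( i + 1 ))
    sleep "$delay"
  done
  return 1
}

# Read the health status Docker records for a container.
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}

# Example (needs the running stack):
#   wait_until_healthy 30 2 container_health postgres && echo "postgres is ready"
```

Passing the status command as an argument keeps the polling loop independent of Docker, so the same loop can wrap any check that prints a status word.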


A production-ready solution for deploying Keycloak with PostgreSQL using Docker Compose, designed for seamless transition from a single VPS to a multi-region cluster architecture.


Architecture Overview

This setup uses:

  • Caddy as a reverse proxy with automatic TLS certificate management via Let's Encrypt

  • Keycloak 26 with a multi-stage Docker build for optimized production deployment

  • PostgreSQL 17 with physical streaming replication support

  • Sidecar backup container for automated database backups

  • JDBC-based clustering for automatic node discovery

  • HTTP enabled on port 8080 (management health on port 9000)

  • Production optimizations via --optimized flag

  • Separate management port for health checks (KC_HTTP_MANAGEMENT_HEALTH_ENABLED=true)


Prerequisites

Before you begin, ensure you have:

  1. Domain Name: A domain name pointing to your primary VPS IP address
  2. VPS Requirements:
    • Linux-based VPS (Ubuntu 22.04+ or similar)
    • At least 4GB RAM (8GB recommended for production)
    • Docker and Docker Compose installed
  3. Network Access: Ports 1080 and 1443 must be open for Caddy (or configure your firewall for the ports you choose)

Preparation

Step 1: Directory Setup

Put all files in a single directory on your primary VPS:

mkdir -p ~/keycloak-cluster
cd ~/keycloak-cluster
# Copy all files from this repository into this directory

Note: This setup stores Caddy data and config in local directories under ./caddy_volume/, alongside local directories for PostgreSQL data (postgres_data) and backups (backups).

Step 2: Domain Configuration

Ensure your domain name is already pointing to your VPS IP address:

# Verify DNS is working
nslookup your-auth-domain.com
# or
dig your-auth-domain.com

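Let's Encrypt validation requires the domain to resolve correctly before certificate issuance. Its HTTP-01 and TLS-ALPN-01 challenges connect to the public ports 80 and 443, so those ports must be forwarded to Caddy's published ports 1080 and 1443.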

Step 3: Environment Configuration

Copy and configure the sample environment file:

cp sample.env .env
nano .env  # or use your preferred editor

Configure the following variables:

Variable                 Description                            Example
DOMAIN                   Your Keycloak hostname                 auth.example.com
LETSENCRYPT_EMAIL        Email for Let's Encrypt certificates   admin@example.com
POSTGRES_DB              Database name for Keycloak             keycloak
POSTGRES_USER            Database user for Keycloak             keycloak
POSTGRES_PASSWORD        Password for the database user         Secure password
REPLICATION_USER         User for PostgreSQL replication        replicator
REPLICATION_PASSWORD     Password for replication user          Secure password
KEYCLOAK_ADMIN           Keycloak admin username                admin
KEYCLOAK_ADMIN_PASSWORD  Keycloak admin password                Secure password
KC_DB_PASSWORD           Database password for Keycloak         Same as POSTGRES_PASSWORD
BACKUP_SCHEDULE          Cron schedule for backups              @daily
BACKUP_KEEP_DAYS         Number of days to retain backups       7
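
Before deploying, it can help to confirm that every variable listed above has a value. The following is a minimal sketch. The helper name check_env_file is hypothetical, and it assumes plain NAME=value lines with no export prefix or surrounding spaces.

```shell
# Report variables that are missing or empty in an env file.
# check_env_file is an illustrative helper; the variable list mirrors the table above.
check_env_file() {
  env_file=$1
  missing=0
  for name in DOMAIN LETSENCRYPT_EMAIL POSTGRES_DB POSTGRES_USER \
              POSTGRES_PASSWORD REPLICATION_USER REPLICATION_PASSWORD \
              KEYCLOAK_ADMIN KEYCLOAK_ADMIN_PASSWORD KC_DB_PASSWORD \
              BACKUP_SCHEDULE BACKUP_KEEP_DAYS; do
    # A variable counts as set when a line reads NAME=<at least one character>.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "missing or empty: ${name}"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   check_env_file .env && echo "all variables set"
```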

Service Ports

Service               Internal Port   External Port   Purpose
Caddy (HTTP)          80              1080            Web proxy for Keycloak
Caddy (HTTPS)         443             1443            TLS-terminated web proxy
PostgreSQL            5432            -               Internal database access
Keycloak              8080            -               Internal application server
Keycloak Management   9000            -               Internal health checks

Deployment

Step 1: Start the Initial Cluster

Run Docker Compose to deploy all services:

docker compose up -d

This will:

  1. Build the optimized Keycloak Docker image
  2. Start PostgreSQL with replication-ready configuration
  3. Initialize the database with replication user and slot
  4. Start Keycloak connected to the database with HTTP and management health endpoints enabled
  5. Start Caddy on ports 1080 (HTTP) and 1443 (HTTPS) which automatically requests TLS certificates
  6. Start the backup container

Note: Caddy is configured to use ports 1080 and 1443 instead of the standard 80 and 443. Adjust your firewall rules accordingly. The Caddy container image has been updated to caddy:latest for the most recent features and security patches.

Step 2: Verify Deployment

Check that all services are running:

docker compose ps

Expected output shows all containers as "Up":

NAME                IMAGE                       STATUS
keycloak-proxy      caddy:latest                Up
postgres            postgres:17-alpine          Up
keycloak            keycloak-docker-scaleout    Up
keycloak-db-backup  prodrigestivill/...         Up

Note: All containers are configured with restart: always to ensure automatic recovery after failures or system reboots.

Step 3: Access Keycloak

Wait 1-2 minutes for Let's Encrypt certificates to be issued, then access:

https://your-domain.com:1443

Note: Since Caddy is configured on port 1443 instead of 443, you need to include the port in the URL unless you configure a reverse proxy or firewall to redirect traffic.

You should see the Keycloak login page with a valid HTTPS certificate.
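
One way to do the firewall redirect mentioned in the note above is an iptables NAT rule on the host. The sketch below only prints the rules so they can be reviewed before use. The helper name print_redirect_rule is hypothetical. Applying the printed commands requires root, a host that uses iptables, and a host-side listener on the published port (as with Docker's default userland proxy). These are assumptions about your environment, not something this repository configures.

```shell
# Print an iptables NAT rule that redirects a public port to Caddy's published port.
# print_redirect_rule is an illustrative helper; it prints the command and changes nothing.
print_redirect_rule() {
  public_port=$1; caddy_port=$2
  echo "iptables -t nat -A PREROUTING -p tcp --dport ${public_port} -j REDIRECT --to-ports ${caddy_port}"
}

print_redirect_rule 80 1080
print_redirect_rule 443 1443
```

With both rules in place, https://your-domain.com reaches Caddy without the :1443 suffix, and Let's Encrypt challenges sent to ports 80 and 443 arrive at Caddy as well.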


Database Replication Setup

Understanding the Replication Strategy

For scaling to multiple regions, PostgreSQL uses physical streaming replication where:

  • The primary VPS (initial deployment) acts as the primary server
  • Additional VPS instances act as standby servers that stream changes
  • Read queries can be distributed to standbys, writes go to primary

Step 1: Prepare the Primary Database

The initial deployment automatically:

  1. Creates a replication user (REPLICATION_USER)
  2. Creates a physical replication slot
  3. Updates pg_hba.conf to allow replication connections

Verify the replication slot exists:

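# Uses the container's own POSTGRES_USER; the official image creates no "postgres" role when POSTGRES_USER is set
docker exec postgres sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT * FROM pg_replication_slots;"'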

Step 2: Create Standby Nodes

On each secondary VPS (future standby server):

  1. Install Docker and Docker Compose
  2. Create a directory and copy the necessary files (Dockerfile is not needed)
  3. Create a standby-compose.yml file:
version: '3.8'

services:
  postgres-standby:
    image: postgres:17-alpine
    container_name: keycloak-db-standby
    environment:
      POSTGRES_DB: keycloak
      POSTGRES_USER: keycloak
      POSTGRES_PASSWORD: your_secure_password
    command: >
      postgres -c wal_level=replica
               -c max_wal_senders=10
               -c max_replication_slots=10
               -c hot_standby=on
    volumes:
      - ./postgres_data:/var/lib/postgresql/data
    networks:
      - keycloak-net

networks:
  keycloak-net:
    driver: bridge
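  4. Run the standby database:
# Stop and remove any existing container
docker compose -f standby-compose.yml down

# Clone the primary database using pg_basebackup
# (postgres_data must be empty; REPLICATION_PASSWORD is the value from the primary's .env)
docker run --rm \
  -e PGPASSWORD="$REPLICATION_PASSWORD" \
  -v "$(pwd)/postgres_data:/var/lib/postgresql/data" \
  postgres:17-alpine \
  pg_basebackup -h <primary-vps-ip> -U replicator -D /var/lib/postgresql/data -Xs -P -R --slot=replication_slot_primary

# Replace <primary-vps-ip> with your primary VPS IP address.
# The primary's port 5432 must be reachable from this host.

# Start the standby database
docker compose -f standby-compose.yml up -d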

The -R flag creates a standby.signal file and postgresql.auto.conf with connection info.

  5. Verify replication is working:
docker exec keycloak-db-standby sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT * FROM pg_stat_wal_receiver;"'

The standby should report an active WAL receiver. On the primary, the standby connection is listed in pg_stat_replication.


Scaling Out Keycloak Nodes

Step 1: Deploy Additional Keycloak Instances

On each additional VPS where you want to run Keycloak:

  1. Copy the Dockerfile and create a keycloak-compose.yml:
version: '3.8'

services:
  keycloak:
    build: .
    container_name: keycloak
    command: start --optimized
    environment:
      KC_DB: postgres
      KC_DB_URL: jdbc:postgresql://<primary-vps-ip>:5432/keycloak
      KC_DB_USERNAME: keycloak
      KC_DB_PASSWORD: your_db_password
      KC_HOSTNAME: your-cluster-domain.com
      KC_PROXY_HEADERS: xforwarded
      KEYCLOAK_ADMIN: admin
      KEYCLOAK_ADMIN_PASSWORD: your_admin_password
      # Clustering configuration (Infinispan cache with JDBC_PING discovery)
      KC_CACHE: ispn
      KC_CACHE_STACK: jdbc-ping
    networks:
      - keycloak-net
    restart: unless-stopped

networks:
  keycloak-net:
    driver: bridge
  2. Start the additional Keycloak node:
docker compose -f keycloak-compose.yml up -d

Step 2: Verify Clustering

Keycloak 26 uses JDBC-based clustering with jdbc-ping for automatic node discovery. All nodes connect to the same PostgreSQL database and automatically discover each other.

Check that nodes are discovering each other by viewing the logs:

docker logs keycloak | grep -i cluster

You should see messages about cluster membership and node discovery.

Step 3: Add Keycloak Nodes to Caddy Load Balancing

Update your primary Caddyfile to include all Keycloak nodes:

{$DOMAIN} {
    reverse_proxy keycloak:8080 your-secondary-node-ip:8080 {
        header_up X-Forwarded-Proto {scheme}
        lb_policy cookie
        health_uri /health/live
        health_port 9000
        health_interval 10s
        health_status 200
    }
}

Reload Caddy configuration:

docker exec keycloak-proxy caddy reload --config /etc/caddy/Caddyfile

Note: When scaling to multiple Keycloak nodes, each node should be configured with the same KC_DB_URL pointing to your primary PostgreSQL database.


Backup and Recovery

Automated Backups

The backup container (keycloak-db-backup) runs on a schedule defined in .env:

# View backup logs
docker logs keycloak-db-backup

# List backups
ls -la backups/
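
To preview which backup files have aged past the BACKUP_KEEP_DAYS window, a find-based check is one option. This is a sketch only. The helper name list_expired_backups is hypothetical, and it only lists files and deletes nothing.

```shell
# List backup files older than the retention window.
# list_expired_backups is an illustrative helper, not part of this repository.
list_expired_backups() {
  backup_dir=$1; keep_days=$2
  # -mtime +N matches files whose age, rounded down to whole days, exceeds N.
  find "$backup_dir" -type f -mtime +"$keep_days" | sort
}

# Example:
#   list_expired_backups backups 7
```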

Manual Backup

Trigger a backup on demand by running pg_dump inside the postgres container:

docker exec postgres sh -c 'pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB"' > backup.sql

Restore from Backup

# Stop Keycloak so nothing writes during the restore (PostgreSQL must stay running)
docker stop keycloak

# Restore the backup (use 'postgres' container name)
docker exec -i postgres sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB"' < backup.sql

# Start services
docker compose up -d

Monitoring and Health Checks

Keycloak Health Endpoints

Keycloak exposes health checks on the management port (9000), which is reachable only inside the Docker network:

  • Liveness: http://keycloak:9000/health/live
  • Readiness: http://keycloak:9000/health/ready
  • Metrics: http://keycloak:9000/metrics (requires KC_METRICS_ENABLED=true)

Database Health

Check PostgreSQL replication status on primary:

docker exec postgres sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT * FROM pg_stat_replication;"'

Check if standby is in recovery mode:

docker exec keycloak-db-standby sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT pg_is_in_recovery();"'
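
For a single lag figure per standby, pg_stat_replication can be combined with pg_wal_lsn_diff on the primary. The following is a sketch. The helper names replication_lag_sql and lag_status are hypothetical, as is any threshold you pass in, and the example command assumes the database user and database are both named keycloak as in the sample configuration.

```shell
# SQL that reports per-standby replay lag in bytes (run on the primary).
replication_lag_sql() {
  echo "SELECT application_name, state, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes FROM pg_stat_replication;"
}

# Classify a lag value (bytes) against a threshold (bytes).
# lag_status is an illustrative helper, not part of this repository.
lag_status() {
  lag_bytes=$1; threshold=$2
  if [ "$lag_bytes" -gt "$threshold" ]; then
    echo "lagging"
  else
    echo "ok"
  fi
}

# Example (needs the running primary):
#   docker exec postgres psql -U keycloak -d keycloak -At -c "$(replication_lag_sql)"
```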

Troubleshooting

Let's Encrypt Certificate Issues

If certificates fail to issue:

# Check Caddy logs
docker logs keycloak-proxy

# Verify DNS and port 80 access
curl -I http://your-domain.com/.well-known/acme-challenge/

Database Connection Failures

# Check PostgreSQL is accepting connections
docker exec postgres pg_isready

# Check Keycloak logs
docker logs keycloak | grep -i postgres

Replication Issues

# Check replication lag on primary
docker exec postgres sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT * FROM pg_stat_replication;"'

# Check if standby is receiving WAL
docker exec keycloak-db-standby sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT * FROM pg_stat_wal_receiver;"'

Keycloak Clustering Issues

# Verify JDBC-Ping is enabled
docker logs keycloak | grep -i "jdbc-ping"

# Check cluster membership
docker logs keycloak | grep -i "JGroups"

Production Checklist

Before going to production:

  • All passwords are strong and unique
  • TLS certificates are issued and valid
  • Database replication is working
  • Backups are running on schedule
  • Monitoring and alerting are configured
  • Firewall rules are restrictive (only allow necessary ports)
  • Regular security updates are planned
  • Disaster recovery procedures are documented

Additional Resources