Reliable apps with (HA)Proxy — Part 1.

Club Sandwich

Our starting usage pattern is a “sandwich”-like structure, where the application sits between two HAProxy layers. All incoming traffic goes through HAProxy, which provides protection, SSL termination, and logging, while the app is not exposed to the outside world (it usually listens only on localhost/loopback). In addition, all outgoing (server-initiated) traffic also goes through HAProxy: database connections, Redis/Memcached, calls to other HTTP services, and logging.

global
    log stderr local0 debug
    log 127.0.0.1:514 len 8192 format rfc5424 local1 debug
    ...
defaults
    log global
    mode http
    option httplog clf
    option dontlognull
    option forwardfor
    # healthchecks
    option httpchk
    default-server check inter 1s fall 5 downinter 2s rise 5
    # timeouts
    timeout connect 5s
    timeout client 10s
    timeout server 10s
    timeout http-request 1s
listen myapp
    http-check send meth GET uri /health
    bind :80
    bind :443 ssl crt /path/to/dir/with/certs/ alpn h2,http/1.1
    server myapp01 127.0.0.1:8080 check maxconn 32
listen myservice
    bind 127.0.0.1:8081
    server myservice01 1.2.3.4:443 ssl verify required ca-file CA_chain.pem

HAProxy provides a very rich set of HTTP server monitoring options. You can configure regular check intervals, use different intervals when the service is in transition or down, write your own checks with a binary payload, or use the Lua API to write complicated checks.

We’ve also configured HTTP access logs in Common Log Format (think of Apache and Nginx), available at little performance cost and without the need to produce HTTP logs from your application.

It can also provide out-of-the-box protection from some types of attacks, such as Slowloris (simply because it can enforce various timeouts). It can also prevent overloading of your service by keeping tabs on the number of concurrent connections (maxconn).

We’ve also added a fictional service, to which we can connect locally over plain HTTP, and HAProxy will convert that to proper HTTPS with certificate checking (internal services often use self-signed certs, and it’s a pain to make that work in application code).
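From the application's side, the fictional service is then just a local plain-HTTP endpoint. Here is a minimal sketch of that idea, assuming curl is installed. Only the 127.0.0.1:8081 address comes from the config above; the MYSERVICE_URL variable, the function names and the /status path are hypothetical:

```shell
#!/bin/sh
# Base URL of the local HAProxy listener ("listen myservice" above).
# MYSERVICE_URL is a hypothetical override, not something HAProxy defines.
MYSERVICE_URL="${MYSERVICE_URL:-http://127.0.0.1:8081}"

# Build the URL for a path on the service; no TLS details are needed here,
# because HAProxy handles HTTPS and certificate verification upstream.
myservice_url() {
    printf '%s/%s\n' "${MYSERVICE_URL%/}" "${1#/}"
}

# Fetch a path through HAProxy.
# --fail makes curl exit non-zero on HTTP errors, --max-time bounds the request.
myservice_get() {
    curl --silent --show-error --fail --max-time 10 "$(myservice_url "$1")"
}

# Usage (hypothetical path): myservice_get /status
```

The point of the design is that certificate files and CA chains live in one place (the HAProxy config), not in every client of the service.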

Double PostgreSQL

Even when working with a single PostgreSQL server, it is nice not to worry about the database location from the app’s point of view: we want to pretend that the database is always on localhost. Adding some resilience with HAProxy is also very easy: just add another (backup) server.

listen pgsql
    mode tcp
    option pgsql-check user postgres_user
    bind 127.0.0.1:5432
    server primary 10.0.0.1:5432 check
    server secondary 10.0.0.2:5432 check backup

As you can see, HAProxy can natively check for PostgreSQL availability, which is better than just testing whether the TCP port is open. If something happens to the first server, HAProxy will route traffic to the second “backup” server. Keep in mind that HAProxy only routes connections: the backup has to be writable (promoted) before it can take over writes. Still, without changing a single line of code in your application, you’ve increased reliability and availability!
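To see which server is currently answering behind the local listener, one option is to ask PostgreSQL itself. This is a small sketch, assuming psql is installed and credentials are already set up; the PSQL variable, the function name and the database argument are illustrative and not part of the setup above:

```shell
#!/bin/sh
# PSQL is a hypothetical override for the psql binary (handy for testing).
PSQL="${PSQL:-psql}"

# Print the address of the PostgreSQL server that accepted the connection
# made through the local HAProxy listener (127.0.0.1:5432).
# $1 = database name (any database the user is allowed to connect to)
current_backend() {
    # -A (unaligned) and -t (tuples only) print just the value.
    "$PSQL" -At -h 127.0.0.1 -p 5432 -d "$1" -c 'select inet_server_addr();'
}

# Usage: current_backend mydb
```

While the primary is healthy this should print the primary's address; once HAProxy marks it down, the same command should report the backup instead.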

We won’t go into too much detail about configuring PostgreSQL replication here, but we can warmly recommend the combination of repmgr and Barman (or pgBackRest for large DBs). Enough said.

When you cross the line of running multiple PostgreSQL servers for failover, you often want to use replica servers for read-only queries (i.e. “read replicas”), since a large percentage of queries are only simple selects (e.g. 75%/25% split for reads/writes). If you can arrange your application code to use two connections (or two sets of credentials), for reading/writing, HAProxy can take care of the details, routing, fail-over, etc.

listen pgsql_ro
    mode tcp
    balance leastconn
    # if you allow read connections to the primary,
    # just use the standard: option pgsql-check user postgres_user
    option external-check
    external-check command /path/to/check_pg_in_recovery.sh
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    bind 127.0.0.1:6432
    server pg1 10.0.0.1:5432 check maxconn 100
    server pg2 10.0.0.2:5432 check maxconn 100
    server pg3 10.0.0.3:5432 check maxconn 100
    server pg4 10.0.0.4:5432 check maxconn 100
listen pgsql_rw
    mode tcp
    option external-check
    external-check command /path/to/check_pg_not_in_recovery.sh
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    bind 127.0.0.1:7432
    server pg1 10.0.0.1:5432 check maxconn 100
    server pg2 10.0.0.2:5432 check maxconn 100
    server pg3 10.0.0.3:5432 check maxconn 100
    server pg4 10.0.0.4:5432 check maxconn 100

This is already a powerful configuration, using a pool of 4 servers. One of the servers is the “primary”, while the others are “in recovery” (i.e. replaying the Write Ahead Log, applying database transactions from the primary). They can still serve read-only requests.

We use an external-check script to run a simple SQL statement: “SELECT pg_is_in_recovery()”. HAProxy passes the server address and port to the script (as arguments and as environment variables), so it can target the correct server. Note that external checks must also be enabled with the external-check keyword in the global section. There are other ways to execute the check: HAProxy can send and match binary payloads in checks, so you can (ab)use that to your advantage. Even if you don’t know the PostgreSQL protocol, you could record this exchange with tcpdump and translate it into HAProxy config. Just for fun, we’ve implemented something similar with Lua (TBD).

A simple check_pg_in_recovery.sh could look like this (use .pgpass file for username/password):

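#!/bin/sh
# HAProxy passes: <proxy_address> <proxy_port> <server_address> <server_port>
HOST="$3"
PORT="$4"
DATABASE="checkdb"
# -A (unaligned) and -t (tuples only) make psql print a bare "t" or "f"
IN_RECOVERY="$( /usr/bin/psql -At -h "$HOST" -p "$PORT" -d "$DATABASE" -c 'select pg_is_in_recovery();' )"
# exit 0 (healthy) only for servers that are in recovery, i.e. replicas
if [ "$IN_RECOVERY" = "t" ]; then
    exit 0
else
    exit 1
fi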
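The pgsql_rw listener needs the opposite check. A possible check_pg_not_in_recovery.sh is sketched below, under the same assumptions (a "checkdb" database and credentials available to psql). The PSQL and CHECK_DATABASE variables and the is_primary function are our own additions to keep the logic testable, not HAProxy features:

```shell
#!/bin/sh
# check_pg_not_in_recovery.sh: healthy (exit 0) only when the server is a primary.
# HAProxy passes: <proxy_address> <proxy_port> <server_address> <server_port>

# Hypothetical overrides, useful for testing the script by hand.
PSQL="${PSQL:-/usr/bin/psql}"
DATABASE="${CHECK_DATABASE:-checkdb}"

# $1 = server address, $2 = server port
is_primary() {
    # -A (unaligned) and -t (tuples only) make psql print a bare "t" or "f".
    in_recovery="$("$PSQL" -At -h "$1" -p "$2" -d "$DATABASE" -c 'select pg_is_in_recovery();' 2>/dev/null)"
    # Anything other than "f" (including a failed connection, which yields
    # empty output) counts as "not a usable primary".
    [ "$in_recovery" = "f" ]
}

# Run the check only when all four arguments are supplied.
if [ "$#" -ge 4 ]; then
    is_primary "$3" "$4"
    exit $?
fi
```

It can also be run by hand with four arguments, for example with the listener address and one of the servers from the config: check_pg_not_in_recovery.sh 127.0.0.1 7432 10.0.0.1 5432; echo $?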

One final step for this HAProxy setup is to teach our application code to speak to the database through two connections, or to simulate connections to two databases, like this implementation of the multi-database idea for Django. PgBouncer is a great tool to help you with this. PgBouncer also knows many cool PostgreSQL tricks, like the PAUSE command, which lets you perform a transparent server restart or fail-over without any downtime.
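To make the two-connection idea concrete, here is a deliberately naive sketch that picks one of the two local listeners per statement. The ports come from the pgsql_ro and pgsql_rw sections above; the function names, the "appdb" default database and the keyword heuristic are assumptions made for illustration only:

```shell
#!/bin/sh
# Ports of the two local HAProxy listeners defined above.
PG_RO_PORT=6432   # listen pgsql_ro
PG_RW_PORT=7432   # listen pgsql_rw

# Pick a listener by looking at the first keyword of the statement.
# Naive on purpose: a SELECT that calls a writing function, for example,
# would be routed to a replica and fail there.
pg_port_for() {
    first_word="$(printf '%s' "$1" | awk '{ print tolower($1); exit }')"
    case "$first_word" in
        select|show) echo "$PG_RO_PORT" ;;
        *)           echo "$PG_RW_PORT" ;;
    esac
}

# Run one statement against the matching listener.
# "appdb" is a hypothetical default database name.
pg_run() {
    psql -At -h 127.0.0.1 -p "$(pg_port_for "$1")" -d "${PGDATABASE:-appdb}" -c "$1"
}

# Usage: pg_run 'select count(*) from users'
```

In a real application the split usually lives in the database layer (two connection settings plus a router), and reads that must see a just-committed write may still need to go to the read-write connection because of replication lag.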

Last but not least, there is an interesting PgBouncer fork from AWS, which can be used to route queries according to custom rules (e.g. when you can’t change your app, but would like to split reads and writes).


The Club Sandwich + Double PostgreSQL essay is part of a series of "recipes" that explore ways of building reliable applications with (HA)Proxy.


Next in the series, Part 2: TLS Calzone + Wrap Crispy JWT — Authentication on the Edge