Making Heroku Preboot Less Dumb

Heroku's Preboot feature is great at enabling seamless deployments, but it was lacking a smoke test... here's how we made it work harder for us.

Heroku’s Preboot feature is great. It enables seamless deployments with a single command:

heroku features:enable preboot

Without Preboot, every release would require downtime.

Preboot works by creating a new set of web dynos in the background for each release. After ~3 minutes, the router switches traffic to the new dynos. That 3-minute window gives your app servers enough time to boot and be ready to serve requests, resulting in zero-downtime deployments!

However, Preboot is pretty dumb

It does nothing to check the health of the new dynos. Once the 3-minute timer is up, requests are routed to your new dynos regardless of the state they’re in, even if they’ve crashed.

Ideally, a blue-green deployment system runs some kind of smoke test against the new release before switching traffic to it.

Solution: use the Release Phase to perform a smoke test

The release phase runs a command before each release is deployed, traditionally things like database migrations. Most importantly, if the release command fails (exits with a non-zero status), the deployment is halted. We can take advantage of that fact to perform a health check on every release.
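
To make the "a failed release command halts the deployment" idea concrete, here is a minimal sketch of a release script that runs several steps and stops at the first failure. The script path (bin/release), the function run_release_steps and the two step names are hypothetical placeholders, not taken from our setup. The only Heroku-specific part is that the release command is declared under the release process type in the Procfile.

```shell
#!/usr/bin/env bash
# Hypothetical release script, wired up in the Procfile with a line such as:
#   release: bin/release
# The step names below are placeholders for illustration only.

# Run each step in order and stop at the first one that exits non-zero.
run_release_steps() {
  local step
  for step in "$@"; do
    echo "[Release]: running ${step}"
    if ! "$step"; then
      echo "[Release]: ${step} failed, aborting release" >&2
      return 1
    fi
  done
  echo "[Release]: all steps passed"
}

# Placeholder steps; replace the bodies with real commands.
migrate_database() { true; }   # e.g. a database migration
smoke_test()       { true; }   # e.g. the health check described in this post

# The script's exit status is the status of this last command,
# which is what decides whether the release goes ahead.
run_release_steps migrate_database smoke_test
```

Because the script exits with the status of its last command, any step that fails causes the whole release command to fail.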

Here’s a snippet taken from our release command:

export PORT=3210
export HOST=127.0.0.1

# Boot a web server locally inside the release dyno
# This should be the same as the 'web' command in your Procfile
(bin/start-nginx bundle exec puma -C config/puma.rb) &
SERVER_PID=$!

# Wait for the server to boot and start listening for requests on the port
while ! nc -z -w 1 "$HOST" "$PORT"; do
  # Check whether the server process is still running or has crashed
  if ! kill -0 "$SERVER_PID" 2>/dev/null; then
    echo '[Health Check]: FAILED (server process exited during boot)'
    exit 1
  fi
  sleep 1
done

# Once the server is up, attempt a request to the `health_check` endpoint
# (-s hides curl's progress meter so it doesn't clutter the release log)
response=$(curl -s "http://$HOST:$PORT/health_check")

# Check to see if the response body is 'OKAY'
if [[ "$response" != "OKAY" ]]; then
  echo '[Health Check]: FAILED'
  echo "$response"

  # Clean up the local server
  kill "$SERVER_PID"
  exit 1
fi

echo '[Health Check]: PASSED'
kill $SERVER_PID
exit 0

As the snippet shows, we boot a server inside the release command and make a request to a ‘health_check’ endpoint. If the server doesn’t boot, or the response is anything other than the expected ‘OKAY’, the release command fails, preventing the deployment and saving our bacon.
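
One thing the wait loop in the snippet does not cover is a server that stays alive but never starts listening: the loop only exits when the port opens or the process dies. A possible refinement, sketched here under assumptions, is to put a deadline on the wait. The function name wait_for_boot, its return codes and the 60-second value in the usage comment are all hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait until a probe command succeeds, the server
# process dies, or a timeout elapses.
#
# Usage: wait_for_boot <pid> <timeout_seconds> <probe command...>
# Returns 0 if the probe succeeded, 1 if the process exited,
# 2 if the timeout was reached.
wait_for_boot() {
  local pid=$1 timeout=$2
  shift 2
  # SECONDS is a bash built-in counting seconds since the shell started.
  local deadline=$((SECONDS + timeout))

  until "$@" >/dev/null 2>&1; do
    # Give up early if the server process has crashed
    if ! kill -0 "$pid" 2>/dev/null; then
      echo "[Health Check]: server process exited during boot" >&2
      return 1
    fi
    # Give up if the server has not come up in time
    if (( SECONDS >= deadline )); then
      echo "[Health Check]: server did not start within ${timeout}s" >&2
      return 2
    fi
    sleep 1
  done
}

# Example call, replacing the while loop in the snippet above
# (60 is an assumed value, tune it to your app's boot time):
#   wait_for_boot "$SERVER_PID" 60 nc -z -w 1 "$HOST" "$PORT" || exit 1
```

Passing the probe as a command keeps the helper independent of nc, so the same function could wrap a curl call instead.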


Author

This post is written by Nick Maher, an engineer in the Childcare Development squad at Koru Kids.