Automate the boring stuff

This post is about my recent experience building some developer tooling, and how it has actually saved me some precious time!

To set the scene, it’s 4PM on a Tuesday. In sixty minutes I will sit down with our entire Customer Success team and walk them through our internal changelog for the last week.

We like this session. I maintain visibility over everything my team gets done, and the extremely busy CS team stays abreast of everything changing in our product. It’s also a great forum for scrutiny, questions and new ideas.

Like the CS team, though, I’m pretty time poor myself. That sixty-minute window is the one small gap in my week where I go through all of the changes we’ve deployed, read the pull requests to make sure I understand exactly what they are, judge whether they actually belong in our internal changelog (does anybody in the business really care that we bumped the Rubocop gem last Thursday…?), and then add those entries in a consistent format so our internal teams can follow along with regular product updates.

It generally takes the full sixty minutes, it still feels rushed, it’s very manual, and I frankly dread doing it each week. There must be a better way!

After sharing my frustration about this process with my CTO, catching a candid and wonderfully puzzled expression from a recent new joiner (you mean this isn’t automated?!), and riding the current wave of giddy excitement about AI, I decided it was time to solve this little issue and get some time back.

Ok, so what do I actually want to achieve here? Automate the creation of internal changelog entries when we deploy to production.

For starters, I need a way of receiving updates from Cloud66 (our PaaS) after a deployment occurs. I then need to be able to use that update to determine what exactly has been deployed. Finally, I need to use that information to create changelog entries in Fibery (our project management platform, which is also the data source for our internal changelog views).

All of this I was able to automate using Cloud66 deploy hooks, GitHub Actions, and a couple of reasonably small custom scripts.

Cloud66 offers us lifecycle “hook points” (after_checkout, after_bundle, last_thing, etc.), and we can attach bash scripts to them.

When a deployment starts, Cloud66 walks through these hook points in order, checks a deploy_hooks.yml file in our codebase for any scripts that have been registered, and then runs them. In our case I’ve made use of the last_thing hook point and defined a post_deployment.sh bash script, which Cloud66 runs in a virtual environment.
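For a flavour of what that registration looks like, here’s a sketch (the exact fields are per Cloud66’s docs, and the paths here are illustrative rather than our real config):

```yaml
# .cloud66/deploy_hooks.yml (sketch)
production:
  last_thing:
    - source: /.cloud66/post_deployment.sh   # script checked into our repo
      destination: /tmp/post_deployment.sh   # where Cloud66 copies it to
      target: rails
      execute: true
      run_on: single_server                  # don't fire once per server
```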

This post_deployment.sh script, for all intents and purposes, builds a payload containing all the information about the deployment (git SHA, commit message, deployer, etc.) and then makes a POST request to a GitHub API endpoint, which in turn triggers a GitHub Actions workflow.
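Stripped right down, the script looks something like this (the repo slug is a placeholder, and I’m assuming a GITHUB_TOKEN is available to the hook’s environment):

```bash
#!/usr/bin/env bash
# post_deployment.sh (sketch) - assumes the hook runs with the deployed
# checkout on disk and a GITHUB_TOKEN in the environment
set -euo pipefail

SHA=$(git rev-parse HEAD)
MESSAGE=$(git log -1 --pretty=%s)  # naive JSON quoting below; fine for a sketch

# Fire the repository_dispatch event that our GitHub Actions workflow listens for
curl -sS -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer ${GITHUB_TOKEN}" \
  "https://api.github.com/repos/our-org/our-repo/dispatches" \
  --data "{\"event_type\": \"post_deployment_production\", \"client_payload\": {\"sha\": \"${SHA}\", \"message\": \"${MESSAGE}\"}}"
```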

To briefly contextualise this bit: in our codebase, we have workflow files sitting in .github/workflows/. These are just YAML files that tell GitHub “when X happens, do Y”.

One of those files essentially says “when a post_deployment_production event is received by a specific GitHub endpoint, run these steps”. So when Cloud66 makes that POST request, GitHub receives it and runs whatever steps that file defines.

That file, for us, is called deployment_hooks.yml, and the steps defined therein instruct GitHub to spin up a fresh Ubuntu virtual machine (existing only for the duration of the workflow run), install some dependencies, and run a script called postDeploymentChangelog.ts with the Cloud66 payload passed in.
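Condensed, that workflow looks something like this (action versions and the script path are illustrative):

```yaml
# .github/workflows/deployment_hooks.yml (sketch)
name: Post-deployment changelog

on:
  repository_dispatch:
    types: [post_deployment_production]

jobs:
  changelog:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Build the changelog
        run: npx ts-node scripts/postDeploymentChangelog.ts
        env:
          DEPLOY_PAYLOAD: ${{ toJson(github.event.client_payload) }}
```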

It’s in this postDeploymentChangelog.ts file that I’ve written what you might describe as the “application logic” for this little workflow. This file is responsible for ingesting the Cloud66 payload, working out exactly what work was deployed, and creating changelog entries in Fibery.

It’s really just any old script. It starts by connecting to the GitHub API using the octokit library. Using that same library, I extract all of the pull requests that were part of this master branch deploy, keyed off the git SHA in the Cloud66 payload.

Specifically, that SHA lets us walk backwards through the commit history using the GitHub API. The code is essentially doing the following:
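In sketch form (the owner/repo names are placeholders, and exactly how the script knows the previously deployed SHA to stop at is glossed over here):

```typescript
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
const owner = "our-org"; // placeholder
const repo = "our-repo"; // placeholder

async function deployedPullRequests(deployedSha: string, previousSha: string) {
  // listCommits walks the history backwards, starting from the given SHA
  const { data: commits } = await octokit.rest.repos.listCommits({
    owner,
    repo,
    sha: deployedSha,
    per_page: 100,
  });

  const prs = new Map<number, { title: string; body: string }>();
  for (const commit of commits) {
    if (commit.sha === previousSha) break; // reached the previous deploy

    // Ask GitHub which merged PR(s) this commit belongs to
    const { data: associated } =
      await octokit.rest.repos.listPullRequestsAssociatedWithCommit({
        owner,
        repo,
        commit_sha: commit.sha,
      });
    for (const pr of associated) {
      prs.set(pr.number, { title: pr.title, body: pr.body ?? "" });
    }
  }
  return [...prs.values()];
}
```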

At the end of all this, we now have all PRs that were deployed.

I technically have enough information at this point to start creating changelog items in Fibery using their API - each of the PRs will have a name.

But hold fire. Our various engineering teams use slightly different naming conventions for their PRs (some use branch names, some use ticket IDs, and some use good ol’ natural language). Remember, I’m trying to save myself time here, and the last thing I want is for consumers of the changelog (myself included) to be presented with some cryptic title and have to follow up to ask what it means. What can we do?!

Luckily for me, there are these marvellous Large Language Models available en masse today, and summarising changes using PR titles and descriptions seems like the perfect task for them.

Accordingly, back in my postDeploymentChangelog.ts script, I connect to the OpenAI API using the openai library and task an LLM with generating changelog entries by summarising the title and description text I extract from each PR via octokit.
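The core of that call is straightforward; here’s a sketch (the model choice and prompt are illustrative, the structured JSON response is the important bit):

```typescript
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Turn a PR's raw title/description into a consistent changelog entry
async function summarisePullRequest(title: string, body: string) {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // illustrative choice
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          'You write internal changelog entries. Reply with JSON: ' +
          '{"title": string, "description": string, "category": "chore" | "bug" | "feature"}',
      },
      { role: "user", content: `PR title: ${title}\n\nPR description:\n${body}` },
    ],
  });

  return JSON.parse(completion.choices[0].message.content ?? "{}");
}
```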

The result is a structured and consistent collection of data that I can then pass on to the final stage of this script: using Fibery’s API to build my changelog.

There really isn’t much to this final part. I’ve got all the information I need to programmatically create changelog items in Fibery: a title, a description, and a category type (chore, bug or feature). The script simply loops over my extracted PR data, builds a changelog item, and pushes that into Fibery.
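As a sketch (Fibery’s commands API takes a batch of commands; the workspace URL, type and field names below are hypothetical stand-ins for ours):

```typescript
type ChangelogEntry = {
  title: string;
  description: string;
  category: "chore" | "bug" | "feature";
};

// Create one changelog item in Fibery via its commands API
async function createChangelogItem(entry: ChangelogEntry) {
  const response = await fetch("https://our-workspace.fibery.io/api/commands", {
    method: "POST",
    headers: {
      Authorization: `Token ${process.env.FIBERY_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify([
      {
        command: "fibery.entity/create",
        args: {
          type: "Product/Changelog Item", // hypothetical Fibery type
          entity: {
            "Product/Name": entry.title,
            "Product/Description": entry.description,
            "Product/Category": entry.category,
          },
        },
      },
    ]),
  });
  if (!response.ok) throw new Error(`Fibery create failed: ${response.status}`);
}
```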

I’m excited about this entire workflow because it all happens behind the scenes, whenever we deploy.

I’ve actually configured the creation of Fibery changelog items in an ‘unpublished’ state. This is so that I can spend 5–10 minutes ahead of my weekly CS catch-up quickly reviewing the last week, publishing only those entries that really matter to the rest of the business. I know I can skip past items that are, for example, obviously dependency updates, and publish only those bugfixes and features that are of interest and importance to all our internal stakeholders.

Saving the best part of an hour each week certainly shouldn’t be overlooked, and building this tool has given me renewed drive to automate the boring stuff and question what else we as a team might have just accepted as ‘the way things are right now’.

Thanks for reading. If you’ve got your own tedious process begging to be automated, I hope this gave you some ideas, or at least the push to start looking!