Another recap of a week’s worth of links, news, and discussion around technical leadership and technology; as usual, follow me on LinkedIn if you want to receive a notification when I share a new link.
What I have been writing
How your delivery pipeline will become your next big legacy-code challenge got me thinking: is the code we develop in our CI/CD pipelines of the same quality as our application code? My experience tells me it’s not even close.
I can think of many excuses for that (“infrastructure is hard”, “lack of skills”, “need to get this done quickly as it’s blocking the delivery”, “tools are mostly broken anyway”, “still better than writing everything in bash”), but the truth is that there’s no excuse. Even worse: if, as I keep repeating like a broken record, many of the open-source/commercial tools we use in our CI/CD pipelines are heavy, overcomplicated, buggy and ill-designed, having spaghetti-coded pipelines will prevent us from migrating to better ones when they become available.
Unfortunately, there’s no standard allowing CI/CD portability (there is a CD Foundation, but I’m not sure of its practical goals yet); however, I think there are some general principles we can follow to make sure our pipelines are better designed and easily portable to a wide array of tools.
I’d like to talk about it in more detail in a future post, but these are some of the ideas I’ve come up with so far:
- the CI/CD orchestration tool (think: Jenkins, TeamCity, GoCD, Concourse, …) should deal exclusively with the orchestration of tasks (stages) — it’s in practice a simple workflow engine (let’s call it the pipeline orchestrator in the following)
- the actual implementation of these tasks (build, test, packaging, deployments, …) is not part of the pipeline orchestrator but is delegated to specific tools/technologies. These are external tools, not orchestrator plugins
- pipeline definitions are completely in code; the UI of the pipeline orchestrator is read-only. A seed job will populate the pipeline orchestrator with the pipeline definitions
- the setup of the pipeline orchestrator is also only in code
- passwords, secrets, and credentials needed by tasks are stored in external, specialized tools
- tasks are executed in containers by default
- results of task processing are stored externally to the pipeline orchestrator
- dumb pipelines, strong scripts: as the pipeline orchestrator is just a workflow engine, develop powerful and portable scripts (in Python or Go for example) to perform complex logic within tasks
- mandatory versioning: every single component of your CI/CD pipeline must be versioned and always referred to (and invoked) by name and version (pipeline orchestrator, pipeline code, tasks, scripts, secrets, configurations, …)
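To make the “dumb pipelines, strong scripts” idea concrete, here is a minimal sketch of what such a task script could look like in Python. Everything here is hypothetical (the script name, the `--image`/`--version` flags, the use of `docker build`): the point is that all inputs are passed explicitly on the command line, so any orchestrator can invoke the script the same way and the mandatory-versioning rule is enforced at the interface.

```python
#!/usr/bin/env python3
"""Hypothetical 'strong script' for a build task.

All the logic lives here, not in the orchestrator: Jenkins, GoCD,
Concourse, etc. only have to run this script with explicit arguments.
"""
import argparse
import subprocess
import sys


def build(image: str, version: str, dry_run: bool = False) -> list[str]:
    """Build a container image tagged with an explicit version.

    The orchestrator passes every input explicitly; nothing is read from
    orchestrator-specific globals, which keeps the script portable.
    """
    cmd = ["docker", "build", "-t", f"{image}:{version}", "."]
    if dry_run:
        # Return the command without running it, so the logic is testable
        # outside of any CI environment.
        return cmd
    subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__" and len(sys.argv) > 1:
    parser = argparse.ArgumentParser(description="Build a versioned image")
    parser.add_argument("--image", required=True)
    # Mandatory versioning: the caller must say exactly which version it wants.
    parser.add_argument("--version", required=True)
    args = parser.parse_args()
    build(args.image, args.version)
```

With this split, the orchestrator’s stage definition shrinks to a one-liner such as `./build_task.py --image myapp --version 1.2.3`, and migrating to a different orchestrator means rewriting only that one-liner, not the logic.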
As I’m writing this, what emerges is the overarching goal of defining precise boundaries and limited responsibilities for each part of the system, a sort of Unix philosophy for CI/CD. With that in place, changing part of it, or the whole orchestrator, becomes a limited and less scary task.
What do you think?
Topics I talked about this week
I started the week with a contrarian article on the monoliths vs microservices debate: 3 Reasons to build Monolithic Systems, which comes with an even bigger list of similar articles cited in the comments. I’m not convinced: I wouldn’t want a monolith for anything other than a throw-away project. I might compromise for something like the microliths Starling Bank has mentioned in recent presentations.
Another controversy: What’s all the buzz about the Monorepo? According to the article, its advantages are
- Helps enforce standards across projects
- Decreases the effort of maintaining shared libraries
- Easier integrated end-to-end testing
I’m having none of that. I can only see how hard and hacky your versioning and dependency management will become, for no real advantage. Am I missing anything?
Three more articles on serverless. The best one is the Journey to Serverless at Comic Relief: a very detailed story of their migration from PHP to serverless, with tons of links to other articles and tools used. Efficient APIs with GraphQL and Serverless talks about combining serverless with GraphQL, while Mitigating serverless lock-in fears makes some interesting points about the fear of lock-in, and how it should always be weighed against the productivity and time-to-market gains given by vendor tools.
Scaling Engineering Teams via Writing Things Down and Sharing — aka RFCs: another article about adopting RFCs (Request for Comments) as a company tool. I’ve been asked whether RFCs are agile tools or the opposite of it; an interesting question! Like any tool, I suppose they can be used in agile contexts or as part of super-bureaucratic processes; arguably, some of the original RFCs have been used while designing-by-committee specifications, but that doesn’t need to be the case. In my mind, and in the way I apply them, they are a very lean way to propose new ideas and discuss them in an asynchronous way, which suits the distributed nature of modern organizations.
I love technology history articles, and I loved What Really Happened with Vista: An Insider’s Retrospective. Organizational dysfunctions, politics, lack of continuous integration make their appearance as the usual suspects.