Here’s something past me used to think: I write Java code, I work in a single language, life is easy and forever great! Later Java was replaced by Scala, and a plethora of other languages were added through small side projects that came with the inevitable consulting tasks.
Yet what took a good while to hit me after leaving school is that my education had focused strongly on programming languages but hardly at all on the tooling around them. Especially lacking was anything regarding the build system.[1]
Thus, my first year in the workforce involved following absolute antipatterns: copy-pasting pieces of pom.xml until Maven would yield what I wanted. This is obviously frowned upon for proper software, but somehow widely accepted when it concerns the mere tooling around software.
The present perspective is informed by the six years that followed.
Build Systems As First Class Languages
Completely ignoring the details of your build system will likely leave you with a batch of dark debt[2] on your hands. For, in our world, every software project is getting older and forever growing, slowly spanning dozens of different languages and technologies.
About Build Systems
An observation:
- Build systems are programs that read some input and write some output.
- Programming consists of writing programs that may read something and output something.
- Ergo, configuring a build system is hardly different from programming in general: the result is a program that reads and outputs stuff.
This point of view might be slightly exaggerated, but in terms of impact on the complexity landscape of an organisation, it provides us with a good upper bound. I strongly believe that a build system deserves to be considered a beast of its own, in a similar class as any programming language, at least when we consider the number of things that may go wrong.
When you adopt a new language, you also need to adopt the tooling around it: how to compile, test and package it, how to manage dependencies, etc. These come at an additional cost.[3]
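To make the observation concrete, here is a minimal sketch of a "build system" written as an ordinary program. The target names and the `BUILD_GRAPH` structure are hypothetical and only serve the illustration; no real tool works exactly like this. It reads a description of targets and their dependencies, and outputs the steps in an order that builds dependencies first.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical build description: each target maps to the targets it depends on.
BUILD_GRAPH = {
    "app.jar": {"app.classes", "lib.jar"},
    "app.classes": {"lib.jar"},
    "lib.jar": {"lib.classes"},
    "lib.classes": set(),
}


def build_order(graph):
    """Return the targets in an order that puts dependencies first."""
    return list(TopologicalSorter(graph).static_order())


def run_build(graph):
    """'Build' every target by recording a step; a real tool would invoke a compiler here."""
    return [f"build {target}" for target in build_order(graph)]


if __name__ == "__main__":
    for step in run_build(BUILD_GRAPH):
        print(step)
```

Even this toy version has input, output, and room for bugs (a cycle in the graph, a misspelled target), which is the sense in which configuring a build resembles programming.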
This may sound like overthinking things for small organizations, but (very quickly) makes sense for larger ones, as they are very likely to be using more than one programming language, which is where the curse of cardinality quickly comes into play.
An Example
Let’s take as an example a company that’s been around for one or two dozen years. Its line of business requires it to write some C/C++, it relies on some Perl for various scripting tasks, customer-facing APIs are written in Java, the UI relies on some JavaScript and is being migrated to TypeScript, and recent experiments involving data science motivated its engineers to try some Python and some Scala. All in all, a plausible scenario that leaves us with eight different languages:
Language | Build System |
---|---|
Java | Maven |
C/C++ | CMake |
Perl | built in |
JavaScript | webpack |
Scala | sbt |
TypeScript | Babel |
Python | built in |
Now, eight languages is already a lot and generates some challenges: not only does the organization require that someone has at least a modicum of experience in each of them, it also needs the know-how to operate the respective build systems.
Considering the build tooling as a separate language really drives home the point that your tooling cardinality is exploding, as you’re actually dealing with fifteen very different beasts here!
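The figure of fifteen can be reproduced from the table with a few lines of Python. This is only a sketch of the arithmetic, and it rests on two assumptions of mine: C and C++ count as two languages sharing one build tool, and each "built in" toolchain counts as its own tool, since it still has to be learned per language.

```python
# Rows of the table above as (languages, build tool) pairs.
# The "built in" entries are named per language (an assumption for this count).
TOOLCHAINS = [
    (("Java",), "Maven"),
    (("C", "C++"), "cmake"),
    (("Perl",), "Perl built-in"),
    (("JavaScript",), "webpack"),
    (("Scala",), "sbt"),
    (("TypeScript",), "Babel"),
    (("Python",), "Python built-in"),
]


def count_beasts(toolchains):
    """Return (number of distinct languages, number of distinct build tools)."""
    languages = {lang for langs, _ in toolchains for lang in langs}
    build_tools = {tool for _, tool in toolchains}
    return len(languages), len(build_tools)


languages, tools = count_beasts(TOOLCHAINS)
print(languages, tools, languages + tools)  # prints: 8 7 15
```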
This has an impact when a Java developer needs to touch a Scala project, for example. They’ll additionally need to adapt to a new build system. The same goes for any other language combination.[4]
Bazel to the rescue
As we’ve seen above, under the classic policy of adopting one language together with its entire default toolchain, the overall number of “languages” grows as about 2n, where n is the number of programming languages. Linear isn’t too bad a complexity, but given that developers are, all in all, pretty expensive (and have limited mental space themselves), it would be nice if this could be smaller.
That’s where Bazel can, pretty simply, turn this 2n into an n + 1: in the table above, it’s possible to replace the entire right column with Bazel.
Granted, this is no easy task, and some specific knowledge about how languages are built is still necessary. Yet, it provides you with a common way of thinking about your entire build process, and this is extremely powerful when you want to scale up your development process.
This concludes the introduction to the notion of the curse of tooling cardinality and how Bazel can help address it.
May your builds be swift and correct!
1. For the purpose of this post, a build system is anything that transforms your code into something useful: that generally means compiling it to produce a binary and possibly testing it, while also somehow providing the required dependencies.
2. I discovered the term dark debt in this Bazel Conf talk by Charles Essien. As opposed to tech debt, a form of known unknown in your project, dark debt corresponds to the deeper architectural flaws you are probably unaware of.
3. Let’s illustrate with an example: “we just use Java” in your company really means “we build our Java and handle dependencies with Maven”. You may not care at all about the intricacies of the build system, but someone in your org (hopefully) does.
4. You may think “but that should not happen anyway: why would a Java dev start writing some C code?” Obviously this should not happen too often, but making sure it is possible can yield huge benefits. More on that another day, when I find some time to write about organizational silos.