Security And Type Safety: Abusing Your Compiler Is A Good Thing To Do.

2021/10/12

Properly securing services is no trivial task in the modern and glorious adventure that is the daily life of a backend developer. An aspect that is often forgotten in this area is the benefit a strong type system can provide.

As a reminder, a type system, coupled with a compiler, essentially lets you make (almost) mindless changes to your codebase and tells you when you’re doing something wrong.

Slight exaggeration, yes, but sufficiently true. The question is: can we benefit from this in matters related to security?

Typing Everything

One important aspect of security revolves around authorization. Simply put: is this requester allowed to do whatever they are doing? That’s usually not too hard to figure out: a customer’s users are only allowed to see stuff from that customer, only admins can edit things, etc.

However, the life of a developer is not entirely focused on security: there are deadlines to meet and products to ship, so sometimes they forget about the checks, do them improperly or maybe even forward the wrong query parameters to the downstream logic – aka, introduce bugs!

That’s where properly leveraging the type system can make certain kinds of errors a practical impossibility.

The Numerical ID

It’s not uncommon to identify users with a number, say a long. Because we’re in the enterprise world, the user probably belongs to a customer (also identified with a long) and is in a certain role that might be identified with a number, too.

So, for any kind of service it’s likely that, internally, it’s going to call some functions or another service where these identifiers will be used.

For example, if a user’s profile needs to be looked up, a function getUserProfile(userId: Long) might be called.

Now, obviously not everyone should be able to look at any user profile: by default you should only be able to see your own profile, and an organisation’s administrator might be able to see every profile in their own organisation, but not more.

Assuming we have a service that returns user profiles, we may have some logic somewhere doing the following:

// lookupCustomer, isAdmin, lookupProfileFromDb and Response are assumed to be defined elsewhere
def serveUserProfileQuery(
    requesterUserId:     Long,
    requesterCustomerId: Long,
    requesterRoleId:     Long,
    requestedUserId:     Long
  ): Response = {

  // Check that the requested profile belongs to the same customer as the caller
  if (lookupCustomer(requestedUserId) != requesterCustomerId) {
    return Response(404)
  }

  // Same customer: are we dealing with an admin,
  // or is the caller looking at their own profile?
  if (!(isAdmin(requesterRoleId) || requesterUserId == requestedUserId)) {
    // If not, no can do
    return Response(404)
  }

  lookupProfileFromDb(requestedUserId)
}

The above is pretty simple, and we can already see that there are some bugs that would not be caught by the compiler: for example, if any of the arguments to serveUserProfileQuery are erroneously mixed up, the service would obviously return wrong results, but the compiler would be none the wiser. Furthermore, isAdmin(requesterUserId) can even look correct to a human reader, while isAdmin(requesterCustomerId) compiles even if it looks strange.
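To make the problem concrete, here is a small self-contained sketch. It is a simplified stand-in for the check above, not the service itself: the customer check is left out, and the rule that role 1 means "admin", the mayView helper and the sample identifiers are all made up for illustration.

```scala
object RawIds {
  // Hypothetical rule for illustration: role 1 is the administrator role.
  def isAdmin(roleId: Long): Boolean = roleId == 1L

  // Simplified check: admins and the owner of the profile may look at it.
  def mayView(requesterUserId: Long, requesterRoleId: Long, requestedUserId: Long): Boolean =
    isAdmin(requesterRoleId) || requesterUserId == requestedUserId
}

object RawIdsDemo {
  import RawIds._

  val aliceUserId = 1L  // Alice is a regular user whose user id happens to be 1...
  val aliceRoleId = 7L  // ...and whose role is not the admin role
  val bobUserId   = 42L

  // Correct argument order: Alice may not view Bob's profile.
  val correct = mayView(aliceUserId, aliceRoleId, bobUserId)

  // First two arguments swapped: this still compiles,
  // and Alice's user id is now interpreted as the admin role.
  val swapped = mayView(aliceRoleId, aliceUserId, bobUserId)
}
```

With everything typed as Long, both calls are equally valid as far as the compiler is concerned; only the second one hands out access it should not.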

That’s where a first benefit of type maximalism pops up: if there is the slightest possibility of confusion, such as mixing up arguments, make the confusion impossible by adding more specialised types! In our case, we could define a dedicated type for each kind of identifier: UserId, CustomerId and RoleId.

Each of them is a glorified long, but now you cannot mix them up anymore, or at least you cannot do so while having a build that passes.
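As a minimal sketch of what such definitions could look like, assuming Scala value classes (other languages offer similar wrappers), with the type names taken from the signature used later in this post. The isAdmin rule and the sample values are hypothetical.

```scala
object Ids {
  // Distinct types at compile time, each wrapping a plain Long.
  final case class UserId(value: Long) extends AnyVal
  final case class CustomerId(value: Long) extends AnyVal
  final case class RoleId(value: Long) extends AnyVal
}

object TypedIdsDemo {
  import Ids._

  // Hypothetical rule for illustration: role 1 is the administrator role.
  def isAdmin(roleId: RoleId): Boolean = roleId.value == 1L

  val userId = UserId(1L)
  val roleId = RoleId(7L)

  // Compiles: a RoleId is expected and a RoleId is provided.
  val ok = isAdmin(roleId)

  // Does not compile, because a UserId is not a RoleId:
  // isAdmin(userId)
}
```

The wrapped number is still reachable through .value when it is really needed, for example when talking to the database, but it has to be asked for explicitly.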

I cannot stress enough how well this approach works: it makes an entire class of bugs much harder to run into. But it’s only a beginning.

Limiting Instantiation

Now, you may say: “How does this help me? I still need to create an instance of said type and have ample occasion to make mistakes there as well.” Indeed, what prevents you from doing new CustomerId(userId) and getting this reviewed by a tired intern?

Here comes the second half of the trick: said instantiation, and generally all other work that involves extracting this kind of metadata from a query, should be moved somewhere else.

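This way it can be tested and maintained in one place, and it becomes the only code that is allowed to create these types.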

That is: you could make the constructors for all identifier types private or protected in a way that will only let them be called from the well-tested and well-maintained query processing library, while said library will make only the richer types available to downstream consumers. [1]
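A rough sketch of the idea, assuming Scala’s qualified private modifier. The security object, the QueryProcessing helper and the shape of the claims map are all hypothetical and only stand in for whatever your query processing library looks like.

```scala
object security {
  // The constructors are private to `security`: outside code cannot call `new UserId(...)`.
  final class UserId     private[security] (val value: Long)
  final class CustomerId private[security] (val value: Long)

  // Hypothetical stand-in for the query processing library,
  // the only place where identifiers get created.
  object QueryProcessing {
    // `claims` is assumed to be already authenticated key/value data from the query.
    def userId(claims: Map[String, String]): Option[UserId] =
      claims.get("user").flatMap(_.toLongOption).map(new UserId(_))

    def customerId(claims: Map[String, String]): Option[CustomerId] =
      claims.get("customer").flatMap(_.toLongOption).map(new CustomerId(_))
  }
}

object Downstream {
  import security._

  val claims = Map("user" -> "42", "customer" -> "7")

  // Downstream code only ever receives the richer types.
  val user: Option[UserId] = QueryProcessing.userId(claims)

  // Does not compile here, because the constructor is not accessible:
  // new CustomerId(42L)
}
```

The new CustomerId(userId) mistake from above can now only be made inside security, which is exactly the part that is supposed to get the careful reviews.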

With this idea, the example above becomes:

def serveUserProfileQuery(
    requesterUserId:     UserId,
    requesterCustomerId: CustomerId,
    requesterRoleId:     RoleId,
    requestedUserId:     UserId
  ): Response = {
  ... 
}

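There’s still an opportunity for a mix-up, obviously: requesterUserId and requestedUserId are both of type UserId and can be swapped without the compiler noticing. This can be solved by thinking a little bit more about how to structure and separate the different types.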
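One possible direction, given here only as a sketch and not as the definitive answer: bundle everything known about the caller into a single type, so that only one bare UserId remains in the signature. The Requester type, the mayView helper, the customerOf lookup and the admin rule are all assumptions made for this example.

```scala
object Structured {
  final case class UserId(value: Long) extends AnyVal
  final case class CustomerId(value: Long) extends AnyVal
  final case class RoleId(value: Long) extends AnyVal

  // Everything known about the caller travels together.
  final case class Requester(userId: UserId, customerId: CustomerId, roleId: RoleId)

  // Hypothetical rule for illustration: role 1 is the administrator role.
  def isAdmin(roleId: RoleId): Boolean = roleId.value == 1L

  // `customerOf` stands in for the customer lookup of the original example.
  def mayView(requester: Requester, requested: UserId, customerOf: UserId => CustomerId): Boolean =
    customerOf(requested) == requester.customerId &&
      (isAdmin(requester.roleId) || requester.userId == requested)
}

object StructuredDemo {
  import Structured._

  // Toy lookup: every user belongs to customer 7.
  val customerOf: UserId => CustomerId = _ => CustomerId(7L)

  val alice = Requester(UserId(1L), CustomerId(7L), RoleId(2L))

  val ownProfile   = mayView(alice, UserId(1L), customerOf)   // Alice looks at herself
  val otherProfile = mayView(alice, UserId(42L), customerOf)  // Alice looks at someone else

  // Does not compile, because a UserId is not a Requester and vice versa:
  // mayView(UserId(42L), alice, customerOf)
}
```

Swapping the caller and the requested user is now a type error instead of a silent bug.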

Conclusion

Security does not end with a few types of bugs being weeded out, clearly, but this approach illustrates how you can push certain things onto your compiler: I’ve been rather over-typing most of my (production) code for years and have not once regretted it.

More generally, each time a bug is encountered, whether or not it has security implications, asking the following question can very often yield valuable insights:

What can we change in our type or code structure to make this problem less likely to appear or even impossible?

Sometimes it’s just a matter of adding a few trivial types.


  1. If, for some reason, restricting access is not an option, you can still give your methods hard-to-mistake names, e.g. dontUseMeOutsideOfTheSecurityPackageCreateUserId(...) or imSureThisIsAProperCustomerId(...). Hopefully they will ring a bell when they are used in contexts where they should not…