Design principles

The design principles of a reference architecture

This article aims to explain the theoretical design principles McKinsey applied when designing the reference architecture you’re about to test (first proposed in their talk at PlatformCon 2023).

If you’d like to see how the architectural components fit together, fast forward to Structure and integration points.

To start with the first tutorial from a developer’s perspective, head to Scaffolding a new Workload. For tutorials from a platform engineer’s perspective, head to Provisioning a Redis cluster.

Golden paths over cages

The reference architectures apply smart abstractions that lower cognitive load and drive standardization, and that are centered around the workload as the primary entity. A typical abstraction would say “my workload depends on a database of type postgres”. Abstractions are resolved against executable configurations in a modular and dynamic way. Modular, because each sub-entity (such as the database of type postgres that the workload depends on) is individually configured. Dynamic, because the executable configurations are created anew with every deployment, following the approach of Dynamic Configuration Management.
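As a rough illustration of "modular and dynamic" resolution, the Python sketch below resolves an abstract request ("a database of type postgres") into a concrete configuration at deployment time. Every name in it (ResourceRequest, Workload, DEFINITIONS, resolve, the driver and size values) is a hypothetical stand-in invented for this example, not part of the reference architectures or of any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    """An abstract dependency declared by a workload, e.g. 'a database of type postgres'."""
    name: str
    type: str


@dataclass(frozen=True)
class Workload:
    """The primary entity: a workload plus the resources it depends on."""
    name: str
    resources: tuple


# Hypothetical definitions maintained by the platform team.
# Each resource type is configured on its own (the "modular" part).
DEFINITIONS = {
    ("postgres", "development"): {"driver": "in-cluster-postgres", "size": "small"},
    ("postgres", "production"): {"driver": "managed-postgres", "size": "large"},
}


def resolve(workload: Workload, environment: str) -> dict:
    """Build the executable configuration for one deployment (the "dynamic" part)."""
    config = {"workload": workload.name, "environment": environment, "resources": {}}
    for request in workload.resources:
        key = (request.type, environment)
        if key not in DEFINITIONS:
            raise LookupError(f"no definition for {request.type!r} in {environment!r}")
        # Copy the definition so each deployment gets its own configuration.
        config["resources"][request.name] = dict(DEFINITIONS[key])
    return config


# The developer only states the dependency; the context decides how it is fulfilled.
api = Workload("api", (ResourceRequest("db", "postgres"),))
dev_config = resolve(api, "development")
prod_config = resolve(api, "production")
print(dev_config["resources"]["db"]["driver"])
print(prod_config["resources"]["db"]["driver"])
```

The same workload description yields a different executable configuration per environment, because resolution runs again for every deployment instead of being written down once by hand.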

The abstractions are layered so that users can, but are not obliged to, follow them. Using the abstraction, i.e. following the golden path, produces reliability, low cognitive load, and a high degree of security and standardization. Yet users are free to leave this path and alter the lower-level configurations where the security posture allows it. Abstraction never comes at the expense of context: it is always possible to see why and how configs were created. A platform should never be a black box.
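One way to picture a layered abstraction that keeps its context is a configuration object that records where each value came from and only accepts overrides when policy permits. The sketch below is a minimal illustration under that assumption. LayeredConfig, ResolvedValue, and the allow_overrides flag (standing in for "the security posture allows it") are hypothetical names, not an API of the reference architectures.

```python
from dataclasses import dataclass, field


@dataclass
class ResolvedValue:
    value: object
    source: str  # which layer produced the value, so the result is never a black box


@dataclass
class LayeredConfig:
    allow_overrides: bool  # hypothetical stand-in for the security posture
    values: dict = field(default_factory=dict)

    def apply_defaults(self, defaults: dict) -> None:
        """Golden path: platform defaults, applied without any developer input."""
        for key, value in defaults.items():
            self.values[key] = ResolvedValue(value, "golden-path default")

    def override(self, key: str, value: object, author: str) -> None:
        """Leaving the golden path: only possible where policy allows it."""
        if not self.allow_overrides:
            raise PermissionError(f"overriding {key!r} is not permitted here")
        self.values[key] = ResolvedValue(value, f"override by {author}")

    def explain(self) -> dict:
        """Show each value together with the layer that created it."""
        return {key: f"{rv.value} ({rv.source})" for key, rv in self.values.items()}


config = LayeredConfig(allow_overrides=True)
config.apply_defaults({"replicas": 2, "cpu": "250m"})
config.override("replicas", 5, author="alice")
explanation = config.explain()
print(explanation)

# In a locked-down context the same override is rejected.
locked = LayeredConfig(allow_overrides=False)
locked.apply_defaults({"replicas": 2})
try:
    locked.override("replicas", 5, author="alice")
    rejected = False
except PermissionError:
    rejected = True
print(rejected)
```

Keeping the source next to each value is the design choice that matters here: a developer who stays on the golden path and one who overrides a setting can both see why the final configuration looks the way it does.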

Platform as a Product

An Internal Developer Platform (IDP) is highly bespoke and depends on a number of technological and cultural factors specific to the organization. The design of these reference architectures provides a starting point but leaves the flexibility to adapt the design to your organization’s needs. By using this starting point, platform teams should be able to treat their platform as a product and apply product design principles.

Everything as code

From the configuration of the platform to the way app and infrastructure configurations are specified and stored, the design of the reference architectures is code-first at its core. This allows for disaster recovery, versioning, and structured product development principles. While the architectural design offers a choice of interfaces (UI, CLI, and API), the primary developer interaction should also be code-first. A developer should be able to perform 90% of all tasks from the IDE, without having to switch to another tool.

Brownfield compatibility

The modern enterprise is (a) brownfield and (b) multi-everything. The integration of the status quo should be fast, seamless, and not require major change effort. The design should assume a situation where the team is confronted with several CI systems, registries, IaC formats, and clouds.

80/20 platforming

Good platform design does not try to cover every technology on earth but instead defines standards for a subset very well. These reference architectures are optimized for workloads running in containers on Kubernetes. On the resource side (databases, file storage, and DNS), they cover anything that has a consumable API, wherever it’s hosted.

It’s important to remember that the reference architectures do not try to convince every developer in your user base, and you shouldn’t assume that you’ll be able to please 100% of your developers. Instead, achieving 80% is a great win. Your kernel hacker who wants to do everything in C++ should be able to work around the platform; they just won’t get the same SLAs from you as the platform team.

Next steps

Dive one level deeper into reference architectures by reading up on Structure and integration points.