Platform lesson #8: Instrument your platform for data-driven decisions
William Edwards Deming, the American who helped Japan rebuild itself after World War II, famously said: “In God we trust; all others must bring data.” This is still a lesson most companies haven’t fully incorporated. Once a platform gains enough traction, the opportunity to make data-driven decisions presents itself. This matters because it allows for much higher-quality decision-making than is possible with opinions or qualitative data (what customers said). The challenge is that in many platforms, the architects never spent much time thinking about instrumenting the platform with data collection capabilities. As a consequence, the platform has limited, if any, data collection built in.
When there’s no data available, decisions are made without much evidence, based purely on the beliefs and earlier experiences of key decision-makers. Adding data collection afterwards rarely fixes this, as it’s often hard to collect the data post hoc. As a consequence, most companies that I work with are unable to answer basic questions concerning feature usage in their platform. How do you prioritize R&D resources if you don’t know whether the features you’ve already built are even used? And if you do know, do you then also know who’s using which features so you can do a proper segmentation of your customer base?
Moreover, when data gets collected, in many cases it’s the wrong data. Engineers tend to focus on basic quality data that allows them to verify that specific features actually work in testbeds and in the field, but this data doesn’t contain the right information for strategic decision-making about features, platform boundaries and customer segmentation.
The key lesson is that platforms need to be instrumented to facilitate data-driven decisions. This requires proactive and early thinking about the categories of questions we want answered about the platform and the data that should be collected to ensure that we can answer these questions. This data tends to be concerned with customer benefits and KPIs much more than with functional correctness.
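To make the difference with quality data concrete, here is a minimal sketch of usage data that can answer a question such as “which customer segments use which features?”. The `UsageEvent` schema, the segment names and the helper function are assumptions for illustration, not something prescribed by any particular platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One record of a customer using a feature (hypothetical schema)."""
    customer_id: str
    segment: str  # assumed segmentation label, e.g. "enterprise" or "smb"
    feature: str


def customers_per_feature_by_segment(events):
    """Count distinct customers per (feature, segment) pair."""
    seen = defaultdict(set)
    for event in events:
        seen[(event.feature, event.segment)].add(event.customer_id)
    return {key: len(customers) for key, customers in seen.items()}


# Illustrative events; a real platform would read these from its data store.
events = [
    UsageEvent("c1", "enterprise", "export"),
    UsageEvent("c1", "enterprise", "export"),  # repeat use, same customer
    UsageEvent("c2", "enterprise", "export"),
    UsageEvent("c3", "smb", "export"),
    UsageEvent("c3", "smb", "reporting"),
]
usage = customers_per_feature_by_segment(events)
```

Note that the event records who used what, not whether the feature worked. That is the information a prioritization or segmentation decision needs.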
The second lesson, however, is that it’s impossible to predict all the questions that could be asked about the platform. It should therefore be possible to easily extend the platform with new instrumentation when needed. This typically requires a “data fabric” layer in the architecture overlaying the functionality so that it’s easy to insert “probes” into different parts of the system for data collection. The challenge is, of course, that this typically requires some form of DevOps capability, so that software updates extending the data collection functionality can be pushed out to deployed systems.
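One way to picture such a data fabric is as a thin layer that functional code reports to at named probe points, with collectors attached separately. The sketch below is only an illustration under that assumption; the `DataFabric` class, the probe point name `report.exported` and the `export_report` function are all hypothetical.

```python
from collections import defaultdict


class DataFabric:
    """Hypothetical layer that routes probe data to attached collectors."""

    def __init__(self):
        self._sinks = defaultdict(list)

    def attach(self, probe_point, sink):
        """Register a collector (any callable) for a named probe point."""
        self._sinks[probe_point].append(sink)

    def emit(self, probe_point, **payload):
        """Called by functional code; does nothing if no collector is attached."""
        for sink in self._sinks.get(probe_point, ()):
            sink(probe_point, payload)


fabric = DataFabric()


def export_report(customer_id, fmt):
    """Functional code: it only names the probe point, not who listens."""
    fabric.emit("report.exported", customer_id=customer_id, fmt=fmt)
    return f"report.{fmt}"


collected = []
export_report("c0", "csv")  # no collector attached yet, nothing is recorded

# A new question comes up later: attach a collector without touching export_report.
fabric.attach("report.exported", lambda point, payload: collected.append((point, payload)))
export_report("c1", "pdf")
```

The design choice is that the functional code and the collectors only share the probe point name, so new collection can be added in one place. Getting that addition into systems already in the field still depends on being able to push updates.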
In systems where updating the software is prohibitively expensive, e.g., because of certification issues or the high cost of rolling out updates, one strategy that I’ve seen used is to create an approach where everything can be measured, but data collection is turned off by default. When certain data is required, collection can be turned on through configuration (changing a parameter setting). This avoids the need to update the software when new information needs arise but requires even more clairvoyance about the likely data needs.
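A minimal sketch of this “measure everything, collect nothing by default” strategy could look as follows. The `Instrumentation` class and the metric names are assumptions for illustration; in a real system the set of enabled metrics would come from whatever configuration mechanism the product already has.

```python
class Instrumentation:
    """Hypothetical collector: every metric is measurable, none is on by default."""

    def __init__(self, enabled=()):
        self._enabled = set(enabled)
        self.records = []

    def configure(self, enabled):
        """Replace the set of active metrics, e.g. after a parameter change."""
        self._enabled = set(enabled)

    def measure(self, metric, value):
        """Measurement calls are built in everywhere; only enabled ones are stored."""
        if metric in self._enabled:
            self.records.append((metric, value))


instr = Instrumentation()             # shipped configuration: all collection off
instr.measure("door.open_count", 1)   # ignored, metric not enabled

# Later, a parameter change turns on one metric without a software update.
instr.configure({"door.open_count"})
instr.measure("door.open_count", 2)   # now recorded
instr.measure("motor.temperature", 71.5)  # still off, ignored
```

The trade-off is visible in the code: only metrics that already have a `measure` call in the shipped software can ever be turned on.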
Taking things one step further, we can see the first signs in several industries that the physical products are commoditizing and the value is shifting towards data generated by them. This is a completely new viewpoint for many product people, but it only reinforces the need to think carefully about instrumentation, data collection and data aggregation. Imagine a situation where your product is nothing but a vehicle for the data collected through it. How would that change the way you architect it?
The initial focus with platforms tends to be on getting the functionality to a point where users will adopt the platform. Consequently, instrumentation tends not to be a priority, and we easily end up in a situation where many decisions need to be made based on opinions rather than data, which of course leads to lower-quality decision-making. So, remember Edwards Deming and require everyone to bring data.