Structural Correction
Treating data as a shared substrate destroys the autonomy that process ownership requires. Data contracts allow each unit to own its data while publishing a versioned specification of what it provides to the reporting layer.
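The mechanics of a data contract can be sketched in a few lines. The structure below is a minimal illustration under assumed conventions, not a standard; every name, field, and definition in it is hypothetical.

```python
from dataclasses import dataclass

# A minimal data contract: the owning unit publishes a versioned
# specification of what it provides, including the definition of every
# entity it counts. All names and fields here are illustrative.
@dataclass(frozen=True)
class DataContract:
    owner: str                # the unit that owns the data
    version: str              # consumers pin to a published version
    entity_definitions: dict  # what each counted thing means, in words
    fields: dict              # field name -> type and unit

motor_contract = DataContract(
    owner="motor",
    version="2.1.0",
    entity_definitions={
        "customer": "an individual policyholder with an active motor policy",
    },
    fields={"revenue_per_customer": "EUR, trailing twelve months"},
)

# The reporting layer reads the published definition rather than
# assuming one of its own.
print(motor_contract.entity_definitions["customer"])
```

The point is not the data structure but the publication: the definition of "customer" travels with the data, versioned, so a consumer that compares figures across contracts can see that the units differ before it compares them.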
A data analyst in the central reporting team at a Nordic insurer spent three weeks building the quarterly capital allocation report for the CFO. The report required revenue-per-customer figures across three business lines: motor, property, and commercial. He pulled the numbers from each division's data warehouse. They did not reconcile.
Motor counted a customer as a policyholder; Property counted a customer as a household; Commercial counted a customer as a legal entity with an active contract in the trailing twelve months. Under those definitions, a single corporate client with a fleet policy, a building policy, and a liability policy might appear as three customers in one view, one customer in another, and either one or zero in the third, depending entirely on its contract renewal date.
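A toy version of the discrepancy makes the arithmetic concrete. All records, identifiers, and rules below are invented for illustration:

```python
# One corporate client with three policies, as the group sees it.
# All records here are invented for illustration.
policies = [
    {"policy_id": "M-001", "legal_entity": "Acme GmbH", "household": "HQ-1"},
    {"policy_id": "P-001", "legal_entity": "Acme GmbH", "household": "HQ-1"},
    {"policy_id": "C-001", "legal_entity": "Acme GmbH", "household": "HQ-1"},
]
renewed_within_12m = False  # the commercial contract happens to have lapsed

# Motor's definition: one customer per policyholder record.
as_policyholders = len(policies)

# Property's definition: one customer per household.
as_households = len({p["household"] for p in policies})

# Commercial's definition: one customer per legal entity with an
# active contract in the trailing twelve months.
as_legal_entities = (
    len({p["legal_entity"] for p in policies}) if renewed_within_12m else 0
)

print(as_policyholders, as_households, as_legal_entities)  # → 3 1 0
```

The same client is three customers, one customer, and zero customers at once, and each count is correct under its own definition, which is why a revenue-per-customer figure cannot be compared across the three views.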
He spent four days in meetings with divisional data owners trying to agree on a common definition. They could not: each definition was entirely correct within its own process. He ultimately built a reconciliation layer in a spreadsheet, mapping between the three definitions with manual rules; the “assumptions” tab ran to forty rows.
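The workaround amounts to hand-written rules of the kind sketched below. The rule and names are invented for illustration; the real tab ran to forty rows of them:

```python
# A hand-maintained reconciliation rule of the sort that ends up in a
# spreadsheet "assumptions" tab. Everything here is invented.
def reconcile(divisional_counts):
    """Collapse three incompatible customer counts into one number by
    fiat: a legal entity is one customer if any division saw it."""
    return 1 if any(divisional_counts.values()) else 0

print(reconcile({"motor": 3, "property": 1, "commercial": 0}))  # → 1
```

Each such rule is a judgment call that lives outside every source system, which is exactly what makes the resulting report impossible to audit.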
The report shipped two weeks late. The CFO used it to allocate €12 million in discretionary capital across the three divisions. The numbers he compared were built on three incompatible definitions of the unit they measured. Nobody told him about the discrepancy, and nobody needed to: the report looked precise because it inherited the precision of the source systems, which agreed on the number of decimal places but not on what they were actually counting. The mechanism that produced that failure is the subject of this chapter. The board implication is simple: a shared data model is a shared point of failure.
This is what happens when data is treated as a shared substrate rather than a published product.
In large corporates, data is treated as a shared resource: a central team, a common model, a central repository. This feels disciplined. It is the fastest way to destroy the autonomy that the previous chapters described.
...