Why data?

If you are preparing to drive to or from work, are doing the dishes, or have some spare moments, listen to the Farnam Street podcast with Atul Gawande. That interview holds incredible insights into complex systems and people. There is one quote in particular that I want to unpack.

When we all have a piece of care or a piece of a problem, very often none of us can actually see what the outcome is and the owner can’t see the function of the system. So then you start finding things like data really matter.

―Atul Gawande

There is plenty of buzz in the technical community about the value of data. Data delivers business value. It is the key to your customers' innermost desires. Right?

If you choose to adopt that viewpoint, you are destined to create increasingly pervasive methods of extraction. Methods that lead to morally confusing twists.

So if we do not yearn for data to lift the curtain on human desire, why are people so obsessed with it? Data is the feedback from your systems. Data is our best representation of system behavior as we know it. Data provides insight into the response of the system to a given stimulus.

We devise experiments to provide an alternative experience. We introduce metrics intended to quantify some improvable attribute. We optimize our systems and thought processes on those metrics. We introduce an alternate stimulus and measure the response. This is the basis of human understanding as introduced by the scientific method. Empirical observations paired with objective measurements stem from there.

In a digital world, everything is make-believe. There is no such thing as an objective measurement.

Think about this. Every single piece of data collected in the digital universe was encoded in software by an engineer. On a technical level, each of those engineers is performing a similar task: store some information at a certain point in time. What varies widely is what information is collected and how. There is no basis in physical reality to determine which things are relevant to capture or how to properly detect them. Each supposedly objective measurement is inherently biased by the implementer.
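To make that concrete, here is a minimal sketch of two engineers measuring "time on site" from the same visit. Everything in it is made up for illustration: the events, the function names, and the five-minute idle cutoff are all assumptions, not anyone's real tracking code.

```python
from datetime import datetime, timedelta

# Hypothetical raw events for one visit: (timestamp, event name).
T0 = datetime(2024, 1, 1, 12, 0, 0)
EVENTS = [
    (T0, "page_load"),
    (T0 + timedelta(seconds=20), "click"),
    (T0 + timedelta(minutes=10), "click"),
    (T0 + timedelta(minutes=11), "page_close"),
]


def time_on_site_a(events):
    """Engineer A: last event timestamp minus first event timestamp."""
    return (events[-1][0] - events[0][0]).total_seconds()


def time_on_site_b(events, idle_cutoff=timedelta(minutes=5)):
    """Engineer B: sum the gaps between events, skipping gaps longer than idle_cutoff."""
    total = timedelta()
    for (prev, _), (curr, _) in zip(events, events[1:]):
        gap = curr - prev
        if gap <= idle_cutoff:  # assumption: a longer gap means the person walked away
            total += gap
    return total.total_seconds()


print(time_on_site_a(EVENTS))  # 660.0 seconds
print(time_on_site_b(EVENTS))  # 80.0 seconds
```

Same person, same visit, two defensible numbers that differ by a factor of eight. Neither is the "true" measurement. Each one is a decision somebody made.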

Your system includes people. People are infinitely complex systems. We try and we try to imprint their personalities and stories onto their digital representation. We expand our scope of collection to cover more facets of their identities. Intrusive extraction and digital surveillance become the norm.

If you are responsible for any aspect of personal data at your organization, you can do better. Understand that data is not a silver bullet. More is not better. Define your metrics and targets before collecting. Avoid the temptation to collect additional data for incremental improvements. Look for ways to collect less data and achieve the same outcomes.
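One way this could look in practice, as a rough sketch rather than a prescription: write the metrics and targets down first, derive the list of fields they need, and drop everything else before it is stored. The metric names, targets, and field names below are hypothetical.

```python
# Hypothetical metric definitions, written down before any collection starts.
METRICS = {
    "checkout_completion_rate": {"target": 0.65, "fields": {"session_id", "step"}},
    "median_page_load_ms": {"target": 1500, "fields": {"page", "load_ms"}},
}

# The only fields any declared metric needs; everything else is dropped.
ALLOWED_FIELDS = set().union(*(m["fields"] for m in METRICS.values()))


def minimize(event: dict) -> dict:
    """Keep only the fields that a declared metric actually uses."""
    return {key: value for key, value in event.items() if key in ALLOWED_FIELDS}


raw_event = {
    "session_id": "abc",
    "step": "payment",
    "load_ms": 900,
    "email": "a@example.com",  # not needed by any metric, so never stored
    "ip": "203.0.113.7",       # same
}

print(minimize(raw_event))
```

Adding a new field then requires adding or changing a metric first, which puts the question "what is this for?" ahead of the collection instead of after it.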

Data is one of our most important tools for creating knowledge. But in the digital world we are both the creator and the observer of data. In the physical world, our observation methods can be independently confirmed through comparable experiments and by checking how well the observations match reality. Real-world observations can be verified and calibrated. Digital observations are encoded with as much messiness as the subjects themselves.