Data Tips #19 – Data Quality 4. Practical examples of the impact of data quality

In previous articles we discussed data quality along several dimensions and from several angles.

It does get a bit abstract at times, so today we will look at some practical examples of when data quality matters.

So, to start off, let us take an easy case that has a very direct impact.

Example 1: Old customer address in the CRM database.

Direct effect: This error in the data may lead to orders being delivered to the wrong place. (We normally require customers to verify the delivery address at checkout to avoid this.)

Indirect effect: Where a customer lives is often one of the parameters when building the right offers for the customer, based on nearby stores, demographics in the area, income levels in the area, etc. Having the wrong address may affect the offers sent.

Example 2: The data is defined inconsistently in different systems. My personal favourite is when NASA and Lockheed Martin used different units of measurement for the Mars Climate Orbiter probe (metric vs. English units).

Direct effect: The probe missed the entry angle and was scorched, crippled and hurled into space.
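One common guard against this kind of mismatch is to never pass a bare number between systems, but always a value together with its unit. The sketch below is only an illustration of that idea, assuming the quantity being exchanged is an impulse in either newton-seconds or pound-force seconds. The `Impulse` type, the unit labels and the function name are made up for this example and do not come from any real flight software.

```python
from dataclasses import dataclass

# Exact conversion factor: 1 pound-force = 4.4482216152605 newtons.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """A value that always carries its unit (hypothetical type)."""
    value: float
    unit: str  # assumed labels: "N*s" (metric) or "lbf*s" (English)


def to_newton_seconds(impulse: Impulse) -> float:
    """Convert to newton-seconds and refuse any unit we do not recognise."""
    if impulse.unit == "N*s":
        return impulse.value
    if impulse.unit == "lbf*s":
        return impulse.value * LBF_S_TO_N_S
    raise ValueError(f"Unknown impulse unit: {impulse.unit!r}")


# A value reported in English units is converted before it is used.
reported = Impulse(value=10.0, unit="lbf*s")
print(to_newton_seconds(reported))
```

The point of the design is that a missing or unexpected unit stops the process with an error instead of silently producing a number that is off by a factor of about 4.45.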

Example 3: Item incorrectly placed in an item hierarchy. (Green Apples placed in the “Hard Bread” section rather than “Fruits”.)

Direct effect: The item may be hard to find in the operational systems, for instance in the ordering system. This may lead to internal customer service complaints and may also lead to the item not being ordered as much as it should.

Indirect effect: When measuring sales afterwards, the “Hard Bread” category will also have received the sales for Green Apples. This will lead to inaccurate sales follow-up and may in the end lead to bad decisions when balancing between categories.
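To make the reporting effect concrete, here is a small sketch that sums the same sales rows under a correct and a faulty hierarchy. All item names other than Green Apples, all amounts and the function name are invented for the illustration.

```python
from collections import defaultdict

# Hypothetical sales rows: (item, amount). The numbers are made up.
sales = [("Green Apples", 120.0), ("Crispbread", 80.0), ("Pears", 50.0)]

correct_hierarchy = {
    "Green Apples": "Fruits",
    "Crispbread": "Hard Bread",
    "Pears": "Fruits",
}
# Same hierarchy, but with Green Apples misplaced under Hard Bread.
faulty_hierarchy = {**correct_hierarchy, "Green Apples": "Hard Bread"}


def sales_by_category(rows, hierarchy):
    """Sum sales amounts per category according to the given hierarchy."""
    totals = defaultdict(float)
    for item, amount in rows:
        totals[hierarchy[item]] += amount
    return dict(totals)


print(sales_by_category(sales, correct_hierarchy))
print(sales_by_category(sales, faulty_hierarchy))
```

With these made-up numbers, a single misplaced item moves 120 in sales from Fruits to Hard Bread, so one category looks much stronger and the other much weaker than it really is.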

Example 4: Item hierarchy not cleaned up. (Items marked as inactive were left out when the item hierarchy was rearranged.)

This is a tricky one because, on the surface, it should not have any direct effect. It may, however, severely affect decisions when you make predictions based on historic data.

Indirect effect: Say that you are revising your assortment and building the prediction models on customer behaviour within the category you are revising. In this example, the “active” items are in the right category both now and in the past, while the “inactive” items are still in the old category. The model may then be based only on “active” items, since they are the only ones in the new category, and all of the behaviour connected to “inactive” items is lost.
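The lost history can be shown with a minimal sketch. It assumes an item master where each item has a current category and an active flag, and a list of historic sales per item. The category names, item names, quantities and both function names are hypothetical.

```python
# Hypothetical item master after a rearrangement: "Old Snacks" was
# replaced by "Snacks", but the inactive item was never moved.
item_master = {
    "chips": {"category": "Snacks", "active": True},
    "nuts": {"category": "Snacks", "active": True},
    "pretzel": {"category": "Old Snacks", "active": False},
}

# Hypothetical historic sales in units: (item, units).
history = [("chips", 40), ("nuts", 25), ("pretzel", 35)]


def units_in_category(category, item_master, history):
    """Sum historic units for items currently mapped to the category."""
    return sum(
        units for item, units in history
        if item_master[item]["category"] == category
    )


def stranded_items(item_master, retired_categories):
    """List inactive items that are still mapped to a retired category."""
    return sorted(
        item for item, info in item_master.items()
        if not info["active"] and info["category"] in retired_categories
    )


print(units_in_category("Snacks", item_master, history))  # 65, not the full 100
print(stranded_items(item_master, {"Old Snacks"}))        # ['pretzel']
```

A check like `stranded_items` can be run after every hierarchy rearrangement, so that inactive items are moved along with the active ones before any model is trained on the category.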

So, to summarize, poor data quality shows up in many different ways, and it can have both direct and indirect effects.

As you can see above, some of the examples cost little (such as inaccurate reporting), while others cost 125 million dollars and years of lost work. The hard part is finding where to spend effort.
