Today, data is a strategic asset that’s vital for every business to drive decision-making, identify risks and opportunities, and sharpen its competitive edge. It’s used in every department and at almost every level, but as analytics capabilities have surged forward, data quality issues often hold people back.
In fact, data quality has decreased over time, mainly because more data is now being collected from more sources and with less centralized oversight. The number of data quality incidents reported by organizations rose from an average of 59 in 2022 to 67 in 2023, according to one industry study. Another study found that 70% of respondents who don’t trust their company’s data say that data quality is the biggest issue.
Low-quality data can’t reliably be used for analysis, and when it is used anyway, it produces inaccurate results. Raising the quality of data can be the difference between success and failure. More than one-third of business leaders agree that improving trust in data increases competitiveness and differentiation, enabling better responses to market changes.
Without this trust, employees waste time tracking down new data, business leaders make poor decisions, sales and marketing teams miss revenue opportunities, finance and compliance teams make mistakes that incur fines for non-compliance, and the list goes on.
Not surprisingly, all this is fueling a lot of concern about how to improve data quality. In my experience, mature organizations go about ensuring data quality differently than less mature ones. Both types of organizations profess concern about the issue, but some take it more seriously than others.
Data Quality Strategies
Here are some of the main approaches to data quality that differentiate the leaders from the laggards.
They cover the data quality basics
Many of the important points affecting data quality are common sense, even obvious. For example, you need to know which data points matter and ensure that your software is configured to capture them. Vital data needs to be collected consistently and coherently, cleaned and verified on arrival, and regularly profiled and monitored. Often, designated “data stewards” are appointed, adding a layer of accountability.
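To make “cleaned and verified on arrival” concrete, here is a minimal sketch in Python. It assumes records arrive as simple dictionaries; the field names and validation rules are illustrative, not drawn from any particular tool.

```python
from datetime import datetime

# Hypothetical rules for an incoming customer record feed; field names are illustrative.
REQUIRED_FIELDS = {"customer_id", "email", "created_at"}

def validate_record(record: dict) -> list:
    """Return a list of data quality issues found in a single incoming record."""
    issues = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")
    email = record.get("email", "")
    if email and "@" not in email:
        issues.append(f"malformed email: {email!r}")
    created_at = record.get("created_at")
    if created_at:
        try:
            datetime.fromisoformat(created_at)
        except ValueError:
            issues.append(f"unparseable timestamp: {created_at!r}")
    return issues

def profile_batch(records: list) -> dict:
    """Simple profiling: how many records arrived, and how many were flagged."""
    flagged = [r for r in records if validate_record(r)]
    return {"received": len(records), "flagged": len(flagged)}

# Example: one clean record, one with a bad email and a missing timestamp.
batch = [
    {"customer_id": 1, "email": "a@example.com", "created_at": "2023-05-01"},
    {"customer_id": 2, "email": "not-an-email"},
]
print(profile_batch(batch))  # {'received': 2, 'flagged': 1}
```

In practice these checks run automatically at ingestion, and the profiling counts feed the ongoing monitoring described above.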
Mature organizations recognize these problems and put technologies and ideas into place to ensure that they are addressed. They don’t just define protocols; they build a game plan to enforce them, and they stick to it. As a result, most of their crucial data is useful, correct, and up to date.
It’s worth pointing out that even the most data-oriented enterprises don’t necessarily have everything fixed. Data quality incidents can and do still occur. But they know where they are along the road, most of their data is trustworthy, and data quality is prized within the organization.
They address the gluing problem
Enterprise data systems tend to be complex and decentralized. Multiple systems capture similar data from multiple locations and store it in different formats and contexts. As a result, overlaps and duplicates occur frequently, along with plenty of inconsistencies. All that data needs to be glued together into a single master dataset.
Often, organizations try to fix this issue by introducing a data lake. However, few of them invest time and energy in homogenizing and synchronizing the data between different datasets, so this step generally fails. Sure, the data is now all in one location, but the problem persists in the form of miniature sandboxes within the data lake.
Mature organizations put in the legwork to promote the concept of a single source of truth. They establish initiatives like master data management, data governance protocols, data catalogs, and a BI center of excellence. Frequently, they create the role of a database administrator who has both the responsibility and the authority to homogenize the data into a single source of truth. While they don’t have a silver bullet, they do deal with the problem.
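As a toy illustration of what this “gluing” looks like in practice, the sketch below assumes two hypothetical systems (a CRM and a billing platform) export overlapping customer records under different field names, and consolidates them with a simple “most recently updated wins” survivorship rule. Real master data management involves far more, but the shape of the problem is the same.

```python
# Two hypothetical source systems exporting overlapping customer data
# under different field names and formats.
crm_records = [
    {"cust_id": "C-100", "email": "Ann@Example.com", "updated": "2023-06-01"},
]
billing_records = [
    {"customer_id": "C-100", "mail": "ann@example.com", "updated": "2023-07-15"},
]

def normalize(record: dict, mapping: dict) -> dict:
    """Map source-specific field names onto a shared schema and clean values."""
    out = {target: record.get(source) for source, target in mapping.items()}
    if out.get("email"):
        out["email"] = out["email"].strip().lower()
    return out

# Source-to-canonical field mappings (illustrative).
normalized = (
    [normalize(r, {"cust_id": "id", "email": "email", "updated": "updated"}) for r in crm_records]
    + [normalize(r, {"customer_id": "id", "mail": "email", "updated": "updated"}) for r in billing_records]
)

# Survivorship: for each customer id, keep the most recently updated record.
master = {}
for rec in normalized:
    current = master.get(rec["id"])
    if current is None or rec["updated"] > current["updated"]:
        master[rec["id"]] = rec

print(master)  # {'C-100': {'id': 'C-100', 'email': 'ann@example.com', 'updated': '2023-07-15'}}
```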
They take the metrics layer seriously
Once data is cleaned and preprocessed, quality can still be affected by inconsistent application of business logic, or what Gartner calls the “metrics layer.” This is the layer that lies between data preprocessing and analytics, where different business teams might apply their own formulas for turning raw data into metrics.
For example, you might have five different departments, all using the same datasets, each calculating its metrics differently and weighting different factors. The problem is growing as self-service BI spreads, but many organizations don’t take it seriously. In my company, we’ve tackled this issue head-on by developing a smart engine that handles the different layers of business logic applied to raw data and makes the results more consistent.
Mature organizations understand the need for a protocol that ensures formulas are calculated consistently, maximizing the data’s usefulness to users across departments. They don’t promote self-service at the expense of a single source of truth, and they seek out mechanisms for standardizing metrics.
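One lightweight way to standardize the metrics layer is to define each business formula exactly once, in shared code, and have every team call that definition instead of re-implementing it in their own dashboards or spreadsheets. The sketch below is a hypothetical illustration of that idea (not the engine mentioned above); the metric names and raw fields are assumptions.

```python
# A tiny shared metric registry: each business formula is defined once,
# so every department computes it the same way from the same raw fields.
METRICS = {}

def metric(name):
    """Register a function as the canonical definition of a metric."""
    def register(fn):
        METRICS[name] = fn
        return fn
    return register

@metric("conversion_rate")
def conversion_rate(rows):
    # Assumed raw fields: each row carries 'visits' and 'orders'.
    visits = sum(r["visits"] for r in rows)
    orders = sum(r["orders"] for r in rows)
    return orders / visits if visits else 0.0

@metric("average_order_value")
def average_order_value(rows):
    # Assumed raw fields: 'revenue' and 'orders'.
    revenue = sum(r["revenue"] for r in rows)
    orders = sum(r["orders"] for r in rows)
    return revenue / orders if orders else 0.0

# Any team -- marketing, finance, sales -- calls the same definitions:
rows = [
    {"visits": 1000, "orders": 40, "revenue": 3200.0},
    {"visits": 500,  "orders": 10, "revenue": 900.0},
]
print(METRICS["conversion_rate"](rows))      # 0.0333...
print(METRICS["average_order_value"](rows))  # 82.0
```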
Mature Data Organizations Trust Their Analyses
When it comes to trust in data, the real difference between mature organizations and those that are still struggling is one of mindset. Mature organizations are aware of data quality issues and recognize the need to address them. They might take different approaches and use different tools, but they share a common readiness to act in ways that preserve data quality. When you can trust your raw information and how it’s processed, you can rely on the strategic insights your analyses reveal.