Data fabric and data mesh both aim to bring order to data that is spread across databases and data lakes. Data fabric is technology-centric, whereas data mesh focuses on organizational change.
Every data-first company either aspires to or is already adopting a self-service business intelligence model. Many of these companies are still not in a position to make their data fully accessible across their platform, or to scale that access to users in different verticals. Their data sits in silos in a data warehouse or a data lake, with little or no capability to serve it as and when teams need it. This is where data technologies like data mesh and data fabric come into play.
At first glance, the two might look fundamentally similar. After all, meshes are made from fabrics. Given their impact on any IT system, it is worth learning the difference between the two to identify the right fit for your organization. In many cases, the best of both worlds may be just what the organization needs: an entity-centric data fabric that incorporates the data product concept of data mesh along with decentralized data engineering.
Noel Yuhanna, a Forrester analyst, was one of the first people to define data fabric. Data management tools have come a long way, from databases to data warehouses and then data lakes, as business solutions have grown more complex. Data fabric can be considered the logical next step in that progression.
Data fabric is, at its core, a metadata-driven approach that aims to connect a wide array of data sources and tools in a unified, self-service manner. As the volume of data stored in an organization keeps increasing, so does the number of silos that hold it. The type of data also varies widely; it could be transactional or operational, for example.
When a data fabric is deployed over these repositories, data lakes, or warehouses, it brings clarity by centralizing the management of data across the organization. It makes data provisioning easier for downstream consumers, whether they are data engineers, QA engineers, or analysts. Note, though, that while the management of this data is centralized, the data itself is still accessed at its original location.
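The idea of centralized metadata over data that stays in place can be illustrated with a minimal Python sketch. It is not how any particular data fabric product works; the class names (`DatasetEntry`, `FabricCatalog`), the tags, and the source URIs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """Metadata about one dataset; the data itself stays at `location`."""
    name: str
    location: str                  # hypothetical URI of the source system
    kind: str                      # e.g. "transactional" or "operational"
    tags: frozenset = frozenset()


class FabricCatalog:
    """Central metadata index over datasets that remain in their own silos."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Only metadata is stored centrally; no data is copied.
        self._entries[entry.name] = entry

    def find(self, tag):
        """Return the names of datasets carrying `tag`, sorted by name."""
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

    def resolve(self, name):
        """Return the original location a consumer should read from."""
        return self._entries[name].location


catalog = FabricCatalog()
catalog.register(DatasetEntry(
    "orders", "postgres://sales-db/orders", "transactional",
    frozenset({"sales"})))
catalog.register(DatasetEntry(
    "clickstream", "s3://lake/raw/clickstream", "operational",
    frozenset({"web", "sales"})))

print(catalog.find("sales"))      # ['clickstream', 'orders']
print(catalog.resolve("orders"))  # postgres://sales-db/orders
```

A consumer discovers datasets through the catalog, but `resolve` still points back to the original silo, which mirrors the point that management is centralized while access locations do not change.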
Gartner listed data fabric first among its top strategic technology trends for 2022, citing its potential to cut data management effort by up to 70%. Compared with manual integration processes, data fabric offers a significant advantage in the effort needed to process this data.
Zhamak Dehghani, a Thoughtworks consultant, first defined the concept of data mesh. Fundamentally, it tries to solve the same problem as data fabric: managing data that is siloed across the organization. It differs in that, in a data mesh, distributed teams control and manage the data in their own silos at their discretion.
The push toward data mesh is motivated by synchronization issues between data lakes and data warehouses. The logical architecture put forth by Dehghani organizes data around what is commonly shared between users and data sources, rather than hardcoding transformations. In a data mesh, data is kept in roughly the same format as at the source, and domain-specific teams then mold it into data products as they see fit.
The main advantage of data mesh is its self-service infrastructure-as-a-platform, which gives the teams that request data monitoring, logging, alerting, and standardization, all through one standard, domain-agnostic process that is the same across the board.
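As a rough illustration of domain-owned data products on a shared, domain-agnostic platform, consider the Python sketch below. It is a simplification under stated assumptions: `DataProduct`, `SelfServePlatform`, the log format, and the sample payments data are hypothetical and do not come from Dehghani's architecture or from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """A domain-owned dataset published as a product (hypothetical shape)."""
    name: str
    domain: str    # owning team, e.g. "payments"
    schema: dict   # column name -> type name
    rows: list     # list of dicts, one per record


class SelfServePlatform:
    """Domain-agnostic platform: the same checks and logging for every team."""

    def __init__(self):
        self.products = {}
        self.log = []

    def publish(self, product):
        # Standardization: each row must have exactly the declared columns.
        expected = set(product.schema)
        for row in product.rows:
            if set(row) != expected:
                self.log.append(f"REJECTED {product.domain}/{product.name}")
                raise ValueError(f"row {row} does not match schema")
        self.products[(product.domain, product.name)] = product
        self.log.append(f"PUBLISHED {product.domain}/{product.name}")


platform = SelfServePlatform()
platform.publish(DataProduct(
    name="settled_payments",
    domain="payments",
    schema={"payment_id": "str", "amount": "float"},
    rows=[{"payment_id": "p-1", "amount": 19.99}],
))
print(platform.log)  # ['PUBLISHED payments/settled_payments']
```

The owning team decides what goes into its product, while the platform applies one validation and logging path regardless of domain.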
Data meshes vs. data fabrics
To summarize, both data fabrics and data meshes are data management architectures. The difference is that data fabric is a technology-agnostic framework that can deliver data products as one of its many outputs, while data mesh is an architecture that produces only data products specific to business domains.
Data fabric is technology-centric, while data mesh focuses on organizational change. Mesh depends on people and teams to drive that change, whereas fabric is an architectural approach to handling complex data and metadata.
In terms of design, data fabric relies on metadata and centralized data engineering, shaped by the overall experience of data consumers in the organization, while data mesh draws on the expertise that teams have across various domains to design its deliverable: a business-oriented data product.
In the words of Yuhanna, “A data mesh is basically an API-driven [solution] for developers, unlike [data] fabric. [Data fabric] is the opposite of data mesh, where you’re writing code for the APIs to the interface. On the other hand, data fabric is low-code, no-code, as the API integration takes place inside the fabric.”
The similarities between data meshes and data fabrics
It is also important to know in which respects the two are similar. Both draw on nearly five decades of data management expertise. Each can benefit from the other and adopt its data practices. In many cases, the costs of implementing and maintaining the two frameworks are also similar. Their architectural principles overlap in their business-domain basis, data product output, ongoing data discovery, and graph of data behavior.
Conclusion
While the use cases and the architecture of data fabric and data mesh may vary, they are, at the end of the day, still architecture frameworks and not architectures in themselves. The architecture emerges once the needs are properly defined, the data is understood, and the organization's processes are accounted for. It is even practical to include the best of data fabric and data mesh in the final architecture. It would be prudent to determine which of these two architecture frameworks works best for your system.
Yash Mehta is an internationally recognized IoT, M2M and Big Data technology expert. He has written a number of widely acknowledged articles on Data Science, IoT, Business Innovation, and Cognitive Intelligence. His articles have been featured in the most authoritative publications and recognized among the most innovative and influential works in the connected technology industry by IBM and the Cisco IoT department. He heads Intellectus (a thought-leadership platform for experts) and is a board member of various tech startups.