Data Mesh is generating both excitement and skepticism at the moment. This article aims to promote constructive discussion about Data Mesh adoption by describing situations where Data Mesh may not be the best solution.
Let’s get this party started:
Data Mesh is fundamentally about removing central bottlenecks in data value delivery.
Moving to a completely decentralized model, on the other hand, risks creating data silos and unnecessary duplication of effort. It may also require bridging a larger data analytics and engineering skill gap than any single decentralized team can justify on its own.
As a result, effective decentralization still necessitates centralized coordination to align, enable, and support decentralized data teams.
This coordination overhead may not be justified for small to medium-sized organizations, such as a 100-person company with 2-3 FTEs devoted to data and analytics. Instead, a small and centralized data team may be able to serve analytic needs across the organization more quickly and efficiently.
A 100,000-person organization spread across multiple geographical jurisdictions or legal entities, on the other hand, will inevitably encounter scaling bottlenecks with a single centralized data team. Decentralization will occur naturally in these organizations, and they will most likely already have central business functions tasked with coordination and alignment across distributed teams. In this case, Data Mesh may be an appropriate framework for organizing this coordination and alignment effort.
Data Mesh proposes federating ownership for generating value from data into individual product-oriented teams, borrowing ideas from DevOps, to enable faster value-creation and learning cycles. Each iteration cycle must thus be guided by a clear and rapid feedback loop on the value that each data team is generating for their customers.
Federating ownership is frequently advantageous because the business value can be better defined, prioritized, and iterated on by the business unit with the requirement.
Taking ownership of data analytics and engineering, on the other hand, necessitates an investment that may not be justifiable for some business units without a compelling business case.
As a result, it is unlikely that you will be able to fully realize the Data Mesh operating model from the start. Instead, focus on rapidly proving the value of Data Mesh adoption with a few early adopters over many short iteration cycles, and collaborate with the remaining business units to build the business case for more widely adopting this change.
The core goal of Data Mesh, in my opinion, is to enable your organization to leverage data to more effectively adapt to changes and meet customer needs more quickly. As a result, Data Mesh is a means to an end rather than an end in itself.
Furthermore, there is no canonical Data Mesh reference implementation. As a result, any organization that implements Data Mesh must plan to evolve its Data Platform and Operating Model over time. The best way to structure this evolution process is to experiment and learn the best way forward through many small iteration cycles.
When transitioning to a decentralized operating model, no centralized body can maintain sufficient context or influence to make effective decisions in every distributed domain. Instead, you must rely on distributed teams to find the best solution to the problems at hand on their own.
A culture of blamelessness and psychological safety to experiment, share knowledge, and learn together underpins this mutual trust and individual autonomy.
This article explains how to build, improve, and measure cultural capabilities that lead to improved software delivery and organizational performance.
Adopting Data Mesh necessitates introducing new roles such as Data Product Owner/Manager, Data Steward, Data Engineer, Data Scientist, or Analytics Engineer. Each plays an important part in enabling the distributed operating model.
These roles and responsibilities must be standardized to create focus and to establish an incentive structure for collaborating on the Data Mesh. Without this clarity and incentive structure, individuals may prioritize other aspects of their role over the activities needed to build the Data Mesh, such as creating and sharing high-quality Data as Products.
Every team will need to become more data-savvy to effectively make data and analytics a ubiquitous part of the business. You may need to make a concerted effort to train your team members in skills such as data analysis, data visualization, SQL, and machine learning to accomplish this.
Because adopting Data Mesh may necessitate a fundamental shift in how a team operates, individuals who can champion new ways of working and assist in enabling and empowering distributed teams are critical in leading the organization toward these changes.
Without a critical mass of data talent and a structured approach to learning and enablement, a distributed organization with high individual autonomy cannot succeed.
High-performing engineering teams deliver more frequently, fail less frequently, and recover from failure faster. They use several best practices to maintain high delivery velocity, such as Continuous Integration and Continuous Delivery (CI/CD). Because automation is required for Data Mesh to deliver value at scale, teams that already use these DevOps practices in their daily work are better placed to adopt it.
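As a rough illustration of "deliver more frequently, fail less frequently, and recover faster", the sketch below computes three delivery measures from a list of deployment records. The `Deployment` record shape, the field names, and the `delivery_metrics` function are assumptions made for this example; they are not taken from any particular tool, and a real team would pull this data from its own CI/CD and incident systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set only when a failure was fixed


def delivery_metrics(deployments: list, period_days: int) -> dict:
    """Summarize delivery frequency, failure rate, and recovery time."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    failures = [d for d in deployments if d.failed]
    # Recovery time is measured from the failed deployment to restoration.
    recoveries = [
        d.restored_at - d.deployed_at
        for d in failures
        if d.restored_at is not None
    ]
    mean_recovery = (
        sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
    )
    return {
        "deployments_per_day": len(deployments) / period_days,
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore": mean_recovery,
    }
```

Tracking numbers like these per data team over time gives a simple signal of whether automation is actually improving delivery, rather than relying on impressions.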
Data teams that rely on manual, ad hoc, and one-time processes are unlikely to produce reliable and trustworthy Data as Products. These Data Product properties are critical for allowing the Data Mesh to scale.
Data Mesh, like DevOps, is about more than just technology and tools. It is a mindset and cultural shift in which teams adopt new working methods.
Technology can only act as a catalyst for cultural change. Furthermore, rather than focusing on what technology to use, Data Mesh emphasizes how technology can be used for data integration. As a result, adopting any technology solution alone is unlikely to help you realize the value of Data Mesh.
Distributing data ownership must not lead to lower security, privacy, and compliance standards. Instead, security, privacy, and compliance must become everyone’s responsibility.
According to DevOps Research and Assessment (DORA) research, teams can achieve better results by incorporating security into everyone’s daily work rather than testing for security concerns at the end of the process.
Shifting security, privacy, and compliance to the left in development processes requires collaboration between the relevant stakeholders and each embedded data team. As a result, when implementing Data Mesh, it is critical to obtain buy-in from all relevant parties and to involve security, privacy, and compliance stakeholders as early as possible.
Federated Computational Governance is a fundamental Data Mesh principle, with an emphasis on automation and standardization to enable more comprehensive and real-time policy monitoring, detection, and remediation.
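One way to picture automated policy monitoring is to express each policy as a small check that runs against every data product's metadata. The sketch below is only illustrative: the `DataProduct` descriptor, its fields, and the three example policies are hypothetical and would differ in any real organization.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor that a domain team publishes for its data product."""
    name: str
    owner: str = ""
    classification: str = ""  # assumed values: "public", "internal", "restricted"
    contains_pii: bool = False
    encrypted_at_rest: bool = False


# Each policy is a (description, predicate) pair.
# The predicate returns True when the product complies.
POLICIES = [
    ("has a named owner", lambda p: bool(p.owner.strip())),
    ("has a valid classification",
     lambda p: p.classification in {"public", "internal", "restricted"}),
    ("PII is encrypted at rest",
     lambda p: not p.contains_pii or p.encrypted_at_rest),
]


def evaluate(product: DataProduct) -> list:
    """Return the descriptions of every policy the product violates."""
    return [description for description, check in POLICIES if not check(product)]


def scan(products: list) -> dict:
    """Map each non-compliant product name to its list of violations."""
    report = {}
    for product in products:
        violations = evaluate(product)
        if violations:
            report[product.name] = violations
    return report
```

Running a scan like this on a schedule, or on every change to a descriptor, is what turns centrally agreed policies into checks that each domain team can satisfy on its own.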
Data Governance, like Security and Privacy, must “shift left” to become a part of every data team’s daily work. As a result, Data Governance concerns such as improving Data and Metadata quality must be prioritized in the backlog of every data team. To embed Data Governance into standard development processes, these activities can be raised as tickets directly in the relevant data team’s backlog, or automated as tests that every code change must pass before being integrated and deployed to production.
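A minimal sketch of a data quality rule written as an automated test follows. It assumes a pytest-style test runner wired into the CI pipeline; the `check_quality` helper, the column names, and the sample rows are hypothetical and stand in for whatever rules a given data team agrees on.

```python
def check_quality(rows: list, required: list, unique_key: str) -> list:
    """Return human-readable data quality failures for a batch of rows."""
    failures = []
    seen = set()
    for index, row in enumerate(rows):
        # Completeness: every required column must be present and non-empty.
        for column in required:
            if row.get(column) in (None, ""):
                failures.append(f"row {index}: missing {column}")
        # Uniqueness: the key column must not repeat within the batch.
        key = row.get(unique_key)
        if key is not None:
            if key in seen:
                failures.append(f"row {index}: duplicate {unique_key}={key}")
            seen.add(key)
    return failures


def test_orders_sample_passes_quality_checks():
    # In CI, a failing assertion here would block the change from being merged.
    sample = [
        {"order_id": 1, "customer_id": "c-1", "amount": 10.0},
        {"order_id": 2, "customer_id": "c-2", "amount": 5.5},
    ]
    assert check_quality(sample, ["order_id", "customer_id"], "order_id") == []
```

Because the rule lives next to the pipeline code, a governance requirement changes through the same review and deployment process as any other code change.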
To drive automation, standardization, and best practices, you may need to establish specialist engineering teams that can develop tooling/processes and provide expert advice to help distributed teams more easily meet Data Governance policies and standards.
Data Mesh is not a panacea for all of your data management issues. Many organizations may find it difficult, if not impossible, to implement.
Regardless, I believe that many large organizations seeking to scale the impact of data and analytics will eventually adopt some form of Data Mesh; however, not every aspect of Data Mesh will be implemented, at least not immediately. Building a reusable and self-serviceable Data Platform as a Product, for example, is quickly emerging as a best practice for cloud adoption. This requires implementing the platform with infrastructure-as-code and CI/CD with embedded and continuous controls, which lays the groundwork for Federated Computational Governance. Decentralized Data Ownership can be widely adopted once organizations are confident that they can operate the Cloud Data Platform safely and at scale. This may prompt changes to Data Domain boundaries and even the org chart to allow distributed data teams to develop Data as Products effectively.
Instead of attempting to implement every element of Data Mesh, I would recommend first focusing on how you can empower your data teams to deliver value to your customers faster and more frequently, and then working backward to identify and adopt specific Data Mesh elements that will help you achieve this goal.