Errata

Data Mesh

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.


Version Location Description Submitted by Date submitted
Printed Page xxxvii
5th paragraph

The objectives of data mash -> The objectives of data mesh

Simon Harrer  Aug 01, 2022 
PDF Page xxxix
8

Change “organizational seams” to “organizational teams”

From:
“… Naturally, governance needed to follow these organizational seams too.”

To:
“… Naturally, governance needed to follow these organizational teams too.”

Jeniffen Chandrabalan  Jan 27, 2023 
PDF Page 30
3rd para

What I'm not able to understand is this: what is the motivation for a Data Product Owner to build a product?
The amount of work involved is substantial, so why should they spend their time and effort building a product for consumers?

In the data lake model, as good corporate citizens, they pushed the data to the lake, and what we did with it was no longer their concern. Now that the pain is brought to their doorstep, why should they take it on? They have day jobs, and I don't think they'll be interested or excited to take over this additional responsibility.

Solution, at least in my opinion: if we have a pod team structure as an extension of the Federated Governance group, then the main task of that governance team will be to push these SMEs, who will be part of its team, to build and maintain the product. This will ensure the efficacy and relevance of the product and the consistent implementation of standards within it.

SREERAMA DHARMAPURI  Dec 13, 2023 
PDF Page 31
3rd para

Dear Sir/Madam:


I have a question for Ms. Dehghani about the following sentence. While I agree with the statement, it raises a question: why should operational teams take on the additional responsibility of maintaining these data products, and what do they gain from it?
"Operational teams still perceive their data as a byproduct of running the business, leaving it to someone else, e.g., the data team, to pick it up and recycle it into products." on Page 31
There is a reason for this perception: it is the truth. Data is a byproduct or end product of an operational team's core activity and has only two purposes: to record and to validate. It is useful ONLY for testing whether the algorithm is working correctly, for building use cases, and for security, legal, and compliance purposes. For operational teams, data has no more attraction than the last ride a kid takes in a theme park. It's done and over; it's registered in his/her system. That is why we have a plethora of archival systems: to prove that once data is generated, we are done with it, most of the time, and it can go to hell (cold storage).

When we ask them to maintain it, that is extra effort, and what are they getting out of it? A good corporate citizenry badge? It's too much effort for too little reward. Data should stay in raw format, untouched and untransformed, so that it remains pristine; when known or unknown use cases come up in the future, transformations can be applied as the requestor wishes in the self-service platform.

Please let me know what you think of this argument.

SREERAMA DHARMAPURI  Dec 18, 2023 
PDF Page 61
4th para

"Without such a standard, we fall back to a single monolithic (but well-integrated) solution that constrains access to data to a single hosting environment. We fail to share and observe data across environments."
IMO, this argument is faulty and doesn't scale, because establishing a common language like a Universal Semantic Layer is going to fail; hence the parable of the Tower of Babel.

New systems and requirements come up, and you can't stop them. We have to keep updating and playing catch-up. Instead, what is the unchanging part of this whole setup? It's the data, which happened at a point in time. Data is not stateful; we interpret data as though it also includes the transaction/transformation, and that is where the problem lies. Data is data: it is 7:30 PM now, and that's it; it is frozen in time and stateless. When you work at this level and give any user complete freedom to transform data the way they want and create multiple data lakes, it is analogous to the evolution from SOA to microservices, which is more aligned with Data Mesh in spirit than building a centralized architecture. Microservices came from SOA; in the same manner, Data Mesh should be nothing but micro data lakes. If you really want to mimic Service Mesh, then:
SOA - Centralized Data Lake
Microservices - Micro Data Lakes (a combination of Pods; there can also be a 1-to-1 mapping of Pods to Microservices)
Containers - Data as a Product
Pods - Combinations of Data Products that bring about an insight for a user/consumer
Orchestrator (Kubernetes) - Self-Serve Platform


Please let me know what you think of it.

SREERAMA DHARMAPURI  Dec 21, 2023 
Printed Page 310
1st paragraph

"DataBold Textbalancing local sharing..."

Invalid occurrence of "Bold Text" in the sentence.

Jochen Christ  Aug 10, 2022 
Printed Page 327
2nd Paragraph

"In short, many different responsibilities that require data specialization fall under the CDOA’s functional organization."

Must be “...CDAO’s...”

Jochen Christ  Aug 12, 2022