Chapter 24. Defining and Managing Messages in Log-Centric Architectures
Boris Lublinsky
Messaging systems are changing the way we expose data. Instead of focusing on the API between a producer and a consumer, we now have to focus on the message definitions.
With logs becoming the centerpiece of the architecture, they are starting to take on the role of an enterprise data backbone: for streaming systems, somewhat what the Hadoop Distributed File System (HDFS) is for batch processing. This architecture encourages the creation of canonical data models, because schema enforcement avoids many problems (e.g., typing errors). This is not a new idea: compare it to the canonical data model used in enterprise application integration (EAI) and service-oriented architecture (SOA), and to the concept of standardized service contracts. The rationale is the same in both cases. The approach is identical to the EAI canonical messaging pattern and ensures that the content of the log is understood by all participants.
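To make the idea of schema enforcement concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `ORDER_PLACED_SCHEMA` message definition, the `SchemaError` exception, and the in-memory list standing in for the log are hypothetical, and a real deployment would more likely rely on a schema language and a registry than on hand-written checks.

```python
# Hypothetical canonical message definition: field name -> required type.
ORDER_PLACED_SCHEMA = {
    "order_id": str,
    "customer_id": str,
    "amount_cents": int,
}


class SchemaError(ValueError):
    """Raised when a message does not match the canonical definition."""


def validate(message: dict, schema: dict) -> dict:
    """Return the message unchanged if it conforms, otherwise raise SchemaError."""
    missing = schema.keys() - message.keys()
    unexpected = message.keys() - schema.keys()
    if missing or unexpected:
        raise SchemaError(
            f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
        )
    for name, expected_type in schema.items():
        if not isinstance(message[name], expected_type):
            raise SchemaError(
                f"{name!r} must be {expected_type.__name__}, "
                f"got {type(message[name]).__name__}"
            )
    return message


# In-memory stand-in for the shared log (an assumption of this sketch).
log: list = []


def append(message: dict) -> None:
    """Producers write through this function, so only valid messages reach the log."""
    log.append(validate(message, ORDER_PLACED_SCHEMA))
```

Under these assumptions, a producer that sends `"amount_cents": "12.50"` fails at write time with a `SchemaError`, so the typing error never reaches the consumers reading the log.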
The canonical data model provides an additional level of decoupling between the services' individual data formats and simplifies data mapping among the internal models of different services. If a new service is added to the implementation, only a transformation between the canonical data model and that service's internal model is required, independent of how the data is represented by the other services that use it.
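The decoupling can be sketched as follows. The services, class names, and fields here are all hypothetical, chosen only to show the shape of the mappings: each service converts between its own internal model and the canonical one, and never needs to know another service's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalCustomer:
    """Hypothetical canonical model shared through the log."""
    customer_id: str
    full_name: str
    email: str


@dataclass
class BillingAccount:
    """Internal model of a hypothetical, already existing billing service."""
    account_no: str
    holder: str
    contact_email: str


def billing_to_canonical(account: BillingAccount) -> CanonicalCustomer:
    return CanonicalCustomer(account.account_no, account.holder, account.contact_email)


def canonical_to_billing(customer: CanonicalCustomer) -> BillingAccount:
    return BillingAccount(customer.customer_id, customer.full_name, customer.email)


@dataclass
class ShippingRecipient:
    """Internal model of a hypothetical, newly added shipping service."""
    ref: str
    first_name: str
    last_name: str
    email: str


def canonical_to_shipping(customer: CanonicalCustomer) -> ShippingRecipient:
    # The new service maps only from the canonical model; it has no
    # knowledge of BillingAccount or any other service's format.
    first, _, last = customer.full_name.partition(" ")
    return ShippingRecipient(customer.customer_id, first, last, customer.email)


def shipping_to_canonical(recipient: ShippingRecipient) -> CanonicalCustomer:
    full_name = f"{recipient.first_name} {recipient.last_name}".strip()
    return CanonicalCustomer(recipient.ref, full_name, recipient.email)
```

Adding the shipping service required two functions, to and from `CanonicalCustomer`, and no change to the billing code. With direct service-to-service mappings, each new service would instead need a pair of transformations for every existing service it exchanges data with.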