Chapter 6. Event Schemas

A well-defined schema is essential for any data product. For events, the schema is an explicit declaration of the field names, types, defaults, and boundaries, making the contents of the data clear to humans and machines alike. It gives the data product's producer and consumer a clear, common understanding of the data. Schemas eliminate ambiguity, support both discovery and self-service, and reduce the risk that those who use the data will misunderstand it.

Schemas simplify data discovery and self-service. You can embed documentation within the schema itself, keeping the data definition and the documentation tightly coupled. Code generators can use the schema to generate classes and objects in the consumer’s programming language of choice. Similarly, event generators can use the schema to generate events that match the definitions, giving you a way to produce a wide range of test data, including data for boundary conditions.
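As a rough illustration of these ideas, the following Python sketch declares a small schema with embedded documentation, defaults, and boundaries, then uses it to validate events and to generate boundary-condition test events. Everything in it is an assumption made for illustration: the FieldSpec class, the ORDER_PLACED schema, and the validate and boundary_events helpers are hypothetical and do not come from any particular schema technology.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical description of one schema field (illustration only)."""
    name: str
    type: type
    doc: str                       # documentation lives next to the definition
    default: Any = None
    minimum: Optional[int] = None  # inclusive lower boundary, if any
    maximum: Optional[int] = None  # inclusive upper boundary, if any


# Hypothetical schema for an "order placed" event.
ORDER_PLACED = [
    FieldSpec("order_id", str, "Unique identifier of the order.", default=""),
    FieldSpec("quantity", int, "Number of units ordered.",
              default=1, minimum=1, maximum=1000),
]


def validate(event: dict, schema: list) -> list:
    """Return a list of violations; an empty list means the event is valid."""
    errors = []
    for spec in schema:
        value = event.get(spec.name, spec.default)
        if not isinstance(value, spec.type):
            errors.append(f"{spec.name}: expected {spec.type.__name__}")
            continue
        if spec.minimum is not None and value < spec.minimum:
            errors.append(f"{spec.name}: {value} is below {spec.minimum}")
        if spec.maximum is not None and value > spec.maximum:
            errors.append(f"{spec.name}: {value} is above {spec.maximum}")
    return errors


def boundary_events(schema: list) -> list:
    """Build one test event per declared boundary, other fields at defaults."""
    base = {spec.name: spec.default for spec in schema}
    events = []
    for spec in schema:
        for edge in (spec.minimum, spec.maximum):
            if edge is not None:
                events.append({**base, spec.name: edge})
    return events
```

Calling boundary_events(ORDER_PLACED) yields one event at each declared limit of quantity, and passing each of them to validate confirms that it matches the definition. A real code or event generator would typically read the schema from a file in the chosen schema format rather than from Python objects.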

Schemas provide a framework for evolving data through time, though your options for schema evolution depend on your technology selection. The main goal of schema evolution is to let the data change as new business requirements are added and as domains shift and expand, without unduly affecting consumers of the data product.
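One way to picture evolution without disruption is the sketch below. It assumes a simple rule chosen for illustration: a reader projects each event onto its own version of the schema, filling missing fields from defaults and ignoring fields it does not know. The field names, the V1_FIELDS and V2_FIELDS mappings, and the read_with_schema helper are all hypothetical, and the rules your schema technology actually supports may differ.

```python
# Hypothetical schema versions, expressed as field name -> default value.
V1_FIELDS = {"order_id": None, "quantity": 1}
# Version 2 adds a field WITH a default, so older events can still be read.
V2_FIELDS = {"order_id": None, "quantity": 1, "currency": "USD"}


def read_with_schema(event: dict, reader_fields: dict) -> dict:
    """Project an event onto the reader's schema.

    Fields missing from the event take the reader's default; fields the
    reader does not know about are dropped.
    """
    return {name: event.get(name, default)
            for name, default in reader_fields.items()}


old_event = {"order_id": "o-1", "quantity": 2}                      # written with v1
new_event = {"order_id": "o-2", "quantity": 5, "currency": "EUR"}   # written with v2

# A consumer upgraded to v2 can still read events written with v1.
upgraded_view = read_with_schema(old_event, V2_FIELDS)

# A consumer still on v1 can read v2 events and simply never sees "currency".
legacy_view = read_with_schema(new_event, V1_FIELDS)
```

Under this assumed rule, neither consumer has to change at the moment the producer starts emitting the new field, which is the kind of outcome schema evolution aims for.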

This chapter is a prescriptive and opinionated look at schemas for your event-driven data mesh. There are many different schema technologies and many ...
