Chapter 21. Data Security for Data Engineers

Katharine Jarmul

Is your data safe? What about the data you process every day? How do you know? Can you guarantee it?

These questions aren’t meant to send you running in fear. Instead, I want you to approach security pragmatically. As data engineers, we often manage a company’s most valuable resource: its data. It only makes sense, then, that we learn security engineering and apply it to our work.

How can we do so? Here are a few tips.

Learn About Security

Most data engineers come from either a computer science or a data science background, so you may not have had exposure to computer and network security concepts. Learn about them by attending security conferences, meetups, and other events. Read up on security best practices for the particular architecture or infrastructure you use. Chat with the IT/DevOps or security folks at your company to hear what measures are already in place. I’m not asking you to become an expert, but I do want you to be informed.

Monitor, Log, and Test Access

Monitor, log, and track access to the machines or containers you use, to the databases or other data repositories you maintain, and to the code and processing systems you contribute to daily. Make sure only credentialed users or machines can access these systems. Create firewall rules (yes, even in the cloud and even with containers) and test them ...
